RitchieLab / LOKI


UCSC processing takes a long time; we should look into whether it can be done in parallel #7

Closed van-truong closed 2 months ago

van-truong commented 3 months ago

We looked at Brandon's timed log output in July 2024

Saw that UCSC ECR takes 8 hrs to process and compact/compile into the SQLite schema. Total LOKI processing time for all sources is 12 hrs.

Li agrees that working on this will be the biggest win

van-truong commented 3 months ago

Brandon's comment pasted here:

Going to need to find a better method than SELECT last_insert_rowid() in SQLite because it can't be trusted in threaded operations.

Docs: https://www.sqlite.org/c3ref/last_insert_rowid.html
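One possible workaround, sketched here as an assumption and not taken from the LOKI code: the linked SQLite docs describe the last insert rowid as tracked per database connection, so giving each worker thread its own connection and reading `cursor.lastrowid` right after the insert avoids another thread overwriting the value. The `region` table, the `insert_regions` and `run_demo` helpers, and the chromosome labels below are hypothetical and exist only for illustration.

```python
import os
import sqlite3
import tempfile
import threading


def insert_regions(db_path, chrom, names, results):
    """Insert rows for one chromosome and record the rowid of each insert."""
    # Assumption: one connection per worker, never shared between threads,
    # so the "last insert rowid" can only be changed by this worker.
    conn = sqlite3.connect(db_path, timeout=30, isolation_level=None)
    try:
        ids = []
        for name in names:
            cur = conn.execute(
                "INSERT INTO region (chrom, name) VALUES (?, ?)", (chrom, name)
            )
            # lastrowid is read from this worker's own cursor/connection.
            ids.append((cur.lastrowid, name))
        results[chrom] = ids
    finally:
        conn.close()


def run_demo(chroms=("1", "2", "X"), rows_per_chrom=20):
    """Run one inserting thread per chromosome against a temporary database."""
    fd, db_path = tempfile.mkstemp(suffix=".db")
    os.close(fd)
    try:
        setup = sqlite3.connect(db_path)
        setup.execute(
            "CREATE TABLE region (id INTEGER PRIMARY KEY, chrom TEXT, name TEXT)"
        )
        setup.commit()
        setup.close()

        results = {}
        threads = [
            threading.Thread(
                target=insert_regions,
                args=(
                    db_path,
                    chrom,
                    [f"chr{chrom}_region{i}" for i in range(rows_per_chrom)],
                    results,
                ),
            )
            for chrom in chroms
        ]
        for t in threads:
            t.start()
        for t in threads:
            t.join()

        # Read back every row so the recorded rowids can be checked.
        check = sqlite3.connect(db_path)
        stored = dict(check.execute("SELECT id, name FROM region"))
        check.close()
        return results, stored
    finally:
        os.remove(db_path)
```

Calling `run_demo()` returns the rowids each worker recorded together with the rows actually stored, so the two can be compared. Writers still serialize on the database file lock, so this sketch addresses the correctness of the rowid lookup and not insert throughput.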

XueqiongLi commented 2 months ago

Data processing for UCSC is now parallelized by chromosome, like dbSNP.
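For readers unfamiliar with the pattern, a minimal sketch of per-chromosome parallelism follows. It is not the LOKI implementation: `group_by_chromosome`, `process_chromosome`, and `process_in_parallel` are hypothetical names, the per-chromosome work is a placeholder sort, and a thread pool is used only to keep the example short (CPU-bound parsing would more likely use a process pool).

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def group_by_chromosome(records):
    """Bucket (chrom, start, end) records by chromosome."""
    groups = defaultdict(list)
    for chrom, start, end in records:
        groups[chrom].append((start, end))
    return groups


def process_chromosome(chrom, intervals):
    """Placeholder for the real per-chromosome work: sort the intervals."""
    return chrom, sorted(intervals)


def process_in_parallel(records, max_workers=4):
    """Submit one task per chromosome and merge the results into a dict."""
    groups = group_by_chromosome(records)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [
            pool.submit(process_chromosome, chrom, intervals)
            for chrom, intervals in groups.items()
        ]
        return dict(future.result() for future in futures)
```

Because each task touches only its own chromosome's records, the workers share no mutable state and the merge step is a plain dictionary build.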
