shuzhao-li-lab / asari

asari, metabolomics data preprocessing

Parallelize RT Alignment #70

Closed jmmitc06 closed 1 week ago

shuzhao-li commented 3 months ago

There should be a new module on RT alignment in ver 2 to: 1) decide sample clusters, assuming that QC samples may not have the same properties as study samples; 2) enhance performance by parallelization; 3) accommodate user-supplied internal standards and landmarks; 4) be compatible with GC data.

jmmitc06 commented 1 week ago

The alignment itself is quite fast; it's loading the data from disk that is slow, and that has largely been fixed in the newest versions.

Point 1 can be done with clustering. Point 2 is largely handled by the sample caching. Point 3 can be pulled in from one of Steve's dev branches. Point 4 is a work in progress.

jmmitc06 commented 1 week ago

Going to close this issue since it was specifically about parallelization. The newest addition on the compress branch can preload files, largely alleviating the bottleneck at both retention time alignment and mass grid construction.
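
For anyone landing here later, a rough sketch of the general idea of preloading sample files concurrently so that later steps read from memory instead of disk. This is not the code on the compress branch: `load_sample`, `preload_samples`, `demo`, and the pickle file layout are all hypothetical names and assumptions made up for illustration.

```python
import pickle
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def load_sample(path):
    """Read one cached sample from disk (hypothetical pickle format)."""
    with open(path, "rb") as handle:
        return pickle.load(handle)


def preload_samples(paths, max_workers=4):
    """Load all samples concurrently and return a {path: sample} dict."""
    paths = list(paths)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as the input paths.
        samples = list(pool.map(load_sample, paths))
    return dict(zip(paths, samples))


def demo():
    """Write three toy samples to a temp directory, then preload them."""
    with tempfile.TemporaryDirectory() as tmp:
        paths = []
        for i in range(3):
            path = Path(tmp) / f"sample_{i}.pickle"
            with open(path, "wb") as handle:
                pickle.dump({"name": f"sample_{i}", "rt": [1.0 * i, 2.0 * i]}, handle)
            paths.append(path)
        cache = preload_samples(paths)
        return [cache[p]["name"] for p in paths]
```

Threads are used here on the assumption that the work is dominated by disk reads; if deserialization turned out to be CPU-bound, a process pool might be the better fit.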