Mayrlab / scUTRquant

Bioinformatics pipeline for single-cell 3' UTR isoform quantification
https://Mayrlab.github.io/scUTRquant
GNU General Public License v3.0

Scaling to large datasets #55

Closed mfansler closed 1 year ago

mfansler commented 1 year ago

The MTX-to-SCE conversion step is memory intensive. On a recent run with 1.4 TB of raw data, this step used 122 GB of memory and required two fail-retry cycles. We should look into DelayedArray options. It may be feasible to convert each sample in series, writing each one out to a common HDF5 (or similar) file, so that only one sample is held in memory at a time.
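
To make the "convert each sample in series" idea concrete, here is a minimal, dependency-free sketch in Python. It is not the pipeline's code and it does not use HDF5 or DelayedArray: a single MatrixMarket file stands in for the "HDF5 (or similar)" target, purely to show the streaming pattern. The function names (`read_mtx_shape`, `merge_mtx_serially`) are hypothetical, and the sketch assumes every sample's MTX file is in coordinate format with integer counts, with features in rows and cells in columns, and the same feature set in the same order across samples. If the matrices are stored cells-by-features, the row index would be shifted instead.

```python
def read_mtx_shape(path):
    """Return (n_rows, n_cols, n_nonzero) from a MatrixMarket coordinate file."""
    with open(path) as fh:
        for line in fh:
            # Skip the banner/comment lines and blank lines.
            if line.startswith("%") or not line.strip():
                continue
            n_rows, n_cols, nnz = (int(x) for x in line.split())
            return n_rows, n_cols, nnz
    raise ValueError(f"no size line found in {path}")


def merge_mtx_serially(sample_paths, out_path):
    """Append samples one at a time into a single on-disk sparse matrix.

    Only the small size lines are read up front; the entries of each
    sample are then streamed line by line, so memory use does not grow
    with the number of samples.
    """
    if not sample_paths:
        raise ValueError("no samples given")

    shapes = [read_mtx_shape(p) for p in sample_paths]
    n_rows = shapes[0][0]
    # Assumption: all samples were quantified against the same features.
    if any(shape[0] != n_rows for shape in shapes):
        raise ValueError("all samples must share the same number of rows")
    total_cols = sum(shape[1] for shape in shapes)
    total_nnz = sum(shape[2] for shape in shapes)

    with open(out_path, "w") as out:
        out.write("%%MatrixMarket matrix coordinate integer general\n")
        out.write(f"{n_rows} {total_cols} {total_nnz}\n")
        col_offset = 0
        for path, (_, n_cols, _) in zip(sample_paths, shapes):
            with open(path) as fh:
                seen_size_line = False
                for line in fh:
                    if line.startswith("%") or not line.strip():
                        continue
                    if not seen_size_line:
                        # First non-comment line is the size line; skip it.
                        seen_size_line = True
                        continue
                    row, col, value = line.split()
                    # Shift this sample's cells past all earlier samples.
                    out.write(f"{row} {int(col) + col_offset} {value}\n")
            col_offset += n_cols

    return n_rows, total_cols, total_nnz
```

Calling `merge_mtx_serially(["s1.mtx", "s2.mtx"], "merged.mtx")` (hypothetical file names) would write one combined matrix whose columns are the cells of `s1` followed by those of `s2`. An HDF5-backed version would follow the same loop, replacing the text writes with appends to an on-disk dataset.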