ExaScience / elprep

elPrep: a high-performance tool for analyzing sequence alignment/map files in sequencing pipelines.

elprep sfm run failed due to insufficient RAM #37

Closed by babicjovana 4 years ago

babicjovana commented 4 years ago

Hi, would it be possible to make elPrep sfm more stable, so that it does not fail when RAM is insufficient but instead takes longer to run? It is very resource-demanding: for example, an elPrep run failed on a 150x BAM file of 113GB on an instance with 92 vCPUs and 192GB RAM. Thank you.

caherzee commented 4 years ago

Hi, for your reference, for a 50x WGS BAM file of roughly 110GB we require around 192GB of RAM to run a 4-step pipeline (sorting, duplicate marking, base quality score recalibration, and its application). Maybe you can run your data on a bigger node to figure out how much it really needs?
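
One way to find out how much memory a run really needs on the bigger node is to wrap the command and read back its peak resident set size afterwards. The snippet below is only a minimal sketch, not part of elPrep: the helper name `peak_rss_gib` is hypothetical, and it assumes a Unix-like system where Python's standard `resource` module is available.

```python
import resource
import subprocess
import sys


def peak_rss_gib(cmd):
    """Run cmd to completion; return (exit code, peak RSS of child processes in GiB).

    Hypothetical helper for illustration only, not part of elPrep.
    """
    proc = subprocess.run(cmd)
    # RUSAGE_CHILDREN covers child processes that have terminated and been waited for.
    usage = resource.getrusage(resource.RUSAGE_CHILDREN)
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    bytes_per_unit = 1 if sys.platform == "darwin" else 1024
    return proc.returncode, usage.ru_maxrss * bytes_per_unit / 2**30


# Example call: replace the list with the elprep command line you already use.
# code, peak = peak_rss_gib(["elprep", "sfm", "input.bam", "output.bam"])
# print(f"exit code {code}, peak RSS {peak:.1f} GiB")
```

If the run is killed for lack of memory, the reported value only reflects usage up to that point, so the measurement is meaningful only on a node where the run completes.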

It would in principle be possible to change elPrep so RAM use can be limited, but it would be a lot of work. We prioritise our development based on the funding we get for particular functionality. If you would like to fund such a feature, feel free to contact us.

We also welcome code contributions and are willing to review them.

Thanks.