GoekeLab / m6anet

Detection of m6A from direct RNA-Seq data
https://m6anet.readthedocs.io/
MIT License

Memory usage exploding #150

Closed (mnsmar closed this issue 2 months ago)

mnsmar commented 8 months ago

Hi,

Memory usage explodes during m6anet dataprep and the run is killed. Any idea what is causing this? Here is the command I use:

m6anet dataprep --eventalign eventalign.txt --out_dir ouput --readcount_max 1000000 --n_processes 10
yuukiiwa commented 7 months ago

Hi @mnsmar,

This memory explosion is likely due to the high --readcount_max (1000000 in your command). You can try the default --readcount_max, which is 1000.

Thanks!

Best wishes, Yuk Kei

mnsmar commented 7 months ago

Thanks for the reply @yuukiiwa. We use a high --readcount_max because the majority of our reads map to one very specific locus. As far as I understand, with a lower --readcount_max such loci are skipped. Is there a way to keep these regions and instead randomly use up to --readcount_max reads per locus?
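For illustration, the random downsampling asked about here could be approximated outside m6anet by pre-filtering eventalign.txt. This is only a sketch under assumptions the thread does not confirm: the file is a tab-separated nanopolish eventalign table with `contig` and `read_index` columns, and a "locus" is taken to be one contig. The function name `downsample_eventalign` is hypothetical and is not part of m6anet.

```python
import random
from collections import defaultdict


def downsample_eventalign(src_path, dst_path, max_reads, seed=0,
                          contig_col="contig", read_col="read_index"):
    """Keep at most max_reads randomly chosen reads per contig.

    Assumes a tab-separated table with a single header line; the column
    names are assumptions based on nanopolish eventalign output.
    """
    rng = random.Random(seed)

    # Pass 1: collect the distinct read ids seen on each contig.
    reads_by_contig = defaultdict(set)
    with open(src_path) as src:
        header = src.readline().rstrip("\n").split("\t")
        c_idx, r_idx = header.index(contig_col), header.index(read_col)
        for line in src:
            fields = line.rstrip("\n").split("\t")
            reads_by_contig[fields[c_idx]].add(fields[r_idx])

    # Choose which reads to keep; contigs under the cap are kept whole.
    keep = set()
    for contig, reads in reads_by_contig.items():
        chosen = sorted(reads)
        if len(chosen) > max_reads:
            chosen = rng.sample(chosen, max_reads)
        keep.update((contig, read) for read in chosen)

    # Pass 2: copy the header and every row that belongs to a kept read.
    with open(src_path) as src, open(dst_path, "w") as dst:
        dst.write(src.readline())
        for line in src:
            fields = line.rstrip("\n").split("\t")
            if (fields[c_idx], fields[r_idx]) in keep:
                dst.write(line)

    # Number of reads retained per contig.
    return {c: min(len(r), max_reads) for c, r in reads_by_contig.items()}
```

The filtered file could then be passed to m6anet dataprep with a --readcount_max at least as large as `max_reads`. Only read ids are held in memory, so the two passes stay cheap even for a large eventalign.txt.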

yuukiiwa commented 7 months ago

Hi @mnsmar,

Thanks for the explanation of your data!

Given that most reads cover the same sites, you can consider splitting them into smaller eventalign.txt files and running m6anet dataprep separately on each.
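A minimal sketch of such a split, assuming eventalign.txt is a tab-separated nanopolish eventalign table with a single header line and a `read_index` column. The helper name `split_eventalign` and the output file names are hypothetical. Reads are assigned to parts round-robin, so all rows of a read land in the same file and each part sees only a fraction of the reads at the crowded locus.

```python
from pathlib import Path


def split_eventalign(path, out_dir, n_chunks, read_col="read_index"):
    """Split an eventalign table into n_chunks files, keeping each read whole.

    The column name is an assumption based on nanopolish eventalign output.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    out_paths = [out_dir / f"eventalign_part{i}.txt" for i in range(n_chunks)]
    handles = [p.open("w") for p in out_paths]
    assignment = {}  # read id -> index of the part it was assigned to
    try:
        with open(path) as src:
            header = src.readline()
            col = header.rstrip("\n").split("\t").index(read_col)
            for handle in handles:
                handle.write(header)  # every part gets its own header
            for line in src:
                read = line.rstrip("\n").split("\t")[col]
                # First time a read is seen, give it the next part in turn.
                chunk = assignment.setdefault(read, len(assignment) % n_chunks)
                handles[chunk].write(line)
    finally:
        for handle in handles:
            handle.close()
    return out_paths
```

Each resulting file can then be given to its own `m6anet dataprep --eventalign ... --out_dir ...` run with a separate output directory. How the per-part results should be combined afterwards is not covered in this thread.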

Thanks!

Best wishes, Yuk Kei