nanoporetech / tombo

Tombo is a suite of tools primarily for the identification of modified nucleotides from raw nanopore sequencing data.

memory control of tombo #284

Open forrwill opened 4 years ago

forrwill commented 4 years ago

I am using Tombo on human data, and the memory usage of the resquiggle and detect_modifications steps is very large, up to 500 GB. Are there any parameters to set a maximum memory limit? Thank you!

marcus1487 commented 4 years ago

There are no parameters to specifically limit memory usage, but there are a couple of options to lower the memory required for this command.

High memory usage at the detect_modifications step usually results from very high coverage. Beyond a certain depth, additional coverage adds little benefit. If acceptable results can be achieved with less coverage, the tombo filter level_coverage command can be used to reduce it (see the Tombo docs).
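As a rough illustration of the coverage-filtering option, here is a minimal Python sketch that only builds the command line and does not run it. The --fast5-basedirs and --percent-to-filter flags are taken from my recollection of the Tombo docs, so please confirm them with tombo filter level_coverage --help; the directory path and the 20 percent value are hypothetical placeholders.

```python
import shlex


def level_coverage_cmd(fast5_dir, percent_to_filter=10):
    """Build (but do not run) a `tombo filter level_coverage` command.

    Flag names are assumed from the Tombo docs; verify with --help.
    """
    if not 0 < percent_to_filter < 100:
        raise ValueError("percent_to_filter must be between 0 and 100")
    return [
        "tombo", "filter", "level_coverage",
        "--fast5-basedirs", str(fast5_dir),
        "--percent-to-filter", str(percent_to_filter),
    ]


# Hypothetical path and percentage, for illustration only.
cmd = level_coverage_cmd("path/to/fast5s", percent_to_filter=20)
print(shlex.join(cmd))
```

The resulting list can be handed to subprocess.run(cmd, check=True) once the flags have been checked against the installed Tombo version.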

The other, more direct option for limiting the memory usage of the detect_modifications command is to pass a smaller value for --multiprocess-region-size (see the Tombo docs). This lowers the number of reference bases processed by each process and thus the memory required.
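A minimal Python sketch of what a call with a smaller region size might look like follows; it only assembles the arguments. The alternative_model subcommand, the --alternate-bases, --statistics-file-basename and --processes flags, and the example values (region size 1000, 4 processes, 5mC) are assumptions on my part rather than anything stated in this thread, so check them against tombo detect_modifications --help before use.

```python
import shlex


def detect_mods_cmd(fast5_dir, stats_basename, region_size=1000, processes=4):
    """Build (but do not run) a `tombo detect_modifications` command.

    A smaller region_size means fewer reference bases per worker process.
    Subcommand and flag names other than --multiprocess-region-size are
    assumptions; verify them with --help.
    """
    if region_size <= 0 or processes <= 0:
        raise ValueError("region_size and processes must be positive")
    return [
        "tombo", "detect_modifications", "alternative_model",
        "--fast5-basedirs", str(fast5_dir),
        "--alternate-bases", "5mC",  # hypothetical choice of modification
        "--statistics-file-basename", str(stats_basename),
        "--processes", str(processes),
        "--multiprocess-region-size", str(region_size),
    ]


# Hypothetical paths and values, for illustration only.
detect_cmd = detect_mods_cmd("path/to/fast5s", "sample.alt_model")
print(shlex.join(detect_cmd))
```

Since every worker holds its own region in memory, lowering --processes alongside the region size may also help, at the cost of a longer run time.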