nanoporetech / modkit

A bioinformatics tool for working with modified bases
https://nanoporetech.com/

num-reads not working #179

Open ShokodkoMariia opened 1 month ago

ShokodkoMariia commented 1 month ago

Hello! I have 197 GB of memory, and it is not enough to run summary on the whole BAM. Even when I work chromosome by chromosome, I hit the same problem with the seven largest ones. I've tried this with 15 CPUs, altering interval-size, sampling-frac, or num-reads:

    modkit summary \
      -t ${CPU} \
      --log-filepath ${OUT_DIR}/modkit_debug.log \
      --interval-size 10000 \
      --sampling-frac 0.1 \
      --include-bed ${OUT_DIR}/${TARGET_BED} \
      ${BAM_DIR}/${BAMFILE} > ${OUT_DIR}/modkit_summary.txt

And as a result for one of the chromosomes:

[image]

I saw that 119665 reads were used and thought that this could have caused the memory problem when working with the other chromosomes and the whole BAM. Then, when altering sampling-frac or num-reads, I saw that the figure of ~120000 reads stays the same every time (only when I set it to 0 was the number of reads 0). Is there a problem with my usage of num-reads/sampling-frac?

ArtRand commented 1 month ago

Hello @ShokodkoMariia could you tell me what version of modkit you're using?

ShokodkoMariia commented 1 month ago

I've tried both v0.2.7 and v0.2.8-rc1.

ArtRand commented 1 month ago

Hello @ShokodkoMariia,

Sorry for the delay. Could you give me the exact commands you're using and tell me whether or not they end up consuming excessive memory? Also, could you tell me roughly what the BED file you're using is like (i.e. how many regions there are and how big they are)?
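
One quick way to answer the BED question (how many regions, how big they are) is a short awk summary. This is only a minimal sketch: it assumes a tab-separated BED file with chrom, start, and end in the first three columns, and `bed_summary` is a hypothetical helper name, not part of modkit. Overlapping regions are counted once per line, so the total can overstate the bases actually covered.

```shell
# Summarize a BED file: region count, total bases, and region size range.
# Assumption: columns 1-3 are chrom, start (0-based), end (exclusive).
bed_summary() {
    awk -F'\t' '
        # Skip comment/header lines and lines with fewer than 3 columns
        /^(#|track|browser)/ || NF < 3 { next }
        {
            len = $3 - $2
            n++
            total += len
            if (n == 1 || len < min) min = len
            if (n == 1 || len > max) max = len
        }
        END {
            if (n == 0) { print "regions=0"; exit }
            # %.0f avoids integer overflow in some awk builds for large totals
            printf "regions=%d total_bp=%.0f min_bp=%.0f max_bp=%.0f mean_bp=%.1f\n",
                   n, total, min, max, total / n
        }
    ' "$1"
}

# Usage, with the variables from the command earlier in the thread:
# bed_summary "${OUT_DIR}/${TARGET_BED}"
```

The single output line can be pasted straight into a reply, which keeps the report short even when the BED file has many regions.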