lindenb / jvarkit

Java utilities for Bioinformatics
https://jvarkit.readthedocs.io/

Randomized runs of Biostar154220.html #212

Open tuannguyen8390 opened 2 years ago

tuannguyen8390 commented 2 years ago

Hi there,

I'm running Biostar154220 and getting great results from it. I've been able to downsample mapped reads from 60X to 20X, 10X, and 5X successfully. However, when I run it multiple times, I always get the same set of reads. I believe this is due to the way the program is structured (if reads have the same group name and are over the cap limit, they are discarded). Is there any way to add some sort of randomization, so that each lower-coverage subsample contains a different set of reads every time?

Many thanks,

Tuan

lindenb commented 2 years ago

Hi, a quick answer: no, there is no way to use randomization at this step. You can always try samtools view with the '-s' option as another way to downsample your reads.
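As a rough sketch of the samtools suggestion: the value given to `-s` is a number whose integer part is the random seed and whose fractional part is the fraction of reads to keep, so changing the seed should give a different subset on each run. The helper names (`subsample_arg`, `build_command`) and the file names below are hypothetical, and the fraction is simply target coverage divided by source coverage. Note that this subsamples reads uniformly across the file; it does not cap coverage per position the way Biostar154220 does.

```python
import shlex


def subsample_arg(seed: int, target_cov: float, source_cov: float) -> str:
    """Build the SEED.FRACTION value expected by `samtools view -s`.

    Hypothetical helper: the fraction is approximated as target/source coverage.
    """
    fraction = round(target_cov / source_cov, 4)
    if not 0.0 < fraction < 1.0:
        raise ValueError("target coverage must be below source coverage")
    if seed < 0:
        raise ValueError("seed must be a non-negative integer")
    # Keep only the digits after the decimal point, e.g. 0.3333 -> "3333".
    frac_digits = f"{fraction:.4f}".split(".")[1]
    return f"{seed}.{frac_digits}"


def build_command(in_bam: str, out_bam: str, seed: int,
                  target_cov: float, source_cov: float) -> list:
    """Assemble the samtools command line as an argument list."""
    return [
        "samtools", "view", "-b",
        "-s", subsample_arg(seed, target_cov, source_cov),
        "-o", out_bam,
        in_bam,
    ]


if __name__ == "__main__":
    # Three replicate subsamples of a 60X BAM down to roughly 20X,
    # each with a different seed (file names are placeholders).
    for seed in (1, 2, 3):
        cmd = build_command("input.bam", f"sub20x.seed{seed}.bam", seed, 20, 60)
        print(shlex.join(cmd))
```

The printed commands can be pasted into a shell or passed to `subprocess.run`. Reusing the same seed should reproduce the same subset, which is handy when a particular replicate needs to be regenerated.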