broadinstitute / pilon

Pilon is an automated genome assembly improvement and variant detection tool
GNU General Public License v2.0

Deeply sequenced sample -- can it be downsampled before running pilon to improve running times? #79

Closed by lfaller 5 years ago

lfaller commented 5 years ago

Hello,

I have some very deeply sequenced samples. In order to improve processing times I usually downsample them using bbnorm to about 100x coverage.

The wiki states that "... total sequence coverage should be 50x or greater, though deeper total coverage of >100x is beneficial".

Would you ever recommend downsampling? Is there an upper limit (e.g. 1000x) beyond which extra coverage adds nothing further?

Thanks for any advice! ~Lina

w1bw commented 5 years ago

Hi Lina, I'm finally catching up on Pilon support. Thanks for writing!

Extra coverage shouldn't hurt, except in terms of compute time. Generally, I find there's not much to be gained beyond 200x or so with Illumina sequencing.

Good luck!
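
For anyone who wants to turn the ~200x figure above into a concrete number, here is a minimal sketch of the arithmetic. It assumes plain uniform random subsampling of reads, which is not the same thing as the bbnorm normalization mentioned in the question. The function names, genome size, and read counts are all hypothetical and chosen only for illustration; nothing here is part of Pilon.

```python
def estimate_coverage(total_bases: int, genome_size: int) -> float:
    """Mean depth of coverage: total sequenced bases divided by genome length."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return total_bases / genome_size


def sampling_fraction(current_coverage: float, target_coverage: float) -> float:
    """Fraction of reads to keep so mean coverage drops to roughly the target.

    Returns 1.0 (keep everything) when the sample is already at or below
    the target.
    """
    if current_coverage <= 0:
        raise ValueError("current_coverage must be positive")
    return min(1.0, target_coverage / current_coverage)


# Hypothetical sample: 30 million reads of 150 bp on a 4.6 Mb genome.
coverage = estimate_coverage(30_000_000 * 150, 4_600_000)

# 200x is the rough point of diminishing returns suggested in this thread.
fraction = sampling_fraction(coverage, 200.0)

print(f"estimated coverage: {coverage:.0f}x, keep fraction: {fraction:.3f}")
```

The resulting fraction could then be passed to whichever read-subsampling tool you prefer. For paired-end data, both mates of a pair need to be kept or dropped together.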