ndierckx / NOVOPlasty

NOVOPlasty - The organelle assembler and heteroplasmy caller

is it possible to run NOVOPlasty on pooled individuals to increase coverage? #169

Closed TeresaPegan closed 3 years ago

TeresaPegan commented 3 years ago

Hello! I have some low-coverage WGS data that was sequenced to about 3x for another project. I am looking into whether I can also use this 3x data to assemble a mitochondrial genome using NOVOPlasty. I set the kmer to 21 as suggested in the documentation. So far, I have gotten some nice big contigs, but I don't think I'm getting the complete genome and some samples have worked better than others.

I am wondering whether 3x coverage is a bit too low for NOVOPlasty to work very well, and if that is the case, would it be possible to try pooling samples together in order to increase coverage depth for the sake of getting a complete draft reference? How would NOVOPlasty handle the SNPs that would be in the dataset as a result of it coming from multiple individuals?

For some context about my sampling, I have the potential to pool about 10 samples that all come from one place, and about 40 samples that all belong to the same subspecies (but come from more disparate places and probably have more heterogeneity in them than the set of 10 from one place). My study organism is a non-model bird species.

Thanks! -Teresa

ndierckx commented 3 years ago

3x is not much. There will be regions with only 1x coverage too, and that won't work. Pooling them is no problem: if they are all the same species, there will only be small differences. Those differences will be represented by ambiguous nucleotides in the assembly (N, W, R, ...). These are the codes: http://www.bioinformatics.org/sms2/iupac.html
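To illustrate what those ambiguity codes mean in practice, here is a minimal Python sketch that lists the ambiguous positions in an assembled sequence and the bases each code stands for. The mapping follows the standard IUPAC nucleotide table. The function name `ambiguous_sites` and the example sequence are made up for illustration and are not part of NOVOPlasty.

```python
# Standard IUPAC nucleotide codes mapped to the bases they can represent.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}


def ambiguous_sites(seq):
    """Return (1-based position, code, possible bases) for each ambiguity code.

    Hypothetical helper for inspecting an assembly, not a NOVOPlasty function.
    """
    sites = []
    for pos, base in enumerate(seq.upper(), start=1):
        if base not in IUPAC:
            raise ValueError(f"unexpected character {base!r} at position {pos}")
        if len(IUPAC[base]) > 1:
            sites.append((pos, base, IUPAC[base]))
    return sites


# Made-up example sequence; a real one would be read from the assembly FASTA.
for pos, code, bases in ambiguous_sites("ACGTRACWTN"):
    print(pos, code, "/".join(bases))
```

In a pooled assembly, positions reported this way would be candidates for sites that differ between the pooled individuals.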

TeresaPegan commented 3 years ago

Thank you, I will give this a try! (Also just in case it wasn't clear, my whole genome was sequenced to 3x, but the mitochondrial reads have average depth in the hundreds -- but nonetheless, I think there are some gaps where coverage is lower)
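One way to check for low-coverage gaps like these is to scan per-base depth values for stretches below a threshold. The sketch below assumes the depths are already available as a list with one value per position (for example, from mapping the reads back to the contigs, which is not shown here). The function name `low_coverage_regions`, the threshold of 10, and the depth values are all hypothetical.

```python
def low_coverage_regions(depths, min_depth):
    """Return (start, end) 1-based inclusive intervals where depth < min_depth.

    Hypothetical helper; `depths` is a list with one depth value per position.
    """
    regions = []
    start = None
    for pos, depth in enumerate(depths, start=1):
        if depth < min_depth:
            if start is None:
                start = pos  # a low-coverage run begins here
        elif start is not None:
            regions.append((start, pos - 1))  # the run ended at the previous position
            start = None
    if start is not None:
        regions.append((start, len(depths)))  # the run extends to the last position
    return regions


# Made-up depths: mostly in the hundreds, with two low-coverage stretches.
depths = [250, 310, 4, 2, 0, 180, 220, 1, 1]
print(low_coverage_regions(depths, min_depth=10))
```

Comparing these intervals before and after pooling would show whether the extra samples actually fill the gaps.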