HadrienG / InSilicoSeq

:rocket: A sequencing simulator
https://insilicoseq.readthedocs.io
MIT License

Simulating exomes #197

Closed by cilliannolan 11 months ago

cilliannolan commented 3 years ago

Hi there,

I am hoping to use InSilicoSeq to simulate exomes.

I have been able to simulate them; however, after processing, ~60% of the reads in my BAMs are removed as duplicates by Picard. Do you know what might be causing this issue?

HadrienG commented 3 years ago

Hi!

InSilicoSeq was not designed with exome sequencing in mind. My guess is that your coding sequences are very small compared to microbial genomes, so InSilicoSeq may be oversequencing them. How many reads are you simulating, and what is your expected coverage?

Best, Hadrien.
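
A quick way to check for oversequencing is to estimate the expected depth from the read count, read length, and total size of the reference. The sketch below uses the usual estimate, coverage = reads × read length / target size. The function names and the example numbers (1,000,000 reads, 150 bp reads, a 30 Mb target) are illustrative assumptions. They are not values from this thread and not part of InSilicoSeq.

```python
import math


def expected_coverage(n_reads: int, read_length: int, target_size: int) -> float:
    """Mean depth of coverage if reads are spread evenly over the target."""
    return n_reads * read_length / target_size


def reads_for_coverage(coverage: float, read_length: int, target_size: int) -> int:
    """Number of reads needed to reach a given mean depth (rounded up)."""
    return math.ceil(coverage * target_size / read_length)


# Hypothetical example values, chosen only for illustration.
n_reads = 1_000_000        # total reads simulated
read_length = 150          # bases per read
target_size = 30_000_000   # total bases in the reference FASTA

depth = expected_coverage(n_reads, read_length, target_size)
needed = reads_for_coverage(30, read_length, target_size)

print(f"expected mean coverage: {depth:.1f}x")
print(f"reads needed for 30x:   {needed}")
```

If the expected depth is much higher than intended for the size of the target, many read pairs will share the same start positions, and a duplicate-marking tool will flag them.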