nh13 / DWGSIM

Whole Genome Simulator for Next-Generation Sequencing
GNU General Public License v2.0

-H option #45

Closed IleaHeft closed 6 years ago

IleaHeft commented 7 years ago

Hello,

I am trying to run dwgsim. I tried running dwgsim with and without the -H option and the same coverage levels are produced. Can you please provide clarification on the correct way to utilize the -H option? Below are examples of what I am running.

Without -H: dwgsim -1 150 -2 150 -e 0.0026 -E 0.004 -C 30.0 -c 0 ~/my-simulation/domain-fasta/NBPF1_CON1_1.fa NBPF1_CON1_1_30

With -H: dwgsim -H -1 150 -2 150 -e 0.0026 -E 0.004 -C 30.0 -c 0 ~/my-simulation/domain-fasta/NBPF1_CON1_1.fa NBPF1_CON1_1_30

Additionally, in the default diploid mode, I expected that with -C 30 the mean coverage would be 30; however, the mean coverage being generated is only 15. Any insights would be appreciated.

Thank you,

Ilea Heft ilea.heft@ucdenver.edu

nh13 commented 7 years ago

Thanks for logging your issue. I'll take a look over the next week or so, as I am currently away.

nh13 commented 7 years ago

@IleaHeft the "-H" option causes all reads to come from the same copy of the genome (haploid), so -C 30.0 means 30x coverage of that haploid copy. For diploid genomes, -C 30.0 gives 30x combined coverage across both copies, so on average 15x from each copy. Let me know if that's what you see (or not).
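
To make the arithmetic in this explanation concrete, here is a minimal Python sketch. It is not part of dwgsim; the function name `per_copy_coverage` is hypothetical, and the even split across two copies is an assumption taken from the description above ("on average"), so individual runs may differ.

```python
def per_copy_coverage(total_coverage: float, haploid: bool) -> float:
    """Expected mean coverage of each genome copy for a given -C value.

    Assumption (from the explanation in this thread): with -H all reads
    come from a single copy; without it, reads are split on average
    evenly across the two copies of a diploid genome.
    """
    copies = 1 if haploid else 2
    return total_coverage / copies


# -H -C 30.0: all 30x lands on the one haploid copy.
print(per_copy_coverage(30.0, haploid=True))   # 30.0

# Default diploid mode, -C 30.0: 30x combined, about 15x per copy.
print(per_copy_coverage(30.0, haploid=False))  # 15.0
```

Under this reading, measuring coverage against a single copy of the reference in diploid mode would show roughly half the value passed to -C, which matches the 15x reported above.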