kcleal / dysgu

Toolkit for calling structural variants using short or long reads
MIT License
Can dysgu be used in haploid genome? #57

Closed wushyer closed 1 year ago

wushyer commented 1 year ago

Hi, can I use dysgu on paired-end (PE) resequencing reads from a haploid genome? Thanks. Best, Shuangyang

kcleal commented 1 year ago

Hi @wushyer, yes, it should work even better than on a diploid genome.

wushyer commented 1 year ago

Thanks for the quick reply. That's very good to know!

Best, Shuangyang

wushyer commented 1 year ago

Hi Kez,

If I understand the tool correctly, setting '--diploid FALSE' is all I need for non-diploid species. Is that right? Thanks.

By the way, I have run it and got genotypes of '0/1'. How should I interpret this label? My species has a haploid genome.

Best Shuangyang

kcleal commented 1 year ago

Oh, I see what you mean. The diploid false option is really meant for polyploid genomes or non-clonal samples (cancer samples etc.) rather than haploid ones. Using the diploid false option on a haploid genome will probably result in worse performance, as some features will not be used in the machine learning classifier. I recommend using diploid true, even for a haploid genome.

With regard to the 0/1 genotype, this can happen at some sites if there are ambiguous mappings. There is no option to force a haploid genotype call. If you know your sample is haploid, you can ignore these calls or change them to a haploid genotype.
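For anyone who wants to change the genotypes after calling, here is a minimal sketch of one way to collapse diploid GT values to haploid ones in a VCF. It is not part of dysgu: the function names `haploid_gt` and `convert_line` are hypothetical, and the rule that a heterozygous call such as 0/1 becomes the alternate allele is an assumption that may not suit every dataset.

```python
def haploid_gt(gt: str) -> str:
    """Collapse a diploid GT string (e.g. '0/1') to a single haploid allele."""
    alleles = gt.replace("|", "/").split("/")
    called = [a for a in alleles if a != "."]
    if not called:
        return "."  # no call stays a no call
    # Assumption: any non-reference allele means the variant is present,
    # so '0/1' and '1/1' both become '1'.
    non_ref = [a for a in called if a != "0"]
    return non_ref[0] if non_ref else "0"


def convert_line(line: str) -> str:
    """Rewrite the GT value of every sample column in one VCF line."""
    if line.startswith("#"):
        return line  # header lines pass through unchanged
    newline = "\n" if line.endswith("\n") else ""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 10:
        return line  # no FORMAT/sample columns to edit
    keys = fields[8].split(":")
    if "GT" not in keys:
        return line
    gt_idx = keys.index("GT")
    for i in range(9, len(fields)):
        values = fields[i].split(":")
        if gt_idx < len(values):
            values[gt_idx] = haploid_gt(values[gt_idx])
            fields[i] = ":".join(values)
    return "\t".join(fields) + newline
```

Each line of the output VCF can be passed through `convert_line` and written back out. Sites where a 0/1 call stems from ambiguous mappings may still deserve a manual look before they are treated as true variants.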

wushyer commented 1 year ago

Thanks Kez! Best.