millanek / Dsuite

Fast calculation of Patterson's D (ABBA-BABA) and the f4-ratio statistics across many populations/species

Dsuite for triploids #74

Closed sstwins21 closed 1 year ago

sstwins21 commented 1 year ago

Hi,

This might not be a software issue, but would it be okay to ask a theoretical question about running Dsuite?

I want to run Dsuite on polyploid taxa that include both diploids and triploids. Since Dsuite only works on bi-allelic sites, I removed all tri-allelic sites and converted the triploid genotypes to diploid ones by recoding heterozygous calls of 0/1/1 or 0/0/1 as 0/1 in the VCF. However, ABBA-BABA relies on allele frequencies, which should normally be 50:50 at a heterozygous site in a diploid sample. Would it therefore be okay to halve the value for the two-copy allele in the AD field of the triploids? Would this violate any of the underlying theory?

Sorry, I have realized that the allele frequency in question is not the AD field in the VCF file but rather the allele frequency in the population.

Kind regards,
Shane
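
To make the distinction between per-sample AD values and population allele frequencies concrete, here is a minimal Python sketch. It is an illustration only, not Dsuite's code: the function names `alt_allele_frequency` and `abba_baba_site` are hypothetical, and the site pattern formula is the standard frequency-based formulation of ABBA and BABA, assuming that `1` codes the derived allele. The first function counts allele copies across all samples of a population from their GT strings, whatever their ploidy. The second turns four population frequencies into per-site ABBA and BABA contributions.

```python
def alt_allele_frequency(genotypes):
    """Population frequency of allele '1' from VCF GT strings of any ploidy.

    Accepts calls such as '0/1', '0|1' or '0/1/1'; missing alleles ('.')
    are skipped. Returns None if no allele was called.
    """
    alt = 0
    total = 0
    for gt in genotypes:
        for allele in gt.replace("|", "/").split("/"):
            if allele == ".":
                continue
            total += 1
            if allele == "1":
                alt += 1
    return alt / total if total else None


def abba_baba_site(p1, p2, p3, p_out):
    """Per-site ABBA and BABA contributions from derived-allele frequencies
    in populations P1, P2, P3 and the outgroup (standard formulation)."""
    abba = (1 - p1) * p2 * p3 * (1 - p_out)
    baba = p1 * (1 - p2) * p3 * (1 - p_out)
    return abba, baba


# Hypothetical population with one triploid and one diploid sample.
original = alt_allele_frequency(["0/1/1", "0/0"])   # 2 of 5 copies
collapsed = alt_allele_frequency(["0/1", "0/0"])    # 1 of 4 copies
```

In this toy case, recoding the triploid call 0/1/1 as 0/1 changes the population frequency estimate from 2/5 to 1/4. The sketch reads only the GT field, so editing AD values would not change its result.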