millanek / Dsuite

Fast calculation of Patterson's D (ABBA-BABA) and the f4-ratio statistics across many populations/species

Three questions about the interpretation of Dsuite results #59

Closed RezaFahi closed 1 year ago

RezaFahi commented 1 year ago

Hi, I have three simple questions about the Dsuite results:

  1. Suppose the results indicate gene flow between P2 and P3 (for example, Z-score > +3 and p-value < 0.05). How can we determine the direction of this gene flow? Can the Dsuite results answer this, or do we need other programs?

  2. In the Dtrios output, does the f4-ratio range from zero to one? For example, does a value of 0.035 mean that there is 3.5 percent admixture between P2 and P3?

  3. Why are the f4-ratio and the D statistic different for the two trios below?

    P1    P2    P3    Dstatistic  Z-score  p-value      f4-ratio   BBAA         ABBA    BABA
    Bc1   Bc2   Dro2  0.191536    6.08768  1.14559e-09  0.0297122  2.21398e+06  210055  142524
    Dro1  Dro2  Bc2   0.313128    11.5109  2.3e-16      0.0289916  2.31764e+06  135908  71090.7

Don't both trios test for introgression between Bc2 and Dro2?

millanek commented 1 year ago

Hi @RezaFahi

  1. Dsuite cannot tell you the direction of gene flow. I suggest you look into Dfoil (https://academic.oup.com/sysbio/article/64/4/651/1650669).
  2. Yes, the f4-ratio estimates the admixture proportion.
  3. Each calculation makes a comparison against the P1 species, and P1 differs between the two trios (Bc1 and Dro1). The resulting difference in the statistics can simply be sampling variation, but there can also be a real biological reason (e.g. the divergence time between Bc1 and Bc2 is different from the divergence time between Dro1 and Dro2).
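
As a quick sanity check (not part of Dsuite itself), the Dstatistic values in the table from the question can be reproduced from the ABBA and BABA columns using the standard definition D = (ABBA - BABA) / (ABBA + BABA). Below is a minimal Python sketch; the function name `d_statistic` and the `trios` dictionary are illustrative names chosen here, and the counts are copied from the quoted output.

```python
def d_statistic(abba: float, baba: float) -> float:
    """Patterson's D from ABBA and BABA site-pattern counts."""
    return (abba - baba) / (abba + baba)


# (P1, P2, P3) -> (ABBA, BABA), copied from the Dtrios output in the question.
trios = {
    ("Bc1", "Bc2", "Dro2"): (210055, 142524),
    ("Dro1", "Dro2", "Bc2"): (135908, 71090.7),
}

# Compute D for each trio and print it next to the trio's population names.
results = {trio: d_statistic(abba, baba) for trio, (abba, baba) in trios.items()}
for (p1, p2, p3), d in results.items():
    print(f"{p1} {p2} {p3}: D = {d:.4f}")
```

Both results match the Dstatistic column to within about 1e-5 (the printed counts are rounded). The two trios give different values because their ABBA and BABA counts differ, and those counts depend on which population serves as P1.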

Hope this helps, Milan

RezaFahi commented 1 year ago

Thanks, dear @millanek