uqrmaie1 / admixtools

https://uqrmaie1.github.io/admixtools
71 stars 14 forks source link

large difference of f3 result between admixtools and admixtools2 #60

Open hungweichen0327 opened 7 months ago

hungweichen0327 commented 7 months ago

Dear @uqrmaie1 and community,

I would like to ask the outgroup f3 value between admixtools and admixtools2. I used the same dataset to run admixtools and admixtools2:

admixtools:

The script-

qp3Pop -p par.784_outgroup_f3 > outgroup_f3_k8_Q0.7

admixtools2:

The script in R library (admixtools)-

# get f2 block
f2_blocks = f2_from_precomp(my_f2_dir)

# set population for 𝑓3(𝐴;𝐵,𝐶)
pop1 = "outgroup"
pop2 = c("SubAUw", "SubAUe", "SubAS_SA", "SubAS_SEA", "SubIDN", "SubAF", "Rad1", "Rad2")
pop3 = c("SubAUw", "SubAUe", "SubAS_SA", "SubAS_SEA", "SubIDN", "SubAF", "Rad1", "Rad2")

# run f3 statistics
outgroup_f3 = qp3pop(f2_blocks, pop1, pop2, pop3)

However, the output value of f3 is quite different. For example: admixtools:

 Source 1             Source 2               Target           f_3       std. err           Z    SNPs
 result:                SubAUe               SubAUw             outgroup      3.264038       0.112766      28.945 14028730

admixtools2:

pop1    pop2    pop3    est     se      z       p
outgroup        SubAUe  SubAUw  0.22800450361123        0.00370301318335037     61.5726956189065        0

The f3 values are 3.26 in admixtools vs 0.22 in admixtools2.

Thank you for the help.

uqrmaie1 commented 6 months ago

There are number of parameters that can affect f3-statistics. The default parameters in Admixtools 2 mostly match those of Admixtools, but there are some differences:

Here is an example for how to compute f3-statistics in 3 different settings in Admixtools 2 and Admixtools (via wrapper functions), so that the results are almost identical:

library(admixtools)
library(tidyverse)

prefix = '~/Downloads/v42.1_small'
qp3pop = '~/Downloads/AdmixTools/bin/qp3Pop'

pops = dimnames(example_f2_blocks)[[1]]

f2b = f2_from_geno(prefix, pops = pops[1:3], poly_only = F)

# target diploid
# outgroupmode NO (default with geno prefix, not possible with precomputed f2)
qp3pop_wrapper(prefix, pops[2], pops[3], pops[1], bin = qp3pop, outgroupmode = F)
f3(prefix, pops[1], pops[2], pops[3]) # est, se, z match well; n is higher
f3(prefix, pops[1], pops[2], pops[3], poly_only = T) # est, se, z match less well; n is identical

# outgroupmode YES
qp3pop_wrapper(prefix, pops[2], pops[3], pops[1], bin = qp3pop, outgroupmode = T)
f3(prefix, pops[1], pops[2], pops[3], outgroupmode = T) # est, se, z match (except *1000)
f3(f2b, pops[1], pops[2], pops[3])

# target pseudodiploid
qp3pop_wrapper(prefix, pops[1], pops[3], pops[2], bin = qp3pop, outgroupmode = T)
f3(prefix, pops[2], pops[1], pops[3], outgroupmode = T, apply_corr = F) # est, se, z match (except  *1000)

Please let me know if you still aren't able to get matching results from Admixtools and Admixtools 2!