Rosemeis / pcangsd

Framework for analyzing low depth NGS data in heterogeneous populations using PCA.
GNU General Public License v3.0
46 stars 11 forks source link

How to perform a HWE test? #65

Closed beausoleilmo closed 1 year ago

beausoleilmo commented 1 year ago

In the README, it says that PCAngsd can "PCAngsd can perform the following analyses: HWE test".

But I don't see an option that can do that.

So far I have

pcangsd -b MajorMinor.beagle.gz \
        -o hwe/gl.pcangsd \
        --inbreedSites \
        -t 5

The in R

library(RcppCNPy)
old.path = getwd()
setwd("~/Desktop/outserver/hwe/") # setting the path to the npy file 
i.sites <- RcppCNPy::npyLoad(filename = "gl.pcangsd.inbreed.sites.npy")
lrt.sites <- RcppCNPy::npyLoad(filename = "gl.pcangsd.lrt.sites.npy")
setwd(old.path) # Getting back to where we were 
pval = pchisq(lrt.sites,1, lower.tail=FALSE)

# plots 
hist(i.sites, breaks = 250)

plot(i.sites~c(1:length(i.sites)), 
     main = "gl.pcangsd.inbreed.sites.npy",
     pch = ".", 
     col = ifelse(-log10(pval)>200,"red","black"))
abline(h = 0, lty = 2, col = "gray50", lwd = 3)

plot(-log10(pval)~c(1:length(pval)), pch = ".")

So:

Is everything correct here? Where are the estimates of observed vs expected heterozygosity?

Rosemeis commented 1 year ago

Hi,

Yes everything is correct. For your last point, the estimates of observed and expected heterozygosity is not outputted unfortunately and only summarized by F. Here F is 1 - obs(het)/exp(het) accounting for population structure using individual allele frequencies as defined in https://doi.org/10.1111/1755-0998.13019.

Best, Jonas

beausoleilmo commented 1 year ago

OK thanks!