Closed biozzq closed 3 years ago
Since EigenGWAS is an unsupervised method, it does not need to know how many groups the sample contains. EigenGWAS learns from the data itself how best to group the individuals.
My suggestion is to try it first and then see whether the result makes sense.
Dear @gc5k
Thank you. I tried it on my data; the command and logs are as follows. The Lambda GC is 126.5, indicating substantial population stratification. However, no significant sites were identified after GC correction. What do you think about my data? Can I use the raw p-values to identify loci under selection?
java -jar -Xms10G -Xmx100G gear.jar eigengwas --bfile plink --ev 2 --out out
[INFO] 542 individuals were matched for analysis.
[INFO]
[INFO] Calculating locus statistics with 1 threads.
[INFO] Average MAF: 0.2475
[INFO] Average variance: 0.3814
[INFO] Average missing rate: 0.0070
[INFO] Calculating eGWAS with 1 thread.
[INFO] Median of p values is 3.241851231905457E-14
[INFO] Lambda GC is: 126.59680460266985
[INFO] 542 individuals were matched for analysis.
[INFO]
[INFO] Calculating locus statistics with 1 threads.
[INFO] Average MAF: 0.2475
[INFO] Average variance: 0.3814
[INFO] Average missing rate: 0.0070
[INFO] Calculating eGWAS with 1 thread.
[INFO] Median of p values is 1.4095455993179407E-4
[INFO] Lambda GC is: 31.85171494145232
Sincerely, Zheng zhuqing
Your command line is fine and the output looks okay. My guess is that the sample is very unlikely to be a natural population, but rather a breeding population that has been highly selected. If you draw a Manhattan plot, the signals may, or should, sit around the p-value threshold without yet exceeding it.
The GC correction overcorrects the signals for certain kinds of data, such as yours. We have already fixed the problem but haven't published the algorithm yet.
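To make the overcorrection concrete, here is a minimal sketch of textbook genomic control in Python. It is not GEAR's code, and the helper names (`p_to_chi2`, `chi2_to_p`, `lambda_gc`, `gc_correct`) are made up for illustration. It assumes Lambda GC is the 1-df chi-square statistic of the median p-value divided by the median of a 1-df chi-square distribution (about 0.455), and that corrected p-values come from dividing every chi-square statistic by Lambda GC. Under that assumption, the median p-values in the logs above correspond to Lambda GC values of roughly 126 and 32.

```python
from math import erfc, sqrt
from statistics import NormalDist, median

_STD_NORMAL = NormalDist()
# Median of a 1-df chi-square distribution (about 0.4549).
CHI2_1DF_MEDIAN = _STD_NORMAL.inv_cdf(0.25) ** 2


def p_to_chi2(p):
    """Convert a two-sided p-value into a 1-df chi-square statistic."""
    # Use the lower tail so that very small p-values keep their precision.
    return _STD_NORMAL.inv_cdf(p / 2.0) ** 2


def chi2_to_p(chi2):
    """Upper-tail probability of a 1-df chi-square statistic."""
    return erfc(sqrt(chi2 / 2.0))


def lambda_gc(p_values):
    """Genomic inflation factor computed from the median p-value."""
    return p_to_chi2(median(p_values)) / CHI2_1DF_MEDIAN


def gc_correct(p_values, lam=None):
    """Divide each chi-square statistic by lambda and return new p-values."""
    if lam is None:
        lam = lambda_gc(p_values)
    return [chi2_to_p(p_to_chi2(p) / lam) for p in p_values]


if __name__ == "__main__":
    # Median p-value taken from the first log above.
    lam = lambda_gc([3.241851231905457e-14])
    print(f"lambda GC ~ {lam:.1f}")
    # A locus with a raw p-value of 1e-50 after dividing its statistic by lambda.
    print(gc_correct([1e-50], lam=lam))
```

With a Lambda GC this large, every test statistic is shrunk by two orders of magnitude, so even very strong raw signals end up far from a genome-wide threshold after correction.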
Dear @gc5k
All the Pgc values are nearly 1. Do you mean that I should use the raw p-values to generate the Manhattan plot? This population was sampled worldwide, and population structure analysis indicated that these individuals cluster by geographic distribution, except for the admixed ones. The population also includes both wild and domestic samples.
Thanks for your efforts; I look forward to seeing your new algorithm.
Sincerely, Zheng zhuqing
You may send over the *.egwas file and we can make a refined correction for you.
chen.guobo@foxmail.com.
Dear @gc5k
Thank you. As the *1.egwas file is about 700M after compression, I have uploaded it to Google Drive, and you can download it using the following link: https://drive.google.com/file/d/1IvILrv6UPTW3E0pB2UVGrWXH69s3L4mH/view?usp=sharing
Sincerely, Zheng zhuqing
We will take a look. Hold on.
What's your email?
Dear @gc5k
Here is my email address: zzq1207@126.com
Thank you.
Sincerely, Zheng zhuqing
Dear all,
I wonder if I can run EigenGWAS when I have more than ten groups. Thank you.
Best wishes, Zheng zhuqing