Illumina / akt

Ancestry and Kinship Tools
GNU General Public License v3.0

AKT PCA....Bad genotypes #41

Open gaochengPRC opened 2 years ago

gaochengPRC commented 2 years ago

Hi,

When I ran akt pca, I got the following message:


MAF lower bound: 0
Thin: 1
Number principle components: 20
Reading data...
984 samples
Bad genotypes at 1:758351


Can you explain what "Bad genotypes" means in this message? How should I handle this issue?

Thank you!

jaredo commented 2 years ago

It could not parse the genotype field at that position. What does bcftools view -H -r 1:758351 look like in your data?

gaochengPRC commented 2 years ago

1 758351 1:693731 A G . PASS AF=0.12008;MAF=0.12008;R2=0.57058;AN=6762;AC=1299 DS 0.037 0.766 0.036 0.078 ......

It looks normal, so I am very confused!!!

jaredo commented 2 years ago

The software requires the FORMAT/GT field to be present; it cannot use imputed dosages (FORMAT/DS).
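To illustrate the point, here is a minimal Python sketch (not part of akt) that checks whether a VCF data line declares GT in its FORMAT column. The helper names `format_keys` and `has_gt` are hypothetical, and splitting on any whitespace is a simplification chosen so the record pasted above can be used directly; real VCF files are tab-delimited.

```python
def format_keys(vcf_line: str) -> list[str]:
    """Return the FORMAT keys (9th column) of a VCF data line.

    Assumption: columns are split on any whitespace so that a record
    pasted with spaces also works; real VCF files are tab-delimited.
    """
    fields = vcf_line.rstrip("\n").split()
    if len(fields) < 9:
        # Fewer than 9 columns: no FORMAT column (sites-only record).
        return []
    return fields[8].split(":")


def has_gt(vcf_line: str) -> bool:
    """True if the record declares a GT entry in its FORMAT column."""
    return "GT" in format_keys(vcf_line)


# The record from this thread, with the elided sample columns left out.
record = (
    "1 758351 1:693731 A G . PASS "
    "AF=0.12008;MAF=0.12008;R2=0.57058;AN=6762;AC=1299 "
    "DS 0.037 0.766 0.036 0.078"
)

print(format_keys(record))  # the only FORMAT key here is DS
print(has_gt(record))
```

For this record the FORMAT column holds only DS, so the check reports that GT is absent, which matches the explanation above.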

jaredo commented 2 years ago

You could convert your dosages to hard genotypes, but I think you would be better off running PCA on the microarray genotypes rather than the imputed ones.

gaochengPRC commented 2 years ago

One approach I am trying is to convert the VCF dosage file to PLINK format, but this loses some information because the dosage values are rounded to 0, 1, or 2. I do not know whether there is a better way. Thank you so much!
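As a rough illustration of the rounding described here, the following Python sketch converts dosages to the nearest hard genotype and reports how far each call is from the original value. The function name `hard_call` is hypothetical, and round-half-up to the nearest integer is an assumption made for the example; it is not necessarily the rule PLINK applies during conversion.

```python
def hard_call(dosage: float) -> int:
    """Round an alternate-allele dosage in [0, 2] to the nearest genotype.

    Assumption: ties (0.5, 1.5) round up. This is an illustrative rule,
    not a description of any particular tool's behaviour.
    """
    if not 0.0 <= dosage <= 2.0:
        raise ValueError(f"dosage out of range: {dosage}")
    return min(2, int(dosage + 0.5))


# The four dosages visible in the record posted in this thread.
dosages = [0.037, 0.766, 0.036, 0.078]
calls = [hard_call(d) for d in dosages]

for d, g in zip(dosages, calls):
    # The absolute difference is the information discarded by the hard call.
    print(f"dosage={d:.3f} -> genotype={g} (difference {abs(d - g):.3f})")
```

In this small example the second sample (dosage 0.766) is called as a heterozygote, which is the kind of uncertainty that is lost once only 0, 1, or 2 is stored.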

jaredo commented 2 years ago

PLINK1.9 also has fast randomized PCA routines these days and may be a better choice.