Open 23andme-jaredo opened 1 year ago
As with the plink .bed format, haploid vs. diploid is not directly encoded in the .pgen. Instead, plink and plink2 divide the encoded values by two when the .bim/.pvar (and on chrX, .fam/.psam) file indicates that we're dealing with haploid data.
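To make the halving concrete, here is a minimal sketch of the idea only, not of plink's actual code. The helper name `decode_allele_count` and the use of `None` for a missing call are assumptions made for illustration.

```python
def decode_allele_count(encoded, is_haploid):
    """Interpret a stored value on the 0..2 diploid scale.

    `encoded` is a hypothetical stand-in for the value stored in the file;
    `is_haploid` stands in for what the .bim/.pvar (and, on chrX,
    .fam/.psam) tells the reader about ploidy. None marks a missing call.
    """
    if encoded is None:
        return None
    # The file itself does not record ploidy, so the same stored value
    # is halved when the metadata says the data are haploid.
    return encoded / 2 if is_haploid else float(encoded)


stored = [0, 2, None, 2]
haploid_view = [decode_allele_count(v, True) for v in stored]
diploid_view = [decode_allele_count(v, False) for v in stored]
```

The same stored values give different allele counts depending only on the ploidy flag, which is why the metadata files matter here.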
Hmm, so I am a bit confused. I have imputed data converted from BCF via:
plink2 --bcf $bcf dosage=HDS --make-pfile
and I can see that the two haploid dosages per individual are stored because I can recover them via:
plink2 --pfile plink2 --export vcf bgz vcf-dosage=HDS
so I am trying to extract those HDS values via pgenlib.
Maybe I wasn't clear that I meant imputed haploid/phased probabilities, not hard genotypes.
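For anyone reading along, this is roughly what I mean by the two per-haplotype values. It is a rough sketch that pulls HDS out of one data line of the exported VCF with plain Python; it does not use pgenlib at all. The helper name `hds_from_vcf_line` and the example record are made up for illustration, and it assumes a tab-delimited line with HDS listed in the FORMAT column.

```python
def hds_from_vcf_line(line):
    """Return one (hap1, hap2) dosage tuple per sample, or None if missing."""
    fields = line.rstrip("\n").split("\t")
    fmt_keys = fields[8].split(":")       # FORMAT column, e.g. "GT:HDS"
    hds_idx = fmt_keys.index("HDS")       # raises ValueError if HDS is absent
    result = []
    for sample in fields[9:]:             # sample columns start at index 9
        parts = sample.split(":")
        if hds_idx >= len(parts):
            result.append(None)           # trailing fields were dropped
            continue
        values = parts[hds_idx].split(",")
        if any(v == "." for v in values):
            result.append(None)           # missing haplotype dosage
        else:
            result.append(tuple(float(v) for v in values))
    return result


# Made-up record with two samples, for illustration only.
line = "1\t1000\trs1\tA\tG\t.\t.\t.\tGT:HDS\t0|1:0.1,0.9\t1|1:1,0.95"
pairs = hds_from_vcf_line(line)
# The unphased dosage is the sum of the two haplotype dosages.
totals = [a + b for a, b in pairs]
```

So the phased values carry more than the summed dosage: both haplotype values are kept, not just their total.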
Oh, sorry, I thought you were referring to e.g. chrX/chrY/chrM.
The PgrGetDp() function in pgenlib_read.h is the simplest one that can return biallelic phased dosages.
Thanks! We'll try exposing that in python.
Is it possible to read haploid dosages with pgenlib.PgenReader?
Thanks,
Jared