Closed Ojami closed 1 year ago
Hi Oveis,
1- For the PGEN format, REGENIE uses the Mach Rsq INFO score, whereas for the BGEN format it uses the IMPUTE INFO score (see here for details). The IMPUTE INFO score requires genotype probabilities, which is why we don't use it for PGEN, which only contains dosages.
2- There is no difference between the two file formats here as long as you have the males coded as 0/2 for non-PAR X.
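To make the distinction concrete, here is a minimal sketch of the two scores using their commonly cited textbook definitions. It is not REGENIE's actual code, and the function names `mach_rsq` and `impute_info` are made up for illustration; REGENIE's implementation may differ in details. It also shows one way a dosage-based ratio can exceed 1: when haploid males are coded 0/2, the dosage variance can be larger than the 2p(1-p) expected for diploids. Whether that fully accounts for the values seen in this thread is an assumption.

```python
import math

def mach_rsq(dosages):
    """Mach Rsq: observed dosage variance over the variance expected
    under Hardy-Weinberg, 2p(1-p). Needs dosages only."""
    n = len(dosages)
    mean = sum(dosages) / n
    p = mean / 2.0  # estimated allele frequency
    if p <= 0.0 or p >= 1.0:
        return math.nan  # undefined for monomorphic variants
    var = sum((d - mean) ** 2 for d in dosages) / n
    return var / (2.0 * p * (1.0 - p))

def impute_info(probs):
    """IMPUTE INFO: 1 minus the summed per-sample genotype variance over
    2*N*theta*(1-theta). Needs (P(0), P(1), P(2)) for every sample."""
    n = len(probs)
    e = [p1 + 2.0 * p2 for _, p1, p2 in probs]  # expected dosage
    f = [p1 + 4.0 * p2 for _, p1, p2 in probs]  # expected squared dosage
    theta = sum(e) / (2.0 * n)  # estimated allele frequency
    if theta <= 0.0 or theta >= 1.0:
        return math.nan
    uncertainty = sum(fi - ei ** 2 for fi, ei in zip(f, e))
    return 1.0 - uncertainty / (2.0 * n * theta * (1.0 - theta))

# Hypothetical toy data, not taken from UK Biobank.
diploid_hard_calls = [0, 1, 1, 2]   # in Hardy-Weinberg proportions
males_coded_0_2 = [0, 2, 0, 2]      # haploid calls doubled to 0/2
certain_probs = [(1, 0, 0), (0, 1, 0), (0, 1, 0), (0, 0, 1)]

print(mach_rsq(diploid_hard_calls))  # 1.0
print(mach_rsq(males_coded_0_2))     # 2.0
print(impute_info(certain_probs))    # 1.0
```

In this sketch the probability-based score is bounded above by 1 because the per-sample variance term is never negative, while the dosage-based ratio has no such bound.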
Cheers, Joelle
Hi Joelle,
In the documentation, it has been suggested that users should turn to pgen/bed files for analysing sex chromosomes:
I'm using the UK Biobank imputation BGEN files and realized that REGENIE works just fine even with BGEN files (all coded as diploid already?), and the stats are the same as when I use a PGEN file (generated from the BGEN file), except for the INFO score, which is > 1 for PGEN.
PGEN:
BGEN:
So, my questions are:
1- Why are the INFO scores different? This INFO score also differs from the one originally released by UKBB, which is based on QCTOOL (the difference between the REGENIE and QCTOOL INFO scores is much smaller for autosomal chromosomes).
2- What's the difference between using BGEN and PGEN here? Aren't they the same, at least for UKBB (it seems males are already coded as diploid)?
Thanks! Oveis