broadinstitute / kage-lite-development

Allele frequency calculation #30

Open danielben-isvy opened 6 months ago

danielben-isvy commented 6 months ago

In the single-sample VCFs produced by the KAGE pipeline, the AF INFO field is calculated from the genotype dosage (DS) rather than the called genotype (GT). As a result, single-sample AFs do not match the expected values of 0, 0.5, or 1.
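To make the discrepancy concrete, here is a minimal sketch of the two calculations for one diploid sample at a biallelic site. The function names are hypothetical, and the assumption that the dosage-based AF is simply DS divided by ploidy is mine, not something confirmed from the pipeline code.

```python
def af_from_dosage(ds: float, ploidy: int = 2) -> float:
    """Single-sample AF derived from the ALT dosage (assumed to be DS / ploidy)."""
    return ds / ploidy


def af_from_gt(gt: str) -> float:
    """Single-sample AF derived from the called genotype at a biallelic site."""
    # Treat phased ("|") and unphased ("/") separators the same way.
    alleles = [a for a in gt.replace("|", "/").split("/") if a != "."]
    if not alleles:
        return float("nan")  # fully missing genotype
    return sum(a != "0" for a in alleles) / len(alleles)


# A het call with an uncertain dosage: GT gives 0.5, DS gives something else.
print(af_from_gt("0/1"), af_from_dosage(1.3))
```

With a dosage of 1.3, the dosage-based value is 0.65, while the GT-based value for the same `0/1` call is 0.5.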

samuelklee commented 5 months ago

All INFO fields in the final case VCFs are actually added by the GLIMPSE imputation step, and I would hope the descriptions given for the RAF/AF fields are accurate.

Do we need to recalculate AF for the single-sample VCFs, or can we take care of this after merging into a cohort VCF?

danielben-isvy commented 5 months ago

It's probably easiest to just recalculate after merging to a cohort VCF.
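For reference, a rough sketch of what recalculating AF from GT over a merged cohort could look like at a single biallelic site. `cohort_af` is a hypothetical helper, and the input is assumed to be the list of per-sample GT strings already parsed from the cohort VCF.

```python
from typing import Iterable, Optional


def cohort_af(genotypes: Iterable[str]) -> Optional[float]:
    """ALT allele frequency (AC / AN) from called genotypes at a biallelic site."""
    ac = 0  # count of ALT alleles
    an = 0  # count of called (non-missing) alleles
    for gt in genotypes:
        for allele in gt.replace("|", "/").split("/"):
            if allele == ".":
                continue  # skip missing alleles so they do not inflate AN
            an += 1
            if allele != "0":
                ac += 1
    return ac / an if an else None


# Three called samples and one missing: AC = 3, AN = 6.
print(cohort_af(["0/0", "0/1", "1/1", "./."]))
```

In practice, the bcftools `fill-tags` plugin may already cover this, since it recomputes tags such as AC, AN, and AF from the genotypes in a VCF.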