WansonChoi / CookHLA

An accurate and efficient HLA imputation method.

Interpreting the alleles file #7

Open juliandwillett opened 2 years ago

juliandwillett commented 2 years ago

Hello, I am having difficulty interpreting the alleles file because its columns are not named, and the readme does not seem to describe them. Could you please add this to the readme?

WansonChoi commented 2 years ago

@juliandwillett

Hi, thank you for your comment on CookHLA.

The columns of the alleles file are, in order:

  1. Family ID (FID)
  2. Individual ID (IID)
  3. HLA gene
  4. The pair of imputed 2-digit (1-field) alleles (allele 1 and allele 2)
  5. The pair of imputed 4-digit (2-field) alleles (allele 1 and allele 2)
  6. The posterior probability of imputed allele 1
  7. The posterior probability of imputed allele 2
  8. Confidence score
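
As a rough illustration of the layout above, here is a minimal Python sketch that reads one line of the alleles file into named fields. It is not part of CookHLA: the names `AlleleRecord` and `parse_alleles_line` are hypothetical, and it assumes that columns are separated by whitespace and that the two alleles in columns 4 and 5 are joined by a comma (`pair_sep`). Please check both assumptions against your own output file.

```python
from typing import NamedTuple, Tuple


class AlleleRecord(NamedTuple):
    """One line of the alleles file, using the column order listed above."""
    fid: str
    iid: str
    gene: str
    alleles_2digit: Tuple[str, str]
    alleles_4digit: Tuple[str, str]
    prob_allele1: float
    prob_allele2: float
    confidence: float


def _split_pair(field: str, pair_sep: str) -> Tuple[str, str]:
    """Split an 'allele1<sep>allele2' field into its two alleles."""
    parts = field.split(pair_sep)
    if len(parts) != 2:
        raise ValueError(f"expected two alleles in {field!r}")
    return parts[0], parts[1]


def parse_alleles_line(line: str, pair_sep: str = ",") -> AlleleRecord:
    """Parse one line of the alleles file.

    Assumption: whitespace-delimited columns; `pair_sep` is a guess at the
    separator between allele 1 and allele 2 and may need to be changed.
    """
    fields = line.split()
    if len(fields) != 8:
        raise ValueError(f"expected 8 columns, got {len(fields)}")
    return AlleleRecord(
        fid=fields[0],
        iid=fields[1],
        gene=fields[2],
        alleles_2digit=_split_pair(fields[3], pair_sep),
        alleles_4digit=_split_pair(fields[4], pair_sep),
        prob_allele1=float(fields[5]),
        prob_allele2=float(fields[6]),
        confidence=float(fields[7]),
    )


# Made-up line for illustration only; these are not real imputation results.
example = parse_alleles_line("FAM1 IND1 A 01,02 0101,0201 0.5 0.25 0.75")
print(example.gene, example.alleles_4digit, example.confidence)
```

Naming the fields this way makes it harder to mix up columns 6 to 8, which are all probabilities and easy to confuse by position.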

The confidence score (column 8) is defined as the posterior probability of the allele for a homozygous call, and as the sum of the posterior probabilities of the two alleles for a heterozygous call. See the "Call rate and accuracy" section on p. 6 of the paper: https://www.nature.com/articles/s41467-021-21541-5
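
The definition can be sketched in a few lines of Python. This only illustrates the rule as stated here, not CookHLA's actual code: `confidence_score` is a hypothetical helper, and for a homozygous call it assumes that the posterior probability of the shared allele is the value passed as `prob1`.

```python
def confidence_score(allele1: str, allele2: str,
                     prob1: float, prob2: float) -> float:
    """Confidence score for one imputed genotype, per the definition above.

    Homozygous call: the posterior probability of the (single) allele.
    Heterozygous call: the sum of the two alleles' posterior probabilities.
    """
    if allele1 == allele2:
        # Assumption: prob1 holds the posterior of the shared allele.
        return prob1
    return prob1 + prob2


# Made-up values for illustration only.
het = confidence_score("0101", "0201", 0.5, 0.25)   # heterozygous: 0.5 + 0.25
hom = confidence_score("0101", "0101", 0.75, 0.75)  # homozygous: 0.75
print(het, hom)
```

A check like this can be used to see whether column 8 of your file agrees with columns 6 and 7 under that assumption.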

I'm sorry for not providing enough information. I'll add this to the readme.