genepi / imputationserver

Michigan Imputation Server: A new web-based service for imputation that facilitates access to new reference panels and greatly improves user experience and productivity
https://imputationserver.sph.umich.edu/
GNU Affero General Public License v3.0

How are multiallelic sites handled? #133

Open alkaZeltser opened 7 months ago

alkaZeltser commented 7 months ago

Hello, I'm not currently experiencing any problems; I'm just curious how the server's QC checks, EAGLE, and minimac4 handle multiallelic sites, both in the input data and when imputing genotypes. I've looked through the documentation for these tools but couldn't find anything explicit. The server's QC statistics report the number of detected multiallelic sites, but I've also seen some people recommend splitting multiallelic sites in the input VCFs.

Thus, some questions:

  1. Does the QC program detect multiallelics in both split and merged form in input files? (QC checks)
  2. Are input multiallelic sites required to be split? (QC checks)
  3. Are separate alleles at multiallelic sites imputed independently of each other, if they are imputed at all? (minimac4/EAGLE)
  4. If multiple alleles at one site can be imputed, are they written as split records or merged records in the output VCF? (minimac4)
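
To make the "merged" versus "split" distinction in questions 1 and 2 concrete, here is a minimal sketch of how I would count both forms in my own input. It is not how the server's QC works (that is exactly what I'm asking about). The function name `count_multiallelic` is made up for illustration, and the sketch assumes tab-delimited, uncompressed VCF lines in which split records of one site share the same CHROM and POS.

```python
from collections import defaultdict


def count_multiallelic(vcf_lines):
    """Return (merged, split) counts of multiallelic sites.

    merged: records whose ALT column lists several alleles, e.g. "C,T".
    split:  (CHROM, POS) pairs that appear on more than one single-ALT
            record with different ALT alleles (assumed definition).
    """
    merged = 0
    alts_by_site = defaultdict(set)
    for line in vcf_lines:
        # Skip blank lines and header/meta lines.
        if not line.strip() or line.startswith("#"):
            continue
        chrom, pos, _id, _ref, alt = line.rstrip("\n").split("\t")[:5]
        if "," in alt:
            merged += 1  # one record carrying several ALT alleles
        else:
            alts_by_site[(chrom, pos)].add(alt)
    split = sum(1 for alts in alts_by_site.values() if len(alts) > 1)
    return merged, split


if __name__ == "__main__":
    example = [
        "##fileformat=VCFv4.2",
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
        "1\t100\t.\tA\tC,T\t.\t.\t.",  # merged multiallelic record
        "1\t200\t.\tG\tA\t.\t.\t.",    # split multiallelic, first allele
        "1\t200\t.\tG\tT\t.\t.\t.",    # split multiallelic, second allele
        "1\t300\t.\tC\tT\t.\t.\t.",    # ordinary biallelic site
    ]
    print(count_multiallelic(example))
```

In the example, position 100 is a merged multiallelic record and position 200 is the same kind of site in split form; what I'd like to know is whether the server treats these two representations the same way.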

Thanks!