zhengxwen / HIBAG

R package – HLA Genotype Imputation with Attribute Bagging (development version only)
https://hibag.s3.amazonaws.com/index.html

missing SNPs #24

Open JingjingBai2021 opened 1 year ago

JingjingBai2021 commented 1 year ago

Hi, I am trying to predict HLA-B alleles using the pre-fitted model and got the output below. I do not understand why 72.6% of the SNPs are missing for the Pos+Allele matching type. Is that normal, given how highly polymorphic the HLA region is?

My raw dataset was genotyped on the Global Screening Array, and I used the corresponding model.

Thank you in advance.

###############Output#########
HIBAG model for HLA-B:
    500 individual classifiers
    791 SNPs
    88 unique HLA alleles: 07:02, 07:04, 07:05, ...
Prediction:
    based on the averaged posterior probabilities
Model assembly: hg19, SNP assembly: hg19
Matching the SNPs between the model and the test data:
   match.type="--"   missing SNPs #
      Position           26 (3.3%) *being used [1]
      Pos+Allele         574 (72.6%) [2]
      RefSNP+Position    27 (3.4%)
      RefSNP             27 (3.4%)
   [1]: useful if ambiguous strands on array-based platforms
   [2]: suggested if the model and test data have been matched to the same reference genome
Model platform: Illumina 1M Duo / Infinium Global Screening Array
# of SNP loci with flipped alleles: 367
# of SNP loci with swapped strands: 365
# of samples: 4050
CPU flags: 64-bit
# of threads: 8
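
For reference, the percentages in the matching table can be reproduced from the printed counts. The following is a minimal base R sketch, not HIBAG code: the variable names are made up for illustration, and it assumes each percentage is the missing count divided by the 791 SNPs in the model.

```r
# Counts copied from the output above; variable names are illustrative only.
n_model_snps <- 791
missing <- c("Position" = 26, "Pos+Allele" = 574,
             "RefSNP+Position" = 27, "RefSNP" = 27)

# Assumption: percentage = missing count / number of SNPs in the model.
pct_missing <- round(100 * missing / n_model_snps, 1)
print(pct_missing)

# A SNP matched by position and allele is also matched by position alone,
# so the difference counts SNPs whose position is present in the test data
# but whose allele pair does not match the model.
allele_only_mismatch <- missing[["Pos+Allele"]] - missing[["Position"]]
print(allele_only_mismatch)
```

Under that assumption the rounded values agree with the 3.3%, 72.6% and 3.4% reported in the log, and 548 of the 574 SNPs missing under Pos+Allele are present by position but differ in their allele coding.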