zhengxwen / HIBAG

R package – HLA Genotype Imputation with Attribute Bagging (development version only)
https://hibag.s3.amazonaws.com/index.html

Force monomorphic SNPs in model? #7

Closed by cbrunoels 5 years ago

cbrunoels commented 5 years ago

I am training a HIBAG model on my own dataset and would then like to combine it with an existing HIBAG model so that I can incorporate HLA alleles outside of my training set. To do so, I have set the snpid in my training data to the pre-existing model's snp.id set (N = 966 SNPs). I can confirm that the length of my train.geno$snp.sel is in fact 966, but as soon as training starts, HIBAG removes 6 SNPs and prints the line: Exclude 6 monomorphic SNPs

Removing monomorphic SNPs before training a HIBAG model makes plenty of sense, but is there a parameter that lets me force these SNPs into the model? Without those 6 SNPs, I cannot combine my own trained model with the other model of interest. Here is the error message I receive: Error: identical(obj1$snp.id, obj2$snp.id) is not TRUE, where the only differences between the snp.id values are the 6 monomorphic SNP sites. Is there some way to keep the monomorphic SNPs in the model so that the two models can ultimately be combined?

zhengxwen commented 5 years ago

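A new option, mono.rm, has been added to the latest version of HIBAG. It defaults to TRUE (monomorphic SNPs are removed before training); set mono.rm=FALSE to keep them in the model: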

hlaAttrBagging(hla, snp, nclassifier=100L, mtry=c("sqrt", "all", "one"), prune=TRUE, na.rm=TRUE, mono.rm=TRUE, verbose=TRUE, verbose.detail=FALSE)
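
For illustration, here is a minimal sketch of training with mono.rm=FALSE and then checking that no SNP was dropped. It assumes a HIBAG version recent enough to include the mono.rm option, and it uses the example data shipped with the package (HLA_Type_Table, HapMap_CEU_Geno) rather than the dataset from this issue. The name target.snp.id is hypothetical and stands in for the pre-existing model's snp.id set.

```r
library(HIBAG)

# Build an HLA-A allele object from the example table shipped with HIBAG
hla.id <- "A"
hla <- hlaAllele(HLA_Type_Table$sample.id,
    H1 = HLA_Type_Table[, paste0(hla.id, ".1")],
    H2 = HLA_Type_Table[, paste0(hla.id, ".2")],
    locus = hla.id, assembly = "hg19")

# Hypothetical stand-in for the existing model's snp.id set:
# here, the SNPs within 500 kb of the HLA-A locus
target.snp.id <- hlaFlankingSNP(HapMap_CEU_Geno$snp.id,
    HapMap_CEU_Geno$snp.position, hla.id, 500*1000, assembly = "hg19")

# Restrict the training genotypes to exactly that SNP set
train.geno <- hlaGenoSubset(HapMap_CEU_Geno,
    snp.sel = match(target.snp.id, HapMap_CEU_Geno$snp.id))

# Train a tiny model (2 classifiers, for speed only) without
# excluding monomorphic SNPs
set.seed(100)
model <- hlaAttrBagging(hla, train.geno, nclassifier = 2L, mono.rm = FALSE)

# List any training SNPs that are missing from the fitted model
obj <- hlaModelToObj(model)
print(setdiff(train.geno$snp.id, obj$snp.id))
```

If the printed vector is empty, the fitted model kept every SNP in the training set, which is what the identical(obj1$snp.id, obj2$snp.id) check requires before two models can be combined.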