RabadanLab / arcasHLA

Fast and accurate in silico inference of HLA genotypes from RNA-seq
GNU General Public License v3.0

Genotype MHC from other species #58

Closed yuanfeng719 closed 3 years ago

yuanfeng719 commented 3 years ago

Hi, is it possible to build a reference using MHC alleles from other species? I have the allele sequences, and I wonder which files I should modify in arcasHLA/dat/.

Thanks

IoanFilip2 commented 3 years ago

Hello! Thanks for your interest in our tool. This is a very interesting idea and should definitely be feasible with arcasHLA, but some modifications are required.

The MHC database currently includes full and partial sequences for many alleles from several species, including non-human primates and dogs. You can start there or, since you already have allele sequences, you can start by building the new MHC reference with the kallisto index command (see reference.py).

Note that the dat/info folder should be updated to reflect your dataset. All hard-coded IMGTHLA paths should also be updated in all the scripts. The names of the MHC genes are hard-coded throughout the scripts too, so they will have to be modified as well. Finally, population-specific allele frequencies should be ignored, so the population parameter should be set to None for every MHC gene at the genotyping step.

Clearly this is only an outline, but hopefully it is still helpful to you. I agree that it would be great to include a species option in future versions of our tool. You are welcome to implement this functionality and contribute it to our repo!
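As a rough illustration of the first step in that outline, here is a minimal Python sketch that writes a set of allele sequences to a FASTA file and assembles a `kallisto index` call for it. This is not the code in reference.py. The helper names, file names, allele names, and toy sequences are all hypothetical, and a real reference would likely need the same preprocessing that arcasHLA applies to its HLA sequences.

```python
import tempfile
from pathlib import Path


def write_fasta(alleles, fasta_path):
    """Write a {allele_name: sequence} mapping to FASTA, 60 bases per line."""
    with open(fasta_path, "w") as handle:
        for name, seq in alleles.items():
            handle.write(f">{name}\n")
            for start in range(0, len(seq), 60):
                handle.write(seq[start:start + 60] + "\n")


def kallisto_index_command(fasta_path, index_path):
    """Return the argument list for `kallisto index -i <index> <fasta>`."""
    return ["kallisto", "index", "-i", str(index_path), str(fasta_path)]


# Hypothetical allele names and toy sequences, for illustration only.
example_alleles = {
    "MHC-A*001": "ATGGCGCCCCGAACCCTCCTCCTGCTGCTCTCGGGGGCC",
    "MHC-B*001": "ATGCGGGTCATGGCGCCCCGAACCCTCCTCCTGCTGCTC",
}

# Hypothetical output locations inside a temporary directory.
workdir = Path(tempfile.mkdtemp())
fasta_file = workdir / "mhc_alleles.fasta"
index_file = workdir / "mhc_alleles.idx"

write_fasta(example_alleles, fasta_file)
command = kallisto_index_command(fasta_file, index_file)

# The command is only printed here; with kallisto installed it could be
# executed with subprocess.run(command, check=True).
print(" ".join(command))
```

The resulting index would only replace the reference itself. The dat/info contents, the hard-coded paths and gene names, and the population setting described above would still need to be changed by hand.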