DaehwanKimLab / hisat-genotype

Genotyping DRB3/4/5 #25

Open bingreily opened 3 years ago

bingreily commented 3 years ago

Hello! I'm wondering whether the current version supports genotyping of the DRB3/4/5 genes. I've noticed that although these genes are present in the "hisatgenotype_db/HLA/fasta" directory, they do not appear in the genotype_genome files or in the hla.{locus,backbone.fa,allele} files. (Please do correct me if I'm wrong.) Does the current version only allow genotyping of DRB1? I'd love to know if there's a way to add the DRB3/4/5 genes, because I need their allele types as well. Thank you!

chbe-helix commented 3 years ago

Hi Bing,

I am not sure why the indices for DRB3, 4, and 5 are being omitted. I will run a few tests this week to see if I can track down the bug. I will let you know whether it is a code or index error and how to fix it. Stay tuned for an update later this week or early next. Thanks!

Thanks, Chris

bingreily commented 3 years ago

Hi Chris,

Thanks for the prompt reply! Just to share a bit more of what I got: I followed yubau's approach and passed the DRB3/4/5 genes to the --locus-list option as --locus-list DRB3,DRB4,DRB5, but got a KeyError:

Traceback (most recent call last): 
  File "/Tools/Program/Python/anaconda3/lib/python3.6/multiprocessing/pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
  File "/Tools/Program/HISAT-genotype/hisat-genotype-1.3.2/hisatgenotype_modules/hisatgenotype_typing_core.py", line 2688, in genotyping_locus
    dbversion)
  File "/Tools/Program/HISAT-genotype/hisat-genotype-1.3.2/hisatgenotype_modules/hisatgenotype_typing_core.py", line 384, in typing
    ref_allele          = refGenes[gene]
KeyError: 'DRB3'

I checked the hisatgenotype_typing_core.py code to see what refGenes[gene] was pointing to, and it seems this is taken from the genotype_genome.locus file (which, as I mentioned earlier, does not contain the DRB3/4/5 loci and only has DRB1).
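A quick way to confirm which loci a .locus file covers before passing them to --locus-list is sketched below. This is only a minimal sketch, not HISAT-genotype code: it assumes that allele names in the file look like `DRB1*01:01:01` (gene name before the asterisk) and appear as whitespace-separated tokens, which may not match the real file layout. The function names `genes_in_locus_lines` and `missing_loci` are hypothetical.

```python
import re

# Assumption: allele names look like "DRB1*01:01:01", so the gene name is
# whatever precedes the asterisk. The real .locus layout may differ.
ALLELE_TOKEN = re.compile(r"^([A-Za-z0-9\-]+)\*\S+$")


def genes_in_locus_lines(lines):
    """Collect gene names from tokens shaped like GENE*allele."""
    genes = set()
    for line in lines:
        for token in line.split():
            match = ALLELE_TOKEN.match(token)
            if match:
                genes.add(match.group(1))
    return genes


def missing_loci(requested, lines):
    """Return the requested loci with no entry, preserving request order."""
    available = genes_in_locus_lines(lines)
    return [gene for gene in requested if gene not in available]
```

To use it, open the file and pass the handle as `lines`, for example `missing_loci(["DRB1", "DRB3", "DRB4", "DRB5"], open("genotype_genome.locus"))`. Any locus it returns would be expected to trigger the same kind of KeyError as above, under the stated assumption about the file format.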

Hope this helps out a bit in solving the problem. Thanks again for your reply, and I'm really looking forward to the update!

Jisoo

chbe-helix commented 3 years ago

Hi Jisoo,

I have looked into the situation and believe it may have something to do with the HLA database. Fortunately, I think I have a solution that will require writing a new script. Due to this I will likely need to release this fix in the next version planned for Nov/Dec of this year.

If you don't mind using a development version, I can direct you to a branch with this code once I get it written. I'm hoping to have this change ready in a few weeks, but it may take until the end of Sep. Let me know if you'd prefer to wait until the next release or if you'd like to use a development version, and I'll get things set up.

Thanks, Chris

bingreily commented 3 years ago

Hi Chris,

Happy to hear you may have a solution, and I'm more than willing to test out the development version. Please keep me posted!

On a side note, I noticed the current HLA db inside the hisatgenotype_db is a bit outdated. Do you have plans to update this in the next release? In the meantime, I'm planning to update the HLA database manually. I did find this post #12 about updating the db, but wasn't sure if I need to build the genome again (following the instructions here: http://ccb.jhu.edu/hisat-genotype/index.php/Type:GraphRef) after changing the contents of the hisat-genotype-1.3.2/indicies/hisatgenotype_db/HLA/ directory.

Thanks again for your quick reply! Really appreciate it. Jisoo

chbe-helix commented 3 years ago

Hi Jisoo,

You can manually update the database, and you shouldn't need to rebuild the genotype genome with the way I currently have it set up. I am looking at updating the database and genotype genome to a newer version in a future release.

Unfortunately, I don't have a good sense of when that will be exactly. Sorry I can't give you a more definite time-frame.

I'll update you once I get the code built to handle the DRB3/4/5 situation.

Thanks, Chris

npatel-ah commented 3 years ago

Hello Chris,

I am also interested in genotyping DRB alleles and am wondering if you plan to release a new version soon. I'd also be happy to test code from the dev branch.

Best, Nihir

chbe-helix commented 3 years ago

Hi Nihir,

I have not yet addressed this error. I hope to have a new version of HISAT-genotype done by the end of Q1 2021. I will certainly let you know as soon as there is code in the development branch that addresses this issue if you would like to test it. Sorry for the inconvenience and delay!

Thanks, Chris

npatel-ah commented 3 years ago

No worries. Thanks for the update Chris.

npatel-ah commented 2 years ago

Hello Chris,

Hope you are doing well. I am curious whether development of this tool is still progressing and whether there is a plan to release the next version with the DRB fix. I understand if things have changed; I just wanted to check.

Thanks.

dcolomb1 commented 2 years ago

Hello, I'm also interested in this topic. Best, Daniele

marchoeppner commented 1 year ago

Just a friendly bump: it would be nice if the HLA database/typing could be updated to include the above-mentioned genes.