RabadanLab / arcasHLA

Fast and accurate in silico inference of HLA genotypes from RNA-seq
GNU General Public License v3.0

Error in customize function #79

Open ttab963 opened 2 years ago

ttab963 commented 2 years ago

Hello, team.

First, thank you for this nice tool. Other functions, such as genotype, work well on my single-cell RNA sequencing data (single-end). I installed the tool from conda.

But when I tried to use customize for quantification, I got an error. This is the command I ran.

arcasHLA customize \
--genotype OM-HS-063-C8-F.partial_genotype.json \
--transcriptome chr6 \
-o ./ref 

And this is the error.

None
{'A1': 'A*30:01', 'A2': 'A*24:02', 'B1': 'B*15:01', 'B2': 'B*13:02', 'C1': 'C*06:287', 'C2': 'C*01:02', 'DPB11': 'DPB1*02:01', 'DPB12': 'DPB1*05:01', 'DQA11': 'DQA1*01:71', 'DQA12': 'DQA1*02:01', 'DQB11': 'DQB1*02:02', 'DQB12': 'DQB1*06:02', 'DRB11': 'DRB1*07:01', 'DRB12': 'DRB1*15:01'}
Traceback (most recent call last):
  File "/data/anaconda3/envs/HLA/share/arcas-hla-0.4.0-0/scripts/customize.py", line 317, in <module>
    build_custom_reference(subject, genotype, args.grouping, args.transcriptome, temp)
  File "/data/anaconda3/envs/HLA/share/arcas-hla-0.4.0-0/scripts/customize.py", line 92, in build_custom_reference
    transcriptome.append(dummy_HLA_dict[transcript])
KeyError: 'ENST00000437811.1'

For context, ENST00000437811.1 is HLA-DPA1-243 in Ensembl, but I'm not sure whether this matters. My Python version is 3.8.12 because of this issue. If you need any additional information, I can provide it.

Thank you for your kind help.

tpereachamblee commented 2 years ago

Thank you for your interest in our tool and for using it! This issue is likely caused by using an IMGT/HLA reference version later than 3.32.0: the static cDNA file in /dat/ref/ has not been updated recently. Please try rolling back the reference to version 3.32.0 or earlier and repeating the genotyping, then use that genotype.json for the customize module.
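To illustrate the kind of failure described above, here is a minimal Python sketch. It is not arcasHLA's actual code: `static_transcripts`, `requested_ids`, and `find_missing` are hypothetical names, and the first ID and its sequence are placeholders. The only details taken from the traceback are that `customize.py` looks up a transcript ID in a dictionary and that the ID `ENST00000437811.1` was not found. The sketch assumes the dictionary is built from a file that no longer matches the IDs being requested.

```python
def find_missing(requested_ids, static_transcripts):
    """Return the requested transcript IDs that have no entry in the lookup table."""
    return [tid for tid in requested_ids if tid not in static_transcripts]


# Hypothetical stand-in for a dictionary built from a static, outdated file.
static_transcripts = {
    "ENST00000000001.1": "ACGT",  # placeholder ID and placeholder sequence
}

# Hypothetical list of IDs being requested; the second is the ID from the traceback.
requested_ids = ["ENST00000000001.1", "ENST00000437811.1"]

# Checking membership first reports every missing ID instead of stopping at the first.
missing = find_missing(requested_ids, static_transcripts)
print("Missing transcript IDs:", missing)

# A direct lookup of an absent key raises KeyError, as seen in the traceback.
try:
    static_transcripts["ENST00000437811.1"]
except KeyError as err:
    print("KeyError:", err)
```

Listing all missing IDs up front, rather than catching the first `KeyError`, can help show whether a single transcript or a whole set of transcripts is out of sync with the reference in use.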

Portulaca666 commented 9 months ago

Have you already resolved this problem? I am using IPD-IMGT/HLA version 3.54.0 and I got the same error.

None
{'B1': 'B*51:01', 'B2': 'B*07:386N', 'C1': 'C*07:02', 'C2': 'C*15:02', 'DPB11': 'DPB1*04:01', 'DPB12': 'DPB1*04:01', 'DQA11': 'DQA1*01:03', 'DQA12': 'DQA1*01:03', 'DQB11': 'DQB1*06:03', 'DQB12': 'DQB1*06:112N', 'DRB11': 'DRB1*15:01', 'DRB12': 'DRB1*15:01'}
Traceback (most recent call last):
  File "/gpfs1/xjhuang_pkuhpc/tools/arcasHLA/scripts/customize.py", line 317, in <module>
    build_custom_reference(subject, genotype, args.grouping, args.transcriptome, temp)
  File "/gpfs1/xjhuang_pkuhpc/tools/arcasHLA/scripts/customize.py", line 92, in build_custom_reference
    transcriptome.append(dummy_HLA_dict[transcript])
KeyError: 'ENST00000425337.1'

Lies-VO commented 3 months ago

https://github.com/RabadanLab/arcasHLA/issues/79#issuecomment-1059555697

This solution didn't work for me. Is there anything else I can try? Thank you!