Open anjatietz opened 2 years ago
Hi @anjatietz,
Did you find a solution?
I have run into the same issue.
Any suggestions would help.
Thank you very much! Xinxin
Hi @anjatietz, Thank you for using HATK and reporting this error.
I was able to reproduce the same error.
About a month ago, I confirmed that IMGT2Seq works with v3.47.0. My guess is that the 'ProcessIMGT.py' script cannot handle some of the raw sequences in the latest version, 3.48.0.
I'm going to update IMGT2Seq, but this will take some time. For now, please use an earlier version of the IMGT database (<3.48.0).
Hi there, I am trying to integrate a newer version of the IMGT Database for my analysis but there seems to be an issue at the DRB1 locus. I downloaded the database from: ftp://ftp.ebi.ac.uk/pub/databases/ipd/imgt/hla/ and stored it in your example directory. I get maptable files for HLA-A to HLA-DQB1, but not for DRB1. Any help is much appreciated.
I used the following command: python3 HATK.py --imgt2seq --hg 38 --imgt 3480 --2field --imgt-dir example/IMGTHLA3480 --out MyIMGT2Seq/ExamplePrefix.hg38.imgt3480
And this is what I get: Namespace(Ggroup=False, HLA=None, NoCaption=False, Pgroup=False, aa=None, ar=None, bmarkergenerator=False, chped=None, condition=None, condition_list=None, covar=None, covar_name=None, dict_AA=None, dict_SNPS=None, fam=None, fourF=False, hat=None, heatmap=False, hg='38', hla2hped=False, hped=None, imgt='3480', imgt2seq=True, imgt_dir='/home/user/HATK/example/IMGTHLA3480', input=None, leave_NotFound=False, logistic=False, manhattan=False, maptable=None, metaanalysis=False, multiprocess=1, no_indel=False, nomencleaner=False, omnibus=False, oneF=False, out='MyIMGT2Seq/ExamplePrefix.hg38.imgt3480', phased=None, pheno=None, pheno_name=None, platform=None, point_color='#778899', point_size='15', reference_allele=None, rhped=None, s1_bim=None, s1_logistic_result=None, s2_bim=None, s2_logistic_result=None, save_intermediates=False, threeF=False, top_color='#FF0000', twoF=True, variants=None, yaxis_unit='10')
[ProcessIMGT.py]: Generating sequence information dictionary for HLA A.
[ProcessIMGT.py]: Generating sequence information dictionary for HLA B.
[ProcessIMGT.py]: Generating sequence information dictionary for HLA C.
[ProcessIMGT.py]: Generating sequence information dictionary for HLA DPA1.
[ProcessIMGT.py]: Generating sequence information dictionary for HLA DPB1.
[ProcessIMGT.py]: Generating sequence information dictionary for HLA DQA1.
[ProcessIMGT.py]: Generating sequence information dictionary for HLA DQB1.
[ProcessIMGT.py]: Generating sequence information dictionary for HLA DRB1.
Traceback (most recent call last):
File "HATK.py", line 243, in <module>
myStudy = HLA_Study(args)
File "/home/user/HATK/src/HLA_Study.py", line 293, in __init__
_imgt_dir=_args.imgt_dir)
File "/home/user/HATK/IMGT2Seq/IMGT2Seq.py", line 196, in __init__
_p_data="IMGT2Seq/data", __Nfield_OUTPUT_FORMAT=Nfield_OUTPUT_FORMAT)
File "/home/user/HATK/IMGT2Seq/IMGT2Seq.py", line 401, in IMGT2Seq
_p_data, _no_Indel=_no_Indel, _save_intermediates=_save_intermediates)
File "/home/user/HATK/IMGT2Seq/src/ProcessIMGT.py", line 130, in ProcessIMGT
df_Seqs_splited_noIndel_gen = df_raw_Seqs_splitted_gen.apply(lambda x : ProcessIndel(x, _remove_indel=True), axis=0)
File "/home/user/anaconda3/envs/HATK/lib/python3.7/site-packages/pandas/core/frame.py", line 6928, in apply
return op.get_result()
File "/home/user/anaconda3/envs/HATK/lib/python3.7/site-packages/pandas/core/apply.py", line 186, in get_result
return self.apply_standard()
File "/home/user/anaconda3/envs/HATK/lib/python3.7/site-packages/pandas/core/apply.py", line 292, in apply_standard
self.apply_series_generator()
File "/home/user/anaconda3/envs/HATK/lib/python3.7/site-packages/pandas/core/apply.py", line 321, in apply_series_generator
results[i] = self.f(v)
File "/home/user/HATK/IMGT2Seq/src/ProcessIMGT.py", line 130, in <lambda>
df_Seqs_splited_noIndel_gen = df_raw_Seqs_splitted_gen.apply(lambda x : ProcessIndel(x, _remove_indel=True), axis=0)
File "/home/user/HATK/IMGT2Seq/src/ProcessIMGT.py", line 787, in ProcessIndel
return _sr.apply(lambda x: getTrimmedSeqs(x, l_spanInfo, _remove_indel))
File "/home/user/anaconda3/envs/HATK/lib/python3.7/site-packages/pandas/core/series.py", line 4045, in apply
mapped = lib.map_infer(values, f, convert=convert_dtype)
File "pandas/_libs/lib.pyx", line 2228, in pandas._libs.lib.map_infer
File "/home/user/HATK/IMGT2Seq/src/ProcessIMGT.py", line 787, in <lambda>
return _sr.apply(lambda x: getTrimmedSeqs(x, l_spanInfo, _remove_indel))
File "/home/user/HATK/IMGT2Seq/src/ProcessIMGT.py", line 753, in getTrimmedSeqs
IndelSeqs = pd.Series([_string[idx[0]:idx[1]] for idx in _l_target_idx[0]])
File "/home/user/HATK/IMGT2Seq/src/ProcessIMGT.py", line 753, in <listcomp>
IndelSeqs = pd.Series([_string[idx[0]:idx[1]] for idx in _l_target_idx[0]])
TypeError: ("'NoneType' object is not subscriptable", 'occurred at index 2')
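For anyone reading the traceback: the last frame slices `_string` with index pairs taken from `_l_target_idx[0]`, so the TypeError means that one of those objects was `None` when the DRB1 sequences were processed. Which one it was cannot be told from the log alone. Below is a minimal sketch of that failure mode and of a defensive guard, assuming a missing value reaches the slice. The names `trim` and `safe_trim` and the sample data are hypothetical and are not part of HATK.

```python
from typing import List, Optional, Sequence, Tuple

Span = Tuple[int, int]


def trim(seq: Optional[str], span_groups: Optional[Sequence[Sequence[Span]]]) -> List[str]:
    """Unguarded slicing, shaped like the list comprehension in the last frame."""
    return [seq[idx[0]:idx[1]] for idx in span_groups[0]]


def safe_trim(seq: Optional[str], span_groups: Optional[Sequence[Sequence[Span]]]) -> Optional[List[str]]:
    """Hypothetical guard: return None instead of raising when an input is missing."""
    if seq is None or not span_groups or span_groups[0] is None:
        return None
    # Skip individual spans that are missing rather than failing on them.
    return [seq[idx[0]:idx[1]] for idx in span_groups[0] if idx is not None]


# Made-up example data: one sequence and one group of (start, end) spans.
spans = [[(0, 3), (5, 8)]]
print(trim("ACGTACGTAC", spans))

# A missing sequence triggers the same message as in the report.
try:
    trim(None, spans)
except TypeError as err:
    print(err)

# The guarded version reports the missing input without raising.
print(safe_trim(None, spans))
```

A guard like this only hides the symptom. Finding out which DRB1 entry in 3.48.0 yields the missing value would still be needed for a real fix in 'ProcessIMGT.py'.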