NaegleLab / CoDIAC

KeyError while fetching domain architectures #56

Closed adshimpi closed 3 months ago

adshimpi commented 4 months ago

Description

When fetching domain architectures, certain UniProt IDs raise a KeyError while retrieving domain information and order. In every case the missing key is 0 (see the 1st and 2nd screenshots). The error arises in the get_domains_from_response function of the InterPro module when it calls return_expanded_domains (see the 3rd screenshot).

Screenshots

Example error
image
Commonality of the KeyError
image
Code associated with the errors
image

Files

Interpro.py in the get_domains_from_response function

To Reproduce

Steps to reproduce the behavior:

from CoDIAC import UniProt  # assumes the package is importable as CoDIAC
accession_OI = ["A0A075B6H7", "A6NN92", "A0A087WV53"]
UniProt.makeRefFile(accession_OI, "Debugging_Test.csv")

Diagnostic Code to determine error source

from CoDIAC import InterPro  # assumes the package is importable as CoDIAC

accession_OI = ["A0A075B6H7", "A6NN92", "A0A087WV53"]
x = InterPro.fetch_InterPro_json(accession_OI)
d_dict = {}
for p in accession_OI:
    inner_dict = {}
    # i indexes every result, including non-domain entries
    for i, entry in enumerate(x[p]['results']):
        # print(entry)  # uncomment to see the actual entry values
        if entry['metadata']['type'] == 'domain':
            inner_dict[i] = InterPro.collect_data(entry)
    d_dict[p] = inner_dict

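For each of these accessions, the inner dictionary in d_dict lacks a 0 key. The first InterPro result for these proteins appears to be a family rather than a domain, so it is filtered out. Because the keys come from enumerate over all results, the first domain is stored under key 1 or higher and nothing is stored under key 0.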
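The indexing gap can be reproduced without calling InterPro at all. The sketch below uses mock results (not real InterPro responses) and two hypothetical helper functions, index_then_filter and filter_then_index, which are not part of CoDIAC. It shows the behavior described above and one possible way to get keys that start at 0; it is not necessarily how the fix was made in the library.

```python
# Mock InterPro-style results: the first entry is a family, not a domain.
# The structure only imitates the 'metadata' -> 'type' field used above.
results = [
    {"metadata": {"type": "family", "name": "Fam"}},
    {"metadata": {"type": "domain", "name": "DomA"}},
    {"metadata": {"type": "domain", "name": "DomB"}},
]


def index_then_filter(results):
    """Mirror the diagnostic loop: keys are positions in the full result list."""
    return {
        i: entry["metadata"]["name"]
        for i, entry in enumerate(results)
        if entry["metadata"]["type"] == "domain"
    }


def filter_then_index(results):
    """Possible alternative: keep only domains first, then number them from 0."""
    domains = [e for e in results if e["metadata"]["type"] == "domain"]
    return {i: entry["metadata"]["name"] for i, entry in enumerate(domains)}


gapped = index_then_filter(results)
dense = filter_then_index(results)

print(sorted(gapped))  # [1, 2] -> looking up gapped[0] raises KeyError
print(sorted(dense))   # [0, 1]
```

Any downstream code that assumes the keys run 0..n-1 works with the second form regardless of where family entries fall in the response.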

Expected behavior

Domain architectures should be fetched for these accessions without raising an error.
