Kappa-Dev / KAMI

Bio-curation library for modelling cellular signalling
MIT License

Anatomizer issue: search by UniprotID #3

Open eugeniashurko opened 7 years ago

eugeniashurko commented 7 years ago

Initializing a GeneAnatomy object with a UniProtID raises a warning that more than one UniProt accession number was found, followed by an error. For example:

>> g = GeneAnatomy("P11362")

/home/eugenia/anaconda3/lib/python3.5/site-packages/kami-0.1-py3.5.egg/anatomizer/new_anatomizer.py:414:
 AnatomizerWarning: More than one UniProt Accession Number found for 'ENSP00000432972'
  ensp, AnatomizerWarning

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-11-379402b9ab2b> in <module>()
----> 1 g = GeneAnatomy("P11362")

~/anaconda3/lib/python3.5/site-packages/kami-0.1-py3.5.egg/anatomizer/new_anatomizer.py in __init__(self, query, features, merge_features, nest_features, merge_overlap, nest_overlap, nest_level, offline)
   1296         if features and self.found:
   1297             # feature_list = get_features(self.canonical)
-> 1298             feature_list = get_ipr_features(self.selected_iso, self.canonical)
   1299             # construct fragments from features found
   1300             fragnum = 0

~/anaconda3/lib/python3.5/site-packages/kami-0.1-py3.5.egg/anatomizer/new_anatomizer.py in get_ipr_features(selected_ac, canon)
    567         search_ac = selected_ac
    568     entry = ipr_matches_root.find("protein[@id='%s']" % search_ac)
--> 569     matchlist = entry.findall('match')
    570     for feature in matchlist:
    571         if feature.get('dbname') not in ignorelist:

AttributeError: 'NoneType' object has no attribute 'findall'

eugeniashurko commented 7 years ago

UPD: this happens only in 'online' mode (in offline mode it works).
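
The AttributeError in the traceback above is what Python raises when `find` returns `None`, i.e. when no `protein` element carries the requested id. A minimal sketch of a guard, assuming the matches file is parsed with `xml.etree.ElementTree`; the element and attribute names `protein`, `match` and `dbname` come from the traceback, while the root element name, the sample data and the helper `find_matches` are hypothetical. It does not explain why the lookup misses in online mode, it only turns the crash into an empty result:

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the matches document. Only the element and
# attribute names "protein", "match" and "dbname" appear in the traceback;
# the root name and the match ids below are placeholders.
SAMPLE_XML = """
<matches>
  <protein id="P11362">
    <match id="EXAMPLE_1" dbname="DB_A"/>
    <match id="EXAMPLE_2" dbname="DB_B"/>
  </protein>
</matches>
"""


def find_matches(root, accession):
    """Return the <match> children for `accession`, or [] if there is no entry."""
    entry = root.find("protein[@id='%s']" % accession)
    if entry is None:
        # No <protein> element with this id: report "no features" instead of
        # failing later with AttributeError on None.
        return []
    return entry.findall("match")


root = ET.fromstring(SAMPLE_XML)
print(len(find_matches(root, "P11362")))           # 2
print(len(find_matches(root, "ENSP00000432972")))  # 0
```

Returning an empty list keeps the calling loop unchanged; raising a dedicated error that names the missing id would be the stricter alternative.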

hmedina commented 6 years ago

Note: P11362 is not a UniProtID; it is a UniProt accession number. See the raw text page of the entry for the fields (the ID and AC lines).

Back in the days of the Exploratorium we ran into what seems to be this same issue. Mitchel used to say that UniProtACs were not unique, but UniProtIDs were. So we ended up using FGFR1_HUMAN rather than P11362. That was 6 years ago, so I don't know if UniProt has changed to avoid this issue.
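
To make the distinction concrete, here is a rough sketch that tells the two kinds of identifier apart by shape before a query is dispatched. The accession pattern is the regular expression given in UniProt's help pages; the entry-name pattern is a simplifying assumption (an alphanumeric prefix, an underscore, a species code of up to five characters), and `classify_uniprot_identifier` is a hypothetical helper, not part of KAMI:

```python
import re

# Accession number pattern as documented in UniProt's help pages.
ACCESSION_RE = re.compile(
    r"[OPQ][0-9][A-Z0-9]{3}[0-9]"
    r"|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2}"
)

# Entry name (the "UniProtID", e.g. FGFR1_HUMAN). Assumed shape only:
# up to 10 alphanumerics, an underscore, up to 5 alphanumerics.
ENTRY_NAME_RE = re.compile(r"[A-Z0-9]{1,10}_[A-Z0-9]{1,5}")


def classify_uniprot_identifier(query):
    """Return 'accession', 'entry_name' or 'unknown' for a query string."""
    if ACCESSION_RE.fullmatch(query):
        return "accession"
    if ENTRY_NAME_RE.fullmatch(query):
        return "entry_name"
    return "unknown"


for query in ("P11362", "FGFR1_HUMAN", "ENSP00000432972"):
    print(query, classify_uniprot_identifier(query))
```

A check like this only inspects the shape of the string; it cannot tell whether an accession or an entry name actually exists, or whether it is still current.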