gyorilab / gilda

Grounding of biomedical named entities with contextual disambiguation
BSD 2-Clause "Simplified" License

Problem with HGNC term generation #17

Closed cthoyt closed 4 years ago

cthoyt commented 4 years ago

I got the following error when running python -m gilda.generate_terms:

Traceback (most recent call last):
  File "/usr/local/Cellar/python/3.7.7/Frameworks/Python.framework/Versions/3.7/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/local/Cellar/python/3.7.7/Frameworks/Python.framework/Versions/3.7/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/Users/cthoyt/dev/gilda/gilda/generate_terms.py", line 385, in <module>
    main()
  File "/Users/cthoyt/dev/gilda/gilda/generate_terms.py", line 378, in main
    terms = get_all_terms()
  File "/Users/cthoyt/dev/gilda/gilda/generate_terms.py", line 366, in get_all_terms
    generate_hgnc_terms(),
  File "/Users/cthoyt/dev/gilda/gilda/generate_terms.py", line 73, in generate_hgnc_terms
    if row['Synonyms'] and row['Synonyms']:
KeyError: 'Synonyms'

I think there's a typo in the following code: https://github.com/indralab/gilda/blob/25a4c10c3463d511e0478364a1e65fe7ef925631/gilda/generate_terms.py#L72

Changing it to the following seems to fix it:

if 'Synonyms' in row and row['Synonyms']:

bgyori commented 4 years ago

Actually, under the assumptions of the current and past versions of the code (which I just reviewed), this should be changed to just

if row['Synonyms']:

The fact that you get a KeyError indicates that something about the input file has changed and that may have broader implications that we should look into.
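
One way to surface this kind of input-file change early is to check the header before processing any rows. The following is only a minimal sketch, not code from gilda: the helper name check_columns, the REQUIRED_COLUMNS set, and the tab-separated sample are all assumptions made for illustration.

```python
import csv
import io

# Hypothetical set of column names the term generator is assumed to rely on.
REQUIRED_COLUMNS = {"HGNC ID", "Approved symbol", "Alias symbols"}


def check_columns(fieldnames, required=REQUIRED_COLUMNS):
    """Raise a descriptive error if any expected column is absent."""
    missing = sorted(set(required) - set(fieldnames or []))
    if missing:
        raise ValueError(
            f"Input file is missing expected columns: {missing}"
        )


# Made-up sample that still uses the old 'Synonyms' header.
sample = "HGNC ID\tApproved symbol\tSynonyms\nHGNC:5\tA1BG\t\n"
reader = csv.DictReader(io.StringIO(sample), delimiter="\t")
try:
    check_columns(reader.fieldnames)
except ValueError as err:
    # The message names the missing column instead of a bare KeyError
    # surfacing deep inside the row loop.
    print(err)
```

Failing on the header rather than on the first row makes it clearer that the resource file changed, not the row data.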

bgyori commented 4 years ago

Okay, yes: HGNC renamed Synonyms to Alias symbols in their latest release and this change has been propagated into INDRA's HGNC resource file as well. So this should be changed to

if row['Alias symbols']:

and any other place where it is used.
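
Since the column is read in more than one place, a small helper could keep the column name in a single spot. This is a rough sketch under stated assumptions, not the change that was actually made in gilda: the function name get_aliases, the fallback to the old Synonyms header, and the comma-separated format of the field are all hypothetical.

```python
def get_aliases(row, keys=("Alias symbols", "Synonyms")):
    """Return the alias list from a row, trying the new column name first.

    `row` is assumed to be a dict-like mapping of column name to string.
    The fallback to 'Synonyms' is only there to tolerate older files.
    """
    for key in keys:
        value = row.get(key)
        if value:
            # Assumes aliases are comma-separated within the field.
            return [s.strip() for s in value.split(",") if s.strip()]
    return []


# Works with either header; the rows below are made-up examples.
new_style = {"Approved symbol": "GENE1", "Alias symbols": "ALIAS1, ALIAS2"}
old_style = {"Approved symbol": "GENE1", "Synonyms": "ALIAS1, ALIAS2"}
print(get_aliases(new_style))
print(get_aliases(old_style))
```

If only the latest HGNC release needs to be supported, dropping the fallback and indexing row['Alias symbols'] directly, as suggested above, keeps the failure loud when the file changes again.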