biolink / biolink-model-toolkit

A collection of useful python functions for looking up information and working with the Biolink Model
https://biolink.github.io/biolink-model-toolkit/
BSD 3-Clause "New" or "Revised" License

get_descendants() errors when called with an alias #163

Closed brettasmi closed 1 month ago

brettasmi commented 3 months ago

We use bmt to search Biolink for compatible descendants when our components receive a query. Recently, a user queried one of our systems with biolink:ameliorates and kept getting error messages. It turns out bmt was raising ValueError("not a valid biolink component") for the ameliorates term. Two examples of this behavior:

t.get_descendants('biolink:ameliorates')
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[3], line 1
----> 1 t.get_descendants('biolink:ameliorates')

File ~/.pyenv/versions/3.11.3/envs/bmt/lib/python3.11/site-packages/bmt/toolkit.py:788, in Toolkit.get_descendants(self, name, reflexive, formatted, mixin)
    786         filtered_desc = desc
    787 else:
--> 788     raise ValueError("not a valid biolink component")
    790 return self._format_all_elements(filtered_desc, formatted)

ValueError: not a valid biolink component

t.get_descendants('biolink:realized_in_response_to')
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[5], line 1
----> 1 t.get_descendants('biolink:realized_in_response_to')

File ~/.pyenv/versions/3.11.3/envs/bmt/lib/python3.11/site-packages/bmt/toolkit.py:788, in Toolkit.get_descendants(self, name, reflexive, formatted, mixin)
    786         filtered_desc = desc
    787 else:
--> 788     raise ValueError("not a valid biolink component")
    790 return self._format_all_elements(filtered_desc, formatted)

ValueError: not a valid biolink component

Is this expected behavior?

If so, is the recommended approach to always look up the canonical name via get_element before calling get_descendants?

Thanks--bmt is very helpful for us!

sierra-moxon commented 3 months ago

Hi @brettasmi - which version of Biolink Model (via biolink-model-toolkit) are you instantiating with your tool? The get_descendants method does call get_element as its first step, so you should be able to call get_descendants without first calling get_element.

The get_element method does walk through the aliases in the model to try to match the string supplied by the user to an actual element in the model, but I think in this case it isn't currently smart enough to translate biolink:realized_in_response_to into realized in response to before doing so. The same applies to biolink:ameliorates (ameliorates is an alias of biolink:ameliorates_condition in v4.1.5, the default version of Biolink Model that BMT currently uses).
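To illustrate the kind of alias matching described above, here is a minimal sketch. It is not BMT's actual implementation: TOY_MODEL, normalize, build_alias_index, and resolve are hypothetical names, and the toy table only contains the one alias relationship mentioned in this thread. It also only handles snake_case slot names, not CamelCase class names.

```python
# Toy stand-in for the model; NOT BMT's internal data structure.
# The single alias entry mirrors the relationship described in this thread.
TOY_MODEL = {
    "ameliorates condition": {"aliases": ["ameliorates"]},
    "related to": {"aliases": []},
}


def normalize(name: str) -> str:
    """Strip a leading 'biolink:' prefix and turn snake_case into space case."""
    prefix = "biolink:"
    if name.startswith(prefix):
        name = name[len(prefix):]
    return name.replace("_", " ")


def build_alias_index(model: dict) -> dict:
    """Map every canonical name and every alias to its canonical name."""
    index = {}
    for canonical, info in model.items():
        index[canonical] = canonical
        for alias in info["aliases"]:
            index[normalize(alias)] = canonical
    return index


def resolve(name: str, index: dict):
    """Return the canonical element name for a name or alias, or None."""
    return index.get(normalize(name))


INDEX = build_alias_index(TOY_MODEL)
print(resolve("biolink:ameliorates", INDEX))  # ameliorates condition
```

Normalizing both the stored aliases and the incoming string means that a prefixed, snake_case alias and its plain space-separated form land on the same key.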

I'll use this ticket to update BMT to query aliases formatted in snake case as well.

brettasmi commented 3 months ago

@sierra-moxon, the tools are using Biolink Model 4.1.4, and the interactive testing that I pasted in the ticket was using 4.1.5. Both were with BMT 1.2.1.

Thanks for taking a look at this. In the meantime, is the suggested workaround to do the string manipulation ourselves, that is, strip the biolink: prefix and replace the underscores with spaces before calling get_descendants?
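For reference, the workaround being asked about could look roughly like the sketch below. It assumes that the un-prefixed, space-separated alias resolves in the installed BMT version, which the maintainer's description suggests but which is not confirmed in this thread. The helper names to_space_case and get_descendants_with_fallback are hypothetical; only Toolkit.get_descendants and the ValueError it raises come from the traceback above.

```python
def to_space_case(name: str) -> str:
    """Strip a leading 'biolink:' prefix and replace underscores with spaces."""
    prefix = "biolink:"
    if name.startswith(prefix):
        name = name[len(prefix):]
    return name.replace("_", " ")


def get_descendants_with_fallback(toolkit, name, **kwargs):
    """Try the name as given; on ValueError, retry once with the space-case form.

    `toolkit` is expected to behave like bmt.Toolkit, i.e. to expose
    get_descendants(name, ...) and raise ValueError for unknown names.
    """
    try:
        return toolkit.get_descendants(name, **kwargs)
    except ValueError:
        fallback = to_space_case(name)
        if fallback == name:
            # Nothing to normalize, so the original error stands.
            raise
        return toolkit.get_descendants(fallback, **kwargs)


# Intended usage (requires bmt to be installed, so it is left commented out):
# from bmt import Toolkit
# t = Toolkit()
# get_descendants_with_fallback(t, "biolink:ameliorates")
```

Trying the original string first keeps the behavior unchanged for names that already work, and the retry becomes a no-op once a BMT release handles prefixed aliases itself.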

brettasmi commented 3 months ago

Hi @sierra-moxon, checking to see if you agree with my proposed workaround ☝️ for now. Thanks!

sierra-moxon commented 3 months ago

Hi @brettasmi - sorry for the delay. I was able to release a fresh BMT with support for checking "biolink"-prefixed aliases against the non-biolink-prefixed aliases in the model.

Can you please try updating and let me know if you have any issues? Thanks, Sierra