Closed cthoyt closed 3 years ago
Thanks @cthoyt! I think mondo.json is missing from version control. Is the convention for MONDO to have the namespace embedded in IDs? In that case the right API for all the client functions would be to expect the prefix, as in your test for get_id_from_alt_id('MONDO:0018220'); otherwise, without the prefix would be better. We should also think about adding all the relevant xrefs to the ontology graph. Are those available through this OBO?
Thanks @cthoyt! I think mondo.json is missing from version control. Is the convention for MONDO to have the namespace embedded in IDs? In that case the right API for all the client functions would be to expect the prefix, as in your test for get_id_from_alt_id('MONDO:0018220'); otherwise, without the prefix would be better.
Yes, it's another OBO Foundry ontology so it has the same scheme as GO (for example)
We should also think about adding all the relevant xrefs to the ontology graph, are those available through this OBO?
Yes those are available. I just realized that we hadn't explicitly done that for another OBO or OWL so far
I see. In that case, the JSON should have the MONDO prefixes embedded under id, relations, etc.; see e.g. https://raw.githubusercontent.com/sorgerlab/indra/master/indra/resources/chebi.json. Then the IDs propagate into the rest of the code correctly without further preprocessing needed.
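To make the point about embedded prefixes concrete, here is a minimal sketch. The entry layout, the field names (id, alt_ids, relations), the placeholder name and the helper build_alt_id_index are assumptions for illustration only and are not copied from chebi.json or the INDRA code. The two identifiers are the ones from the doctest in this PR, written here with the MONDO: prefix.

```python
# Hypothetical resource entries: the field names below are assumed for
# illustration and may not match the real JSON resource files exactly.
entries = [
    {
        "id": "MONDO:0024812",          # primary ID with the prefix embedded
        "name": "placeholder name",     # illustrative only
        "alt_ids": ["MONDO:0002399"],   # alternative ID, also prefixed
        "relations": {},                # relation targets would be prefixed too
    }
]


def build_alt_id_index(entries):
    """Map every alternative ID to the primary ID of its entry."""
    index = {}
    for entry in entries:
        for alt_id in entry.get("alt_ids", []):
            index[alt_id] = entry["id"]
    return index


alt_to_primary = build_alt_id_index(entries)
```

With prefixes stored consistently in every field, a lookup takes and returns prefixed IDs and no caller has to add or strip anything.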
We got this test failure:
======================================================================
FAIL: Doctest: indra.databases.mondo_client.get_id_from_alt_id
----------------------------------------------------------------------
Traceback (most recent call last):
File "/opt/hostedtoolcache/Python/3.6.15/x64/lib/python3.6/doctest.py", line 2199, in runTest
raise self.failureException(self.format_failure(new.getvalue()))
AssertionError: Failed doctest test for indra.databases.mondo_client.get_id_from_alt_id
File "/home/runner/work/indra/indra/indra/databases/mondo_client.py", line 43, in get_id_from_alt_id
----------------------------------------------------------------------
File "/home/runner/work/indra/indra/indra/databases/mondo_client.py", line 57, in indra.databases.mondo_client.get_id_from_alt_id
Failed example:
assert '0024812' == mondo_client.get_id_from_alt_id('0002399')
Exception raised:
Traceback (most recent call last):
File "/opt/hostedtoolcache/Python/3.6.15/x64/lib/python3.6/doctest.py", line 1330, in __run
compileflags, 1), test.globs)
File "<doctest indra.databases.mondo_client.get_id_from_alt_id[1]>", line 1, in <module>
assert '0024812' == mondo_client.get_id_from_alt_id('0002399')
AssertionError
I suspect this could be due to the API assuming the MONDO: prefix in inputs and providing it in outputs?
I realized this was a bug in the OBO loader, where the prefix-removal logic was not applied to alt IDs. I think it's all gucci now
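For anyone reading along later, the kind of fix described here can be sketched roughly as follows. This is not the actual loader code: strip_prefix, normalize_entry and the entry layout are hypothetical names chosen for the example. The idea is only that whatever prefix handling is applied to the primary ID must also be applied to each alternative ID.

```python
def strip_prefix(identifier, prefix):
    """Drop a leading 'PREFIX:' from an identifier if it is present."""
    head = prefix + ":"
    if identifier.startswith(head):
        return identifier[len(head):]
    return identifier


def normalize_entry(entry, prefix, remove_prefix=True):
    """Return a copy of the entry with prefixes handled uniformly.

    The bug class discussed above corresponds to stripping the prefix
    from 'id' but forgetting to do the same for 'alt_ids'.
    """
    if not remove_prefix:
        return dict(entry)
    return {
        **entry,
        "id": strip_prefix(entry["id"], prefix),
        "alt_ids": [strip_prefix(a, prefix) for a in entry.get("alt_ids", [])],
    }


# IDs taken from the failing doctest, shown here in prefixed form.
raw = {"id": "MONDO:0024812", "alt_ids": ["MONDO:0002399"]}
normalized = normalize_entry(raw, "MONDO")
```

With both fields normalized the same way, a lookup keyed on '0002399' can resolve to '0024812', which is what the doctest expects.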
The Monarch Disease Ontology is the one where most of the interesting effort is going at the moment. Its maintainers are more open and receptive to curation and are working hard to align with OBO standards. It will be the most complete disease vocabulary and have the most mappings.
This PR adds a MONDO client using the OBO client as well as fixes a bug in the parsing of alternative identifiers.
Related: https://github.com/indralab/gilda/issues/50