TranslatorSRI / NodeNormalization

Service that produces Translator-compliant nodes given a CURIE
MIT License

/get_semantic_types endpoint is not working #153

Closed gaurav closed 10 months ago

gaurav commented 1 year ago

Both NodeNorm-dev and NodeNorm-prod are reporting that they don’t have any semantic types at the /get_semantic_types endpoint, which means that the /get_curie_prefixes endpoint doesn't work either. This is probably an error in loading the appropriate data into redis_connection3:

https://github.com/TranslatorSRI/NodeNormalization/blob/4b2caf9326ba5de6a8f2709a13e1823ff7c404ac/node_normalizer/server.py#L213-L232

Note that the documentation for both of these endpoints uses the old semantic types as examples and should also be fixed.
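To make the symptom concrete, here is a minimal sketch of how a client could detect the failure. The payload shape (`{"semantic_types": {"types": [...]}}`) is assumed from memory of the NodeNorm API and may not match the deployed service exactly; the function name is invented for illustration.

```python
# Hypothetical check for the empty-response symptom described above.
# Assumes /get_semantic_types returns {"semantic_types": {"types": [...]}}.

def semantic_types_missing(payload: dict) -> bool:
    """Return True if a /get_semantic_types payload contains no types."""
    types = payload.get("semantic_types", {}).get("types", [])
    return len(types) == 0

# A healthy response would list Biolink classes...
healthy = {"semantic_types": {"types": ["biolink:Gene", "biolink:Disease"]}}
# ...while the broken deployments report none.
broken = {"semantic_types": {"types": []}}

print(semantic_types_missing(healthy))  # False
print(semantic_types_missing(broken))   # True
```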

amykglen commented 1 year ago

is there any estimate on when this will be fixed? the synonymization portion of our RTX-KG2 build system relies on the /get_semantic_types and /get_curie_prefixes endpoints so we're currently unable to complete our Biolink 3.0 build. we're trying to gauge whether we should come up with a workaround or wait for this to be fixed.

gaurav commented 1 year ago

Hi @amykglen! I think it could be up to 3-4 weeks before we can get /get_semantic_types and /get_curie_prefixes fully up and running again. This is clearly a pretty old bug, since even NodeNorm-ITRB-Prod (which hasn't changed in a while) isn't returning this information. We might get lucky and stumble upon the bug pretty quickly, but I don't want to get your hopes up! I will report back as soon as we have an update.

In the meantime, I've modified one of my Babel validation programs to produce output that might be close enough to the /get_curie_prefixes output for your immediate needs. I've added some code that counts the number of times each prefix is used in each Biolink type. Unlike /get_curie_prefixes, this won't use the Biolink hierarchy, and so will only provide direct counts for Biolink types in Babel: it will return a count of identifiers for both biolink:SmallMolecule and its parent type, biolink:ChemicalEntity, but the count for the latter won't include the count for the former. My plan is to use this to cross-check the /get_semantic_types and /get_curie_prefixes outputs once they're working again, but if I can get this code working in a day or two, would that be a workable stop-gap solution for your workflow while you wait for the API endpoints to be fixed?
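The direct (non-rolled-up) counting described above could be sketched as follows. This is not Babel's actual code; the input shape and function name are invented for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical sketch of direct per-type prefix counts: each identifier
# contributes only to its own Biolink type, with no rollup into parent
# types such as biolink:ChemicalEntity.

def count_prefixes(identifiers):
    """identifiers: iterable of (biolink_type, curie) pairs."""
    counts = defaultdict(Counter)
    for biolink_type, curie in identifiers:
        prefix = curie.split(":", 1)[0]
        counts[biolink_type][prefix] += 1
    return counts

sample = [
    ("biolink:SmallMolecule", "CHEBI:15377"),
    ("biolink:SmallMolecule", "PUBCHEM.COMPOUND:962"),
    ("biolink:ChemicalEntity", "CHEBI:24431"),
]
counts = count_prefixes(sample)
# biolink:ChemicalEntity counts only its own identifier, not the two
# biolink:SmallMolecule identifiers below it in the hierarchy.
print(counts["biolink:ChemicalEntity"]["CHEBI"])  # 1
print(counts["biolink:SmallMolecule"]["CHEBI"])   # 1
```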

amykglen commented 1 year ago

thanks @gaurav! we found a cached copy of NodeNormalizer info on our end that we were able to use to get what we needed. but thank you for offering that stop-gap solution.

gaurav commented 1 year ago

When I was previously trying to figure out what was going wrong, I think I found the code that was supposed to be calculating semantic type information in the NodeNorm loader: https://github.com/TranslatorSRI/NodeNormalization/blob/a7b85f0511bf89ef98b71ba26b36b97822045296/node_normalizer/loader.py#L446-L484

I've confirmed that this is broken in r3_nodenorm 2.0.9, but I haven't tried the code in the current master branch. If that code is ready for release, I can try it when I figure out how to get the current Babel run to complete.
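For context, the aggregation that the linked loader code is meant to perform might look roughly like this. This is a sketch under assumptions, not the actual loader: a plain dict stands in for redis_connection3, and the record shape and names are illustrative.

```python
from collections import defaultdict

# Hypothetical sketch: while loading nodes, record each Biolink type and
# its per-prefix identifier counts, so /get_semantic_types and
# /get_curie_prefixes have data to serve. A dict stands in for Redis.

def build_semantic_type_index(nodes):
    """nodes: iterable of dicts with 'type' (Biolink class) and 'id' (CURIE)."""
    index = {
        "types": set(),
        "prefix_counts": defaultdict(lambda: defaultdict(int)),
    }
    for node in nodes:
        index["types"].add(node["type"])
        prefix = node["id"].split(":", 1)[0]
        index["prefix_counts"][node["type"]][prefix] += 1
    return index

idx = build_semantic_type_index([
    {"type": "biolink:Gene", "id": "NCBIGene:1017"},
    {"type": "biolink:Gene", "id": "HGNC:1771"},
])
print(sorted(idx["types"]))  # ['biolink:Gene']
```

If this step silently produces an empty index during a load, the endpoints would return nothing, which matches the symptom reported in this issue.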

amykglen commented 1 year ago

hey @gaurav - is /get_semantic_types still expected to be broken at this point? we will use cached info again if so, but just wanted to check. thanks!