monarch-initiative / monarch-app

Monarch Initiative website and API
https://monarchinitiative.org/
BSD 3-Clause "New" or "Revised" License

Show all direct ancestors/descendants on node pages #667

Closed monarch-issues-tracker[bot] closed 6 months ago

monarch-issues-tracker[bot] commented 7 months ago

Name Horace Liu

Email horaceliu1997@gmail.com

GitHub Username @liuaxian

Details Page: /feedback Browser: Chrome 123.0.0.0 Device: Apple Macintosh OS: Mac OS 10.15.7 Engine: Blink 123.0.0.0

The issue arises from the discrepancy between the ancestral relationship provided in the MONDO ontology and the actual presence of the "immune system disorder" term under the ancestor "MONDO:0700096" on the Monarch Initiative website.

It appears that according to the MONDO ontology, "immune system disorder" (MONDO:0005046) is listed as a descendant of "MONDO:0700096." However, upon inspection of the Monarch Initiative website page for "MONDO:0700096," there is no mention of "immune system disorder" among its descendants.

This inconsistency raises questions about the accuracy and completeness of the ontology representation on the Monarch Initiative website. It is important to ensure that the ontology structure accurately reflects the relationships between different terms to maintain data integrity and usability for researchers and practitioners in the field.

Therefore, I kindly request clarification regarding this discrepancy and any insights into why the mentioned term is not listed under its purported ancestor on the Monarch Initiative website. Clarification on this matter would greatly assist in ensuring the accuracy and reliability of the MONDO ontology for users relying on this resource for biomedical research and clinical applications. Thank you for your attention to this matter.

kevinschaper commented 7 months ago

Thank you so much for catching this @liuaxian!

I’m almost certain I know where the bug is. These associations are populated using a generic association search command (the same one that we expose externally in our API), which has a default max of 20 records per request. We need to make sure we’re overriding that with a much higher number so that we don’t silently leave off these subclass associations.

liuaxian commented 6 months ago

Thank you for your response and for clarifying the issue with the API's default limit on association searches. It's good to know that the default behavior restricts the number of records returned to 20 per request.

Currently, I'm facing a challenge regarding the completeness of the retrieved data. Despite attempting to use the v3 Monarch API to iteratively retrieve all subclasses under the identifier MONDO:0007179, I've only managed to obtain 149 entries. It seems that many diseases are missing from this dataset, and unfortunately, the get entity method doesn't provide a parameter to set limits.

Given this situation, I'm wondering if there's a more effective approach to retrieve comprehensive information about all subclasses under a specific ID. For instance, I'm particularly interested in obtaining a complete list of subclasses for "autoimmune disease" (MONDO:0007179).

Do you have any suggestions or alternative methods that would allow me to obtain a full count and detailed information of all subclasses associated with a given ID, such as MONDO:0007179?

Thank you for your assistance and guidance, @kevinschaper.

kevinschaper commented 6 months ago

The Monarch API isn't actually set up to answer that very well yet. I think it would be possible to use the association API to get subclass_of associations, taking advantage of the closure to fetch all of the associations, and then take the distinct set of all object fields, but it's not a pretty solution.
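
A rough sketch of that approach, for illustration only: page through a limit/offset endpoint until every record has been seen, then collect the distinct values of one field. Everything named here is an assumption rather than the Monarch API itself: `fetch_page` is a hypothetical callable standing in for one association request, the response is assumed to be a dict with an `items` list and an optional `total` count, and which field (`subject` or `object`) holds the terms of interest depends on how the query is phrased.

```python
from typing import Callable, Dict, Iterable, Iterator, List


def iter_all_items(
    fetch_page: Callable[[int, int], Dict],
    page_size: int = 500,
) -> Iterator[Dict]:
    """Yield every record from a limit/offset endpoint, not just the first page.

    `fetch_page(offset, limit)` is a hypothetical stand-in for one request;
    it is assumed to return {"items": [...], "total": <int, optional>}.
    """
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        items = page.get("items", [])
        if not items:
            break  # an empty page means there is nothing left to read
        yield from items
        offset += len(items)
        total = page.get("total")
        if total is not None and offset >= total:
            break  # the reported total has been reached


def distinct_field(items: Iterable[Dict], field: str) -> List[str]:
    """Return the distinct values of `field`, in first-seen order."""
    seen = set()
    ordered = []
    for item in items:
        value = item.get(field)
        if value is not None and value not in seen:
            seen.add(value)
            ordered.append(value)
    return ordered


def make_fake_fetcher(records: List[Dict]) -> Callable[[int, int], Dict]:
    """Build an in-memory fetcher that mimics a paged endpoint (demo only)."""
    def fetch(offset: int, limit: int) -> Dict:
        return {"items": records[offset:offset + limit], "total": len(records)}
    return fetch
```

With a real client, `fetch_page` would wrap a single HTTP request that passes the offset and limit explicitly, so that a server-side default such as 20 never truncates the result silently.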

Ontology Access Kit is actually much better suited to answering this question. Here is an example Google Colab notebook showing an install of oaklib, as well as the use of this CLI command: runoak -i sqlite:obo:mondo descendants MONDO:0007179 -p i

Documentation for the descendants command is here: https://incatools.github.io/ontology-access-kit/cli.html#runoak-descendants
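
For use from a script rather than the command line, the same lookup can be sketched with oaklib's Python interface. This is a minimal sketch, not taken from the notebook: the helper names `descendant_ids` and `autoimmune_descendants` are made up for illustration, and it assumes that `pip install oaklib` has been run and that the `sqlite:obo:mondo` database can be downloaded on first use. The `rdfs:subClassOf` predicate is what `-p i` (is_a) abbreviates on the command line.

```python
from typing import List, Optional

# The is_a predicate; `-p i` on the runoak command line is shorthand for it.
IS_A = "rdfs:subClassOf"


def descendant_ids(
    adapter, curie: str, predicates: Optional[List[str]] = None
) -> List[str]:
    """Return the sorted, de-duplicated descendant CURIEs of `curie`.

    `adapter` is any object exposing oaklib's
    `descendants(start_curies, predicates=...)` method.
    """
    preds = predicates if predicates is not None else [IS_A]
    return sorted(set(adapter.descendants(curie, predicates=preds)))


def autoimmune_descendants() -> List[str]:
    """Fetch is_a descendants of MONDO:0007179 (needs oaklib and network access)."""
    from oaklib import get_adapter  # imported lazily so the helper above has no hard dependency

    adapter = get_adapter("sqlite:obo:mondo")
    return descendant_ids(adapter, "MONDO:0007179")


# Example (not run here because it downloads the MONDO database):
#     ids = autoimmune_descendants()
#     print(len(ids))
```

The returned list may include the starting term itself, depending on how the adapter treats reflexivity, so it is worth checking for that before reporting a count.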

liuaxian commented 6 months ago

Thank you for your guidance. I've explored the Ontology Access Kit and utilized the runoak command with the descendants function to resolve my query. It provided a comprehensive solution to my needs. I appreciate your recommendation and assistance in navigating this tool effectively.

kevinschaper commented 6 months ago

The parent/child query was updated in https://github.com/monarch-initiative/monarch-app/pull/671 to make sure that we don't hit the default limit of 20 anymore. This is on beta.monarchinitiative.org now, and should get to monarchinitiative.org in the next day or two.