EGA-archive / beacon2-ri-api

Beacon v2 Reference Implementation (API)
Apache License 2.0

hierarchical query with parent node to get all individuals under parent node returns no results #359

Open dcci1 opened 1 month ago

dcci1 commented 1 month ago

I've set up a Beacon API instance and want to perform a hierarchical query using a parent ontology term, so that I get all individuals annotated with any term under that parent. Currently such a query returns 0 results, even though there are child terms under the parent node.

I am using a slightly different data model from the one provided in the dummy data. I've tested multiple other queries so far and everything else is working. I've also run the commands provided in the deploy readme. Any help would be much appreciated.

costero-e commented 2 weeks ago

Hi @dcci1, thank you for reporting the issue.

To make a descendant-terms query work, please check that you have followed these steps:

  1. Verify that the "parent term" is correctly inserted in the filtering terms collection, like this:
    {
      "type": "ontology",
      "id": "NCIT:C42331",
      "label": "African",
      "scopes": [
        "individual"
      ]
    }
  2. Make sure the descendants you want to accept for the "parent term" are inserted in the similarities collection, like this (here adding the Nigerian term under African):
    {
    id: 'NCIT:C42331',
    descendants: [
        'NCIT:C43834'
    ]
    }
  3. Send the query using the parent term (in this example, just a GET request through the browser):
    http://localhost:5050/api/individuals?filters=NCIT:C42331
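
The three steps above can be sketched in Python. This is a minimal sketch, not the project's own tooling: it builds the two documents and the query URL with the standard library only. The optional `upsert_into_mongo` helper is hypothetical; it assumes the collections are named `filtering_terms` and `similarities`, that the database is called `beacon`, and that `pymongo` is installed, so adjust those to your deployment.

```python
import json
from urllib.parse import urlencode

PARENT_ID = "NCIT:C42331"  # African
CHILD_ID = "NCIT:C43834"   # Nigerian


def filtering_term(term_id, label, scopes):
    """Step 1: document for the filtering terms collection."""
    return {"type": "ontology", "id": term_id, "label": label, "scopes": list(scopes)}


def similarity(term_id, descendants):
    """Step 2: document listing the descendants accepted for a parent term."""
    return {"id": term_id, "descendants": list(descendants)}


def query_url(base_url, term_id):
    """Step 3: GET URL filtering individuals by the parent term."""
    # safe=":" keeps the CURIE readable instead of encoding ":" as %3A.
    return f"{base_url}/api/individuals?" + urlencode({"filters": term_id}, safe=":")


parent_term = filtering_term(PARENT_ID, "African", ["individual"])
parent_similarity = similarity(PARENT_ID, [CHILD_ID])
url = query_url("http://localhost:5050", PARENT_ID)


def upsert_into_mongo(mongo_uri, db_name="beacon"):
    """Hypothetical helper: database and collection names are assumptions."""
    from pymongo import MongoClient  # imported lazily; only needed here

    db = MongoClient(mongo_uri)[db_name]
    # Matching on "id" alone is an assumption about how terms are keyed.
    db["filtering_terms"].replace_one({"id": PARENT_ID}, parent_term, upsert=True)
    db["similarities"].replace_one({"id": PARENT_ID}, parent_similarity, upsert=True)


print(json.dumps(parent_similarity, indent=2))
print(url)
```

Running the script prints the similarities document and the URL from step 3; `upsert_into_mongo` is only defined, not called, so nothing touches the database unless you invoke it with your own connection string.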

With this in place, my queries work. Please let me know if this was your issue, or if it is something else and I have misunderstood.

Thanks, Oriol

dcci1 commented 6 days ago

Hi @costero-e thanks for your response.

I am currently only inserting biosamples, individuals and datasets into the database when I bring up the Docker containers. I run the docker cp and docker exec commands, then run reindex.py and extract_filtering_terms.py.

In my individuals.json file, I sometimes only have data annotated with a child term and not with its parent. Will I need to manually insert all the parents into the filtering terms collection and update the similarities collection as well?

Also, do you recommend creating similarities.json and filtering_terms.json files to include as part of my initial database seeding process?

Thanks

costero-e commented 5 days ago

Hi @dcci1,

Thanks for your reply. Although I said to add the descendants, the process can also be done the other way round, if you insert the parents for the child term you already have in filtering terms. Hence, there is no need to insert more filtering terms 👍.

On the other hand, the extract_filtering_terms.py script that you execute only exists to avoid having to insert all the terms "manually". In theory, this task should be done by cherry-picking the terms you want your beacon to allow in a query. So, rather than starting from a filtering_terms.json or similarities.json file, I would first decide on the list of filtering terms and similarities you want to include in your beacon, and then create them either in a .json file or by inserting them into MongoDB directly.
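
As an illustration of the "decide the list first, then create the documents" approach, here is a minimal sketch that turns a hand-picked set of terms and child-to-parent links into seed documents. The helper names and the output file names are hypothetical. It also assumes that each ancestor should list all of its transitive descendants in similarities, which you should check against how your beacon resolves the query.

```python
import json
from pathlib import Path

# Hand-picked terms the beacon should accept in queries (id -> label).
LABELS = {
    "NCIT:C42331": "African",
    "NCIT:C43834": "Nigerian",
}
# Child -> parent links between those terms.
PARENT_OF = {"NCIT:C43834": "NCIT:C42331"}


def build_filtering_terms(labels, scopes=("individual",)):
    """One filtering-term document per chosen ontology term."""
    return [
        {"type": "ontology", "id": term_id, "label": label, "scopes": list(scopes)}
        for term_id, label in sorted(labels.items())
    ]


def build_similarities(parent_of):
    """Invert child -> parent links into one document per ancestor."""
    descendants = {}
    for child in parent_of:
        ancestor, seen = parent_of.get(child), set()
        # Walk up the hierarchy; `seen` stops the walk on accidental cycles.
        while ancestor is not None and ancestor not in seen:
            seen.add(ancestor)
            descendants.setdefault(ancestor, []).append(child)
            ancestor = parent_of.get(ancestor)
    return [
        {"id": term_id, "descendants": sorted(children)}
        for term_id, children in sorted(descendants.items())
    ]


def write_seed_files(out_dir, labels, parent_of):
    """Write both document lists as JSON arrays (file names are assumptions)."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    terms = json.dumps(build_filtering_terms(labels), indent=2)
    sims = json.dumps(build_similarities(parent_of), indent=2)
    (out_dir / "filtering_terms.json").write_text(terms)
    (out_dir / "similarities.json").write_text(sims)


print(json.dumps(build_similarities(PARENT_OF), indent=2))
```

Calling `write_seed_files("seed", LABELS, PARENT_OF)` would produce two JSON array files that can be loaded alongside the rest of the seed data, while keeping the list of accepted terms under your control.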

Let me know if you have any other doubts.

Best,

Oriol