NCATSTranslator / minihackathons

Workflow B: SemMedDB "Novelty" constraint #313

Open karafecho opened 2 years ago

karafecho commented 2 years ago

This issue is to follow up on Andrew Su's suggestion to consider ways to leverage SemMedDB's "Novelty" constraint by setting it to Novelty=0 either in the query itself or via another approach.

andrewsu commented 2 years ago

Just some quick notes about how we can check whether this is likely to have much/any impact before we actually write any additional code.

I believe the issue to be addressed can be seen in the ARAX response for Query B.1. The observation is that many non-specific nodes (e.g., "cytokine", "MicroRNAs", "agonists") appear in the results. The hypothesis is that filtering on the "novelty" node property in SemMedDB (eliminating nodes where novelty = 0) would enrich for results that SMEs would care about.
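To make the filtering hypothesis above concrete, here is a minimal sketch of the check, assuming each result node has already been annotated with a numeric `novelty` property. The function names (`drop_non_novel`, `filter_impact`), the dictionary layout, and the placeholder node names are all hypothetical; none of them come from ARAX or the SemMedDB API, and the novelty values are invented purely for illustration.

```python
from typing import Any, Dict, List

Node = Dict[str, Any]


def drop_non_novel(nodes: List[Node], key: str = "novelty") -> List[Node]:
    """Return only the nodes whose novelty is not 0.

    Nodes that lack the property are kept, on the assumption that a
    missing score should not be treated as "non-specific".
    """
    return [node for node in nodes if node.get(key) != 0]


def filter_impact(nodes: List[Node], key: str = "novelty") -> Dict[str, int]:
    """Count how many nodes the novelty filter would remove."""
    kept = drop_non_novel(nodes, key)
    return {
        "total": len(nodes),
        "kept": len(kept),
        "removed": len(nodes) - len(kept),
    }


# Placeholder records with invented novelty values (not real SemMedDB data).
example_nodes = [
    {"name": "example_generic_term_1", "novelty": 0},
    {"name": "example_generic_term_2", "novelty": 0},
    {"name": "example_specific_term", "novelty": 1},
    {"name": "example_unscored_term"},  # no novelty property at all
]

impact = filter_impact(example_nodes)
print(impact)
```

Running something like `filter_impact` over the actual B.1 result nodes would show how many would be dropped, which gives a rough sense of whether the filter has much impact before any pipeline code is changed.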

The Service Provider has created a new API for SemMedDB that includes the novelty score at https://biothings.ncats.io/semmeddb. So let's check a few of the B.1 results to see what their novelty scores are: