
Create regression testing infrastructure based on FDA orphan drug indications [copied from biothings / BioThings_Explorer_TRAPI] #9

ariutta opened this issue 3 years ago

ariutta commented 3 years ago

This issue was created in another repo: https://github.com/biothings/BioThings_Explorer_TRAPI/issues/117

Moving future discussion here.

ariutta commented 3 years ago

AS EDIT 2021-05-21: Separated out the DrugMechDB analysis into its own issue, https://github.com/biothings/BioThings_Explorer_TRAPI/issues/181; see the strikethrough below.

Copying over some content from the previous issue:

We would like to create a regression testing framework to quantitatively assess BTE's performance. As a gold standard, we can use the orphan drug indication dataset mentioned in NCATSTranslator/Relay#123 ~~or the mechanistic paths from https://sulab.github.io/DrugMechDB/~~. For each entry in the gold standard, we should create a TRAPI query (examples), send it to BTE using a small library of plausible metapaths focused on drug repurposing, and then assess whether BTE was able to retrieve the right drug among the results. (Later we can also assess where that drug ranked among all potential drugs retrieved.) We would want to execute this test on a regular basis (weekly?), and then have a simple web page where results can be viewed/browsed.
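To make the idea concrete, here is a minimal sketch of one such check in Python. The endpoint URL, the TRAPI field names (which vary slightly by TRAPI version), and the example CURIEs are assumptions for illustration, not an entry from the OOPD file or the repo's actual code.

```python
# Minimal sketch: build one disease -> drug TRAPI query and check whether the
# expected (gold-standard) drug comes back from BTE.
# Assumptions: BTE's public TRAPI endpoint URL, TRAPI 1.x field names, and
# placeholder CURIEs; adjust all of these to the real test setup.
import requests

BTE_URL = "https://api.bte.ncats.io/v1/query"  # assumed public BTE TRAPI endpoint


def build_treats_query(disease_curie: str) -> dict:
    """One simple metapath: 'which small molecules treat this disease?'"""
    return {
        "message": {
            "query_graph": {
                "nodes": {
                    "n0": {"ids": [disease_curie], "categories": ["biolink:Disease"]},
                    "n1": {"categories": ["biolink:SmallMolecule"]},
                },
                "edges": {
                    "e01": {
                        "subject": "n1",
                        "object": "n0",
                        "predicates": ["biolink:treats"],
                    }
                },
            }
        }
    }


def drug_retrieved(response_json: dict, expected_drug_curie: str) -> bool:
    """True if the expected drug CURIE appears among the knowledge graph nodes."""
    nodes = response_json.get("message", {}).get("knowledge_graph", {}).get("nodes", {})
    return expected_drug_curie in nodes


# Placeholder CURIEs, purely illustrative.
query = build_treats_query("MONDO:0005301")
resp = requests.post(BTE_URL, json=query, timeout=600)
resp.raise_for_status()
print(drug_retrieved(resp.json(), "CHEBI:15365"))
```

Ranking (where the gold-standard drug falls among all retrieved drugs) could later be computed from the ordered `message.results` list in the same response.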

tagging @ariutta and @AlexanderPico

and

Brief summary: we now have a workflow, `bte_regression_test.yml`, that is configured to run every Saturday as well as on demand. (There's also an abbreviated version of the same workflow: `bte_regression_test_quick_demo.yml`.)

It pulls the latest BTE TRAPI Docker image to create a local instance of the API and then runs every query template in `src/query_templates` against that API. It currently saves the resulting `full_results.csv` file as a GitHub Actions artifact, but the future plan is to create a static page showing summary stats over time, as well as a Jupyter notebook for interactively exploring the results, potentially hosted via Colab or mybinder. Another future improvement: use batch queries to speed up the tests. Right now, it takes 1 hour to run the six query templates for the first two lines of the OOPD file and the better part of a day to run the entire test.
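For orientation, the core loop of the workflow looks roughly like the sketch below. The local port, the assumption that templates are JSON files with a `{{ disease_curie }}`-style placeholder, and the CSV columns are all illustrative guesses, not the actual `full_results.csv` schema or the repo's code.

```python
# Rough sketch of the test loop: run every query template against the local
# dockerized BTE instance and append one summary row per run to a CSV.
# Assumptions: local BTE on port 3000, JSON templates with a placeholder
# string, and illustrative CSV columns.
import csv
import glob
import json

import requests

LOCAL_BTE = "http://localhost:3000/v1/query"  # assumed port for the local container


def run_template(template_path: str, disease_curie: str) -> dict:
    """Fill the template's placeholder, send the query, return the TRAPI response."""
    with open(template_path) as f:
        raw = f.read()
    query = json.loads(raw.replace("{{ disease_curie }}", disease_curie))
    resp = requests.post(LOCAL_BTE, json=query, timeout=3600)
    resp.raise_for_status()
    return resp.json()


with open("full_results.csv", "w", newline="") as out:
    writer = csv.writer(out)
    writer.writerow(["template", "disease", "n_results"])
    for template_path in sorted(glob.glob("src/query_templates/*.json")):
        # One OOPD disease shown here; the real run iterates over the whole file.
        result = run_template(template_path, "MONDO:0005301")
        n_results = len(result.get("message", {}).get("results", []))
        writer.writerow([template_path, "MONDO:0005301", n_results])
```

Batching would mean sending several gold-standard entries per request (or several templates concurrently) instead of one query at a time, which is where most of the current runtime goes.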

If you want to see some completed workflow runs, take a look at the Actions tab of my fork (https://github.com/ariutta/bte_regression_test/actions). You'll notice the full test timed out, but the quick demo finished, so you can see its result saved as an artifact.