biothings / biothings_explorer_archived

BioThings Explorer: a schema-based client for API interoperability
Apache License 2.0

better document /jupyter notebooks/drug_response_predict.ipynb #137

Open andrewsu opened 3 years ago

andrewsu commented 3 years ago

We have a very nice notebook that demonstrates annotating nodes and edges in BTE queries, and using those annotations to filter and/or rank. https://github.com/biothings/biothings_explorer/blob/master/jupyter%20notebooks/drug_response_predict.ipynb

To make it easier to digest for new users, let's add some additional documentation. For example, the notebook currently contains this information in a markdown cell:

For the first step of the query (Disease - Gene), we specify to use a Normalized Google Distance (NGD) filter which only return associations with NGD score less than 0.6. For the second step of the query (Gene - Chemical), we specify to filter for association only under the context of Breast Cancer and limit the predicate to gene_has_variant_that_contributes_to_drug_response. Finally, we ask the results to be annotated with drugPhase, nodeDegree and survivalProbability information.

It would be good to explain where the NGD, association data, drugPhase, nodeDegree, and survivalProbability information come from, since these are not simple queries on the SmartAPI meta-kg.
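As one illustration of the kind of explanation that could accompany the NGD filter, here is a minimal sketch of the standard Normalized Google Distance formula computed from co-occurrence counts, followed by the "less than 0.6" cutoff mentioned in the markdown cell. This is not BTE's implementation: the function name, the example counts, and the corpus size are all hypothetical, and where BTE actually gets its counts from is exactly what the documentation still needs to state.

```python
import math


def normalized_google_distance(fx, fy, fxy, n):
    """Standard NGD formula.

    fx, fy: number of documents mentioning each term (hypothetical counts)
    fxy:    number of documents mentioning both terms
    n:      total number of documents in the corpus
    """
    if fxy == 0:
        # Terms never co-occur: treat the distance as infinite.
        return math.inf
    log_fx, log_fy = math.log(fx), math.log(fy)
    numerator = max(log_fx, log_fy) - math.log(fxy)
    denominator = math.log(n) - min(log_fx, log_fy)
    return numerator / denominator


# Hypothetical disease-gene associations with made-up counts.
associations = [
    {"gene": "GENE_A", "fx": 1000, "fy": 100, "fxy": 10},
    {"gene": "GENE_B", "fx": 1000, "fy": 1000, "fxy": 1},
]
CORPUS_SIZE = 1_000_000  # assumed corpus size, for illustration only
NGD_CUTOFF = 0.6         # threshold quoted in the notebook's markdown cell

for assoc in associations:
    assoc["ngd"] = normalized_google_distance(
        assoc["fx"], assoc["fy"], assoc["fxy"], CORPUS_SIZE
    )

# Keep only associations whose NGD score is below the cutoff.
kept = [a for a in associations if a["ngd"] < NGD_CUTOFF]
```

A lower NGD means the two terms co-occur more often than their individual frequencies would suggest, so a "less than 0.6" filter keeps the more closely related pairs.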

Similarly, in df2 = pd.display_table_view(extra_fields=["drug_phase", "survival_prob_change", "ngd"]), it would be good to explain what the extra_fields parameter does.

There are probably other places where the inline documentation could better explain what is happening and the intent of each operation.