Can't find the Symptomatic validation dataset

lingling93 commented 5 years ago

Hi Daniel, I want to evaluate my model exactly as you did, using the four validation datasets. Three of them (Disease Modifying, Clinical Trial, DrugCentral) are easy to get from the validation-datasets in your repository, but I have not found the Symptomatic dataset yet. Can you help me with that? Thank you! Lingling

dhimmel commented 5 years ago

Hi @lingling93, it looks like we compute the relevant performance visualizations and measures in the prediction/6-vizr.ipynb R notebook. This notebook reads probabilities.tsv, which contains a category column. The positives for symptomatic indications are any pairs here with SYM for the category column. I believe the negatives are anything where category is blank (i.e. excluding the disease modifying DM indications).

The symptomatic and disease modifying indications come from PharmacotherapyDB 1.0. More info on PharmacotherapyDB is available here.
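As a rough illustration of the selection described above, here is a minimal Python sketch (the original analysis is in an R notebook, so this is not the code actually used). It assumes only what the comment states: a `category` column where `SYM` marks symptomatic positives and a blank value marks negatives. The helper name `split_symptomatic`, the other column names, and the example rows are hypothetical and made up for illustration.

```python
import csv
import io


def split_symptomatic(rows):
    """Split rows into symptomatic positives and blank-category negatives.

    Hypothetical helper: assumes each row is a dict with a 'category' key.
    """
    positives, negatives = [], []
    for row in rows:
        category = (row.get("category") or "").strip()
        if category == "SYM":
            positives.append(row)
        elif category == "":
            negatives.append(row)
        # Rows with any other category (e.g. DM) are left out of this set.
    return positives, negatives


# Tiny made-up table standing in for probabilities.tsv.
# Column names other than 'category' are assumptions, and the values are illustrative only.
example_tsv = (
    "compound_id\tdisease_id\tcategory\tprediction\n"
    "C1\tD1\tSYM\t0.9\n"
    "C2\tD1\tDM\t0.8\n"
    "C3\tD2\t\t0.1\n"
    "C4\tD2\t\t0.2\n"
)

rows = list(csv.DictReader(io.StringIO(example_tsv), delimiter="\t"))
positives, negatives = split_symptomatic(rows)
```

To run this against the real file, replace the `io.StringIO(example_tsv)` argument with an open file handle for `probabilities.tsv`, after checking that the column names match.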

dhimmel commented 4 years ago

See https://github.com/dhimmel/learn/issues/9#issuecomment-594785230 for more information on computing the "Symptomatic" indication set of positives and negatives.
