greenelab / snorkeling

Extracting biomedical relationships from literature with Snorkel 🏊

Extract epilepsy-associated genes trained using Hetionet v1.0 relationships #10

Closed danich1 closed 7 years ago

danich1 commented 7 years ago

Closes #7 Closes #2

This pull request contains working Jupyter notebooks that parse the epilepsy abstracts, apply a few labeling functions, and fit a generative model (Naive Bayes) over the labeling function outputs. @dhimmel Let me know what you think. Side note: I intentionally left the data for the notebooks out of the repository because the database is about 282.2 MB.
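For readers unfamiliar with the approach, here is a minimal sketch of the idea, not the code in this PR and not Snorkel's API. It assumes three hypothetical labeling functions that vote +1 (gene is associated with the disease), -1 (not associated), or 0 (abstain) on a sentence, and it combines the votes with a Naive Bayes rule that treats the labeling functions as conditionally independent given the true label. The keyword lists, the example sentences, the `GENE_X` placeholder, and the fixed accuracies are all illustrative assumptions; in practice the accuracies would be learned by the generative model rather than set by hand.

```python
import math

# Hypothetical labeling functions (illustrative only, not the ones in this PR).
# Each returns +1 (association), -1 (no association), or 0 (abstain).

def lf_association_keyword(sentence):
    keywords = ("causes", "associated with", "mutations in")
    return 1 if any(k in sentence.lower() for k in keywords) else 0

def lf_negation(sentence):
    keywords = ("not associated", "no association")
    return -1 if any(k in sentence.lower() for k in keywords) else 0

def lf_known_gene(sentence, known_genes=frozenset({"SCN1A"})):
    # Stand-in for distant supervision from a knowledge base such as Hetionet.
    return 1 if any(g in sentence for g in known_genes) else 0

LABELING_FUNCTIONS = [lf_association_keyword, lf_negation, lf_known_gene]

def label_matrix_row(sentence):
    """Apply every labeling function to one sentence."""
    return [lf(sentence) for lf in LABELING_FUNCTIONS]

def posterior_positive(votes, accuracies, prior=0.5):
    """Naive Bayes estimate of P(label = +1 | votes).

    Each labeling function is assumed to be correct with probability
    accuracies[i] whenever it does not abstain, independently of the others.
    """
    log_pos = math.log(prior)
    log_neg = math.log(1.0 - prior)
    for vote, acc in zip(votes, accuracies):
        if vote == 0:
            continue  # abstentions carry no evidence in this sketch
        log_pos += math.log(acc if vote == 1 else 1.0 - acc)
        log_neg += math.log(acc if vote == -1 else 1.0 - acc)
    return 1.0 / (1.0 + math.exp(log_neg - log_pos))

# Assumed accuracies, one per labeling function (made up for illustration).
ASSUMED_ACCURACIES = [0.7, 0.9, 0.8]

# Made-up example sentences.
supported = "Mutations in SCN1A are associated with epilepsy."
negated = "GENE_X was not associated with epilepsy."

p_supported = posterior_positive(label_matrix_row(supported), ASSUMED_ACCURACIES)
p_negated = posterior_positive(label_matrix_row(negated), ASSUMED_ACCURACIES)
```

In this sketch the first sentence receives two positive votes and scores well above 0.5, while the second receives conflicting votes and is pulled below 0.5 because the negation function is assumed to be the more accurate one.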

dhimmel commented 7 years ago

@danich1 let me know when you'd like me to review.

danich1 commented 7 years ago

After a lot of activity, this PR is back online. I pushed a commit with the recent updates. I know there are more changes to come, but @dhimmel feel free to take an initial look.

dhimmel commented 7 years ago

epilepsy_tags_shelve.dat is ~30MB. If it is important to track, use Git LFS; otherwise remove it. epilepsy_tags_shelve.bak shouldn't be tracked.

http://stackoverflow.com/a/16231228/4651668

dhimmel commented 7 years ago

@danich1 unless there is anything else you'd like to put into this PR, I'll merge it.

dhimmel commented 7 years ago

For reference, model_predictions.tsv and features.tsv are the predictions and feature weights.

danich1 commented 7 years ago

I just added the small report I wrote for the rotation project. Feel free to edit it if anyone has time. Other than that, feel free to merge everything.