greenelab / snorkeling

Extracting biomedical relationships from literature with Snorkel 🏊


Miscellaneous notes from the great snorkeling of 2018 #39

Open dhimmel opened 6 years ago

dhimmel commented 6 years ago

In Palo Alto.

dhimmel commented 6 years ago

Monday

[x] When using labeling functions to suppress mistagged genes, never return positive evidence: return only 0 (abstain) or -1 (negative). source
[x] Make LFS a dictionary mapping each labeling function's name to the function
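
A minimal sketch of the two items above, in plain Python without Snorkel. The `Candidate` class, the token list, and both labeling functions are hypothetical illustrations, not the functions used in this repository. The point is only that suppression functions vote 0 or -1 and that `LFS` maps names to functions.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Candidate:
    """Hypothetical stand-in for a candidate: a sentence and its tagged gene mention."""
    sentence: str
    gene_mention: str


# Hypothetical examples of ordinary words that a tagger may mistake for gene symbols.
AMBIGUOUS_GENE_TOKENS = {"was", "all", "impact", "rest"}


def lf_ambiguous_gene_token(c: Candidate) -> int:
    """Vote -1 when the gene mention is a common English word, otherwise abstain (0)."""
    return -1 if c.gene_mention.lower() in AMBIGUOUS_GENE_TOKENS else 0


def lf_short_lowercase_gene(c: Candidate) -> int:
    """Vote -1 for very short all-lowercase mentions, otherwise abstain (0)."""
    mention = c.gene_mention
    return -1 if len(mention) <= 2 and mention.islower() else 0


# LFS as a dictionary of name to function, so each vote can be traced to its source.
LFS: Dict[str, Callable[[Candidate], int]] = {
    "lf_ambiguous_gene_token": lf_ambiguous_gene_token,
    "lf_short_lowercase_gene": lf_short_lowercase_gene,
}


def apply_lfs(c: Candidate, lfs: Dict[str, Callable[[Candidate], int]] = LFS) -> Dict[str, int]:
    """Return each labeling function's vote for one candidate, keyed by function name."""
    return {name: lf(c) for name, lf in lfs.items()}
```

Keying by name keeps the per-function votes readable when inspecting why a candidate was suppressed, for example `apply_lfs(Candidate("The patient was treated.", "was"))`.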

Issue

The Hetionet labeling function votes 1 far more often than -1: almost every sentence seems to contain a gene and disease whose relationship is in Hetionet, regardless of whether the sentence itself attests to that relationship.
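
One way to quantify this imbalance is to tabulate the fraction of each vote over the candidates. The sketch below assumes a simplified knowledge-base labeling function that looks up the (gene, disease) pair and ignores the sentence text. `KNOWN_PAIRS` and the toy candidates are made up for illustration and are not Hetionet data.

```python
from collections import Counter
from typing import Callable, Dict, Iterable, Tuple

# Hypothetical stand-in for the set of gene-disease pairs found in a knowledge base.
KNOWN_PAIRS = {("BRCA1", "breast cancer"), ("TP53", "lung cancer")}

# A candidate here is (gene, disease, sentence); the sentence is unused on purpose.
CandidateTuple = Tuple[str, str, str]


def lf_in_knowledge_base(candidate: CandidateTuple) -> int:
    """Vote 1 if the pair is in the knowledge base, else -1, regardless of the sentence."""
    gene, disease, _sentence = candidate
    return 1 if (gene, disease) in KNOWN_PAIRS else -1


def vote_distribution(
    lf: Callable[[CandidateTuple], int],
    candidates: Iterable[CandidateTuple],
) -> Dict[int, float]:
    """Return the fraction of -1, 0, and 1 votes that lf casts over the candidates."""
    counts = Counter(lf(c) for c in candidates)
    total = sum(counts.values())
    if total == 0:
        return {-1: 0.0, 0: 0.0, 1: 0.0}
    return {vote: counts.get(vote, 0) / total for vote in (-1, 0, 1)}


toy_candidates = [
    ("BRCA1", "breast cancer", "BRCA1 mutations increase breast cancer risk."),
    ("BRCA1", "breast cancer", "We measured BRCA1 in breast cancer and control samples."),
    ("TP53", "lung cancer", "TP53 and lung cancer were both search terms."),
    ("EGFR", "asthma", "EGFR was not studied in asthma."),
]
print(vote_distribution(lf_in_knowledge_base, toy_candidates))
```

In the toy data, the second and third sentences get a positive vote even though they do not assert an association, which is the behavior described above.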

dhimmel commented 6 years ago

Tuesday

[x] Scale up to 50k labeled sentences
[ ] Consider labeling dev set
[ ] Determine how we want label probabilities to be scaled

dhimmel commented 6 years ago

Human calls for 100 development sentences

I've labeled 100 development sentences by hand, which will be useful for assessing our generative model (consensus/training labels). They are good examples for seeing why this is a very hard problem. See sentence-labels-dev.xlsx. CC @danich1
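
A rough sketch of how such human calls could be compared against generative model output, assuming the model yields one probability per sentence and the human label is a boolean. The function name, the 0.5 threshold, and the toy numbers are assumptions for illustration, not results from the spreadsheet.

```python
from typing import Dict, Sequence


def evaluate_against_human(
    marginals: Sequence[float],
    human_labels: Sequence[bool],
    threshold: float = 0.5,
) -> Dict[str, float]:
    """Compare thresholded model probabilities with human labels."""
    if len(marginals) != len(human_labels):
        raise ValueError("marginals and human_labels must have the same length")
    tp = fp = fn = tn = 0
    for prob, label in zip(marginals, human_labels):
        predicted = prob > threshold
        if predicted and label:
            tp += 1
        elif predicted and not label:
            fp += 1
        elif not predicted and label:
            fn += 1
        else:
            tn += 1
    total = tp + fp + fn + tn
    return {
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
        "accuracy": (tp + tn) / total if total else 0.0,
    }


# Toy values only; real inputs would be model marginals and the human calls.
toy_marginals = [0.9, 0.8, 0.3, 0.6]
toy_human = [True, False, False, True]
print(evaluate_against_human(toy_marginals, toy_human))
```

Precision and recall are reported separately because positive sentences are likely to be a minority, in which case accuracy alone can look good while the model misses most true relationships.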