snorkel-team / snorkel-tutorials

A collection of tutorials for Snorkel
https://www.snorkel.org/use-cases/
Apache License 2.0
392 stars, 181 forks

Add Test/Train split; modified code so that model can be applied to new data; show some results; add some links #248

Closed (gitclem closed 3 years ago)

gitclem commented 3 years ago

(Sorry, I'm new to pull requests...)

I made some changes to improve getting_started.ipynb.

My changes are attached. (I had to rename getting_started.ipynb to getting_started.ipynb.txt in order to attach it.)

I tried to create a branch, commit my changes, and open a pull request, but the push was rejected with a permission error (see my steps below).

Changes made:

  1. Added some internal navigation links.
  2. Split CountVectorizer.fit_transform() into separate fit() and transform() steps, so that the vectorizer fitted on the train data can be reused to transform new data.
  3. Added a split of the data into a train set and a test set.
  4. Output some example entries with their spam probabilities from both the train and test sets.
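
To illustrate changes 2 to 4, here is a minimal sketch of the idea, not the notebook's actual code. The toy `texts` and `labels` lists are made up for illustration (the tutorial derives its training labels differently), and the choice of `LogisticRegression` as the classifier is an assumption on my part. Only the scikit-learn calls themselves are real.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Hypothetical toy data standing in for the tutorial's comments (1 = spam, 0 = ham).
texts = [
    "check out my channel",
    "subscribe to my channel please",
    "great song, love it",
    "this video is amazing",
    "free gift cards, click here",
    "love this song so much",
]
labels = [1, 1, 0, 0, 1, 0]

# Split first, so the test texts are never seen while fitting.
train_texts, test_texts, y_train, y_test = train_test_split(
    texts, labels, test_size=0.33, random_state=0
)

# Step 1: learn the vocabulary from the train texts only.
vectorizer = CountVectorizer()
vectorizer.fit(train_texts)

# Step 2: reuse the fitted vectorizer on both splits (and on any later data).
X_train = vectorizer.transform(train_texts)
X_test = vectorizer.transform(test_texts)

# Assumed classifier: any model with predict_proba would work here.
model = LogisticRegression()
model.fit(X_train, y_train)

# Column 1 of predict_proba is the probability of class 1 (spam).
train_probs = model.predict_proba(X_train)[:, 1]
test_probs = model.predict_proba(X_test)[:, 1]

for text, prob in zip(test_texts, test_probs):
    print(f"{prob:.2f}  {text}")
```

Because `transform()` reuses the vocabulary learned in `fit()`, `X_train` and `X_test` have the same number of columns, which is what lets the trained model score text it has not seen before. Words that appear only in the new text are simply ignored.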

getting_started.ipynb.txt

My pull request steps:

git checkout -b improved_getting_started
git remote add upstream https://github.com/snorkel-team/snorkel-tutorials
git checkout -b improved_getting_started
(edited getting_started.ipynb)
git add getting_started.ipynb
git commit -m "improvements to getting_started.ipynb; added code showing how model can be applied to new data; added internal links; show some actual probabilities"
git push -u origin improved_getting_started
remote: Permission to snorkel-team/snorkel-tutorials.git denied to gitclem.
fatal: unable to access 'https://github.com/snorkel-team/snorkel-tutorials.git/': The requested URL returned error: 403

github-actions[bot] commented 3 years ago

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.