michplunkett / ucpd-incident-scraper

This code scrapes the UCPD Daily Incident page at a predetermined frequency and stores the incidents in a generic JSON data store.
MIT License

Diminish NLTK corpus size #42

Closed michplunkett closed 9 months ago

michplunkett commented 9 months ago

Describe your changes

Addresses the issue below, which breaks the Heroku build.

[nltk_data] Downloading package wordnet to
[nltk_data]     /tmp/build_207dc0f1/.heroku/python/nltk_data...
-----> Discovering process types
       Procfile declares types -> scrape
-----> Compressing...
 !     Compiled slug size: 508.8M is too large (max is 500M).
 !     See: http://devcenter.heroku.com/articles/slug-size
 !     Push failed
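
To see which downloaded data contributes most to a failure like the one in the log above, the on-disk footprint can be measured before pushing. The following is a minimal sketch, not the change made in this PR: the helper names `dir_size_mb` and `largest_subdirs` are hypothetical, and the 500 MB figure is taken from the build log. Heroku measures the compressed slug, so uncompressed directory sizes are only a rough proxy.

```python
import os

# Limit quoted in the Heroku build log ("max is 500M").
SLUG_LIMIT_MB = 500


def dir_size_mb(root):
    """Return the total size of regular files under root, in MiB."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks so linked files are not counted twice.
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total / (1024 * 1024)


def largest_subdirs(root, top=5):
    """List immediate subdirectories of root as (name, MiB), largest first."""
    sizes = []
    for entry in os.scandir(root):
        if entry.is_dir(follow_symlinks=False):
            sizes.append((entry.name, dir_size_mb(entry.path)))
    return sorted(sizes, key=lambda item: item[1], reverse=True)[:top]
```

Pointing `largest_subdirs` at a local NLTK data directory (for example `os.path.expanduser("~/nltk_data")`, an assumed location) would show which corpora take the most space and are candidates for trimming.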

Checklist before requesting a review
