Closed michplunkett closed 8 months ago
Updated XGBoost model as well:
python -m incident_scraper build-model
[nltk_data] Downloading package wordnet to
[nltk_data] /Users/michaelp/nltk_data...
[nltk_data] Package wordnet is already up-to-date!
Accuracy Score: 0.707519351271655
Precision Score: 0.8793890449438202
Recall Score: 0.7726361252506556
Extra test:
(ucpd-incident-scraper-py3.11) michaelp@MacBook-Air-18 ucpd-incident-scraper % make three_days
python -m incident_scraper days-back 3
[nltk_data] Downloading package wordnet to
[nltk_data] /Users/michaelp/nltk_data...
[nltk_data] Package wordnet is already up-to-date!
Beginning the UCPD Incident scraping process.
Finished with the UCPD Incident scraping process.
11 total incidents were scraped from the UCPD Incidents' site.
API queries_quota: 60
This incident has an insufficient number of keys: {}
1 of 11 contained malformed or voided information.
0 of 11 could not be processed by the GoogleMaps' Geocoder.
10 of 11 incidents were successfully processed.
Adding 10 of 11 incidents to the GCP Datastore.
Completed adding 10 of 11 incidents to the GCP Datastore.
0 of 1 'Information' incidents predicted into other categories.
1 of 11 incidents could NOT be added to the GCP Datastore.
Program shutting down, attempting to send 2 queued log entries to Cloud Logging...
Waiting up to 5 seconds.
Sent all pending logs.
(ucpd-incident-scraper-py3.11) michaelp@MacBook-Air-18 ucpd-incident-scraper %
Describe your changes
Addressing invalid geocoding results for incident locations containing 'between', 'to', multiple 'and's, etc. This change should be the final one needed to return valid addresses for 99.4% of incidents.
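To illustrate the kind of cleanup described above, here is a minimal sketch, not the code in this PR. The function name `simplify_location` and the specific rules (keep the first cross street after 'between', keep the start of a 'to' range, keep only the first two streets joined by 'and') are assumptions made for illustration; the actual rules in the repository may differ.

```python
import re

# Hypothetical helper: the name and the rules below are illustrative assumptions,
# not the implementation used by incident_scraper.
_AND = r"\s+(?:and|&)\s+"


def simplify_location(raw: str) -> str:
    """Reduce a free-text location to at most two streets joined by 'and'."""
    text = " ".join(raw.split())  # collapse runs of whitespace

    # "A between B and C" -> "A and B" (use the first cross street only)
    between = re.match(r"^(.*?)\s+between\s+(.*)$", text, flags=re.IGNORECASE)
    if between:
        street, cross = between.group(1), between.group(2)
        first_cross = re.split(_AND, cross, flags=re.IGNORECASE)[0]
        return f"{street} and {first_cross}"

    # "A to B" -> "A" (keep the start of the range)
    text = re.split(r"\s+to\s+", text, maxsplit=1, flags=re.IGNORECASE)[0]

    # "A and B and C" -> "A and B" (an intersection needs only two streets)
    parts = re.split(_AND, text, flags=re.IGNORECASE)
    return " and ".join(parts[:2])


if __name__ == "__main__":
    print(simplify_location("S. Ellis Ave. between E. 55th St. and E. 56th St."))
```

A string that contains none of these keywords passes through unchanged, so a helper like this could be applied to every location before it is sent to the geocoder.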
Checklist before requesting a review