danrademacher closed this issue 4 years ago
Yes, I made the changes in main.py and the other scripts for the Dataset Identifier h9gi-nx95 and the token. I also updated the field names in the scripts for collision_id, accident_date, and accident_time. @danrademacher, can you push those changes to the scheduler so they get automated? Let me know if you need any other changes.
Hmm, I just tested these changes on Heroku, and it looks like we are getting data again.
With this query in CARTO:
SELECT count(*) FROM crashes_all_prod where date_val >'2019-11-01T00:00:00Z'
We get 14559. So that's great.
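For anyone who wants to repeat that check outside the CARTO dashboard, here is a minimal sketch that builds the equivalent CARTO SQL API request. The account name `example-user` is a placeholder I made up, not our real account, and the helper name `carto_sql_url` is hypothetical. The `/api/v2/sql` endpoint with a `q` parameter is CARTO's SQL API as I understand it, so treat the URL shape as an assumption to verify.

```python
try:
    from urllib.parse import urlencode  # Python 3
except ImportError:
    from urllib import urlencode  # Python 2, which the Heroku app runs

# Same query as above; table and column names come from this thread.
QUERY = (
    "SELECT count(*) FROM crashes_all_prod "
    "WHERE date_val > '2019-11-01T00:00:00Z'"
)


def carto_sql_url(username, query):
    """Build a CARTO SQL API URL for a read-only query.

    `username` is a placeholder (assumption); substitute the real account.
    """
    base = "https://{}.carto.com/api/v2/sql".format(username)
    # urlencode percent-escapes the query so it is safe to put in a URL.
    return base + "?" + urlencode({"q": query})


url = carto_sql_url("example-user", QUERY)
print(url)
```

Fetching that URL with any HTTP client should return JSON containing the count, assuming the table is publicly readable.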
One note: it turns out I had the private key in our env variable, but it needed to be the public key. I swapped that value in Heroku settings and now the script runs, as evidenced above, though I might push a change to rename the variables to make it clear which key we are using. I'm a bit confused about how Socrata names these things. I pasted in our public token above in a public issue since I figured it was... public and would be verified against our secret, but maybe that is only for write/delete actions or something.
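A rough sketch of what the variable rename could look like, so the old name keeps working during the switch. The name `SOCRATA_APP_TOKEN_PUBLIC` and the helper `get_socrata_app_token` are hypothetical choices of mine; only `SOCRATA_APP_TOKEN_SECRET` is the variable that actually exists in Heroku right now.

```python
import os


def get_socrata_app_token(environ=None):
    """Return the public Socrata app token from the environment.

    Checks a hypothetical clearer name first, then falls back to the
    existing SOCRATA_APP_TOKEN_SECRET variable mentioned in this issue.
    """
    environ = os.environ if environ is None else environ
    for name in ("SOCRATA_APP_TOKEN_PUBLIC", "SOCRATA_APP_TOKEN_SECRET"):
        value = environ.get(name)
        if value:
            return value
    # Fail loudly rather than sending unauthenticated requests by accident.
    raise KeyError("No Socrata app token found in environment")
```

Passing a plain dict as `environ` makes the lookup easy to check locally without touching the Heroku config.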
Swapped out for PUBLIC in https://github.com/GreenInfo-Network/nyc-crash-mapper-etl-script/commit/213aade2c347854d66e849fb13e977f30226e046 and let it run again, looks good:
2019-12-01T22:01:06.659378+00:00 heroku[scheduler.6621]: Starting process with command `python main.py`
2019-12-01T22:01:07.364201+00:00 heroku[scheduler.6621]: State changed from starting to up
2019-12-01T22:01:09.017021+00:00 app[scheduler.6621]: 10:01:09 PM - INFO - Getting data from Socrata SODA API as of 2019-10-01
2019-12-01T22:01:09.064330+00:00 app[scheduler.6621]: /app/.heroku/python/lib/python2.7/site-packages/urllib3/connectionpool.py:847: InsecureRequestWarning: Unverified HTTPS request is being made. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
2019-12-01T22:01:09.064338+00:00 app[scheduler.6621]: InsecureRequestWarning)
2019-12-01T22:01:12.071510+00:00 app[scheduler.6621]: 10:01:12 PM - INFO - Got 32151 SODA entries OK
2019-12-01T22:01:12.071662+00:00 app[scheduler.6621]: 10:01:12 PM - INFO - Getting socrata_id list from CARTO as of 2019-10-01
2019-12-01T22:01:13.056922+00:00 app[scheduler.6621]: 10:01:13 PM - INFO - Got 32149 socrata_id entries for existing CARTO records
2019-12-01T22:01:23.942070+00:00 app[scheduler.6621]: 10:01:23 PM - INFO - Found 2 new rows to insert into CARTO
2019-12-01T22:01:23.942138+00:00 app[scheduler.6621]: 10:01:23 PM - INFO - Creating CARTO SQL insert for 2 new rows
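The log lines above suggest the script compares the IDs fetched from Socrata (32151) against the socrata_id values already in CARTO (32149) and inserts only the difference (2). A rough sketch of that step, assuming each SODA row carries a `collision_id` field (the field name mentioned in this thread). The function name `find_new_rows` is my own and is not necessarily what `main.py` uses.

```python
def find_new_rows(soda_rows, existing_ids, id_field="collision_id"):
    """Return SODA rows whose ID is not already present in CARTO."""
    # Normalize IDs to strings so 123 and "123" compare equal.
    seen = set(str(i) for i in existing_ids)
    new_rows = []
    for row in soda_rows:
        row_id = str(row[id_field])
        if row_id not in seen:
            new_rows.append(row)
            seen.add(row_id)  # also skip duplicates within the SODA batch
    return new_rows


# Toy data (made up) mirroring the run above at a much smaller scale.
soda = [{"collision_id": "1"}, {"collision_id": "2"}, {"collision_id": "3"}]
carto_ids = [1, 2]
new_rows = find_new_rows(soda, carto_ids)
print(len(new_rows))
```

Using a set keeps the membership check fast even with tens of thousands of IDs per run.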
As noted between @fahadkirmani, Christine, and me on Upworks, it sounds like we need to change this line: https://github.com/GreenInfo-Network/nyc-crash-mapper-etl-script/blob/master/main.py#L21 to point at the new data id `h9gi-nx95`. That seems very simple, and either Fahad or GreenInfo has the credentials needed to do it. But we also need to investigate and rewrite our script to pass an App Token. I made one under a GreenInfo Socrata account and added a new ENV variable in Heroku called `SOCRATA_APP_TOKEN_SECRET`, tied to the public app token of `32fNIbFNcIWEhZJfm1q6ypTNA`, but now we need to figure out the details of how we pass that to Socrata. Ideally we could just add it to our query URL, but I assume even then we need to call the secret in our code to validate. I am sure it's not hard, but it takes time and a little sleuthing.
@fahadkirmani can you take the first pass at looking into this? You should have all the credentials you need to log into Heroku, and push code here and there, or at least submit a PR here with a fix.
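As a starting point for that sleuthing, here is a minimal sketch of passing the token. As far as I know, Socrata accepts an app token in an `X-App-Token` request header (or as a `$$app_token` query parameter), and read requests do not need the secret, but please verify against the Socrata docs. The `data.cityofnewyork.us` domain is my assumption about where `h9gi-nx95` lives, and `build_soda_request` is a hypothetical helper. The sketch uses only the standard library so it stands alone; the real script may use a different HTTP client.

```python
import os

try:
    from urllib.request import Request  # Python 3
    from urllib.parse import urlencode
except ImportError:
    from urllib2 import Request  # Python 2, which the Heroku app runs
    from urllib import urlencode

# Assumption: the dataset is hosted on the NYC Open Data domain.
SODA_URL = "https://data.cityofnewyork.us/resource/h9gi-nx95.json"


def build_soda_request(app_token, params=None):
    """Build a SODA GET request carrying the app token in a header."""
    url = SODA_URL
    if params:
        url += "?" + urlencode(params)
    return Request(url, headers={"X-App-Token": app_token})


# Read the token from the ENV variable added in Heroku for this issue.
token = os.environ.get("SOCRATA_APP_TOKEN_SECRET", "")
request = build_soda_request(token, {"$limit": 5})
```

Sending the token in a header rather than in the URL keeps it out of logged query strings, which seems like the safer default here.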