GreenInfo-Network / nyc-crash-mapper-etl-script

Extract, Transform, and Load script for fetching new data from the NYC Open Data Portal's vehicle collision data and loading into the NYC Crash Mapper table on CARTO.
3 stars 0 forks source link

Update requirements and README for local running with sendgrid #26

Closed danrademacher closed 4 years ago

danrademacher commented 4 years ago

In the README, we say that given a .env file, we can run main.py locally like this:

Install Python requirements:

pip install -r requirements.txt

Running Locally

Run the script using Python 2.7 by doing:

python main.py

I have done that successfully in the past. But now, it fails on :

  File "main.py", line 15, in <module>
    from sendgrid import SendGridAPIClient
ImportError: No module named sendgrid

That's after a fresh run on pip install. I also tried adding sendgrid to requirements.txt and running again, same error.

Can you update the README, requirements.txt or anything else we need to get local running working again on a fresh install of this repo?

danrademacher commented 4 years ago

Fir example, I see new config vars in Heroku, but those are not reflected in README image

danrademacher commented 4 years ago

wait, I missed one: image

danrademacher commented 4 years ago

Added those to my local .env and unsurprisingly got the same error: image

I imagine we must need to add something to requirements.txt that actually installs sendgrid -- though even better would be a way to ignore sendgrid when one is running manually from CLI, since we do that mostly when the thing is broken and we already know it.

fahadkirmani commented 4 years ago

Yes I have pushed the changes for sendgrid in requirements.txt and README.txt Now it will work fine on you local environment.

fahadkirmani commented 4 years ago

Hi Dan,

I did the change in requirements.txt and README file and pushed the changes into the git. Sorry for not doing those change before as unaware of it. If Christine or you can create an email group then I can forward the error email to that group. I am surprised why error notification did not worked in production environment as I have tested it properly and it worked fine on my end.

I think for now we are facing rate limit issues for now with the script and need to revamp the script. Here are my suggestion for it. 1-Convert the code to python3 2-use panda library for it https://dev.socrata.com/foundry/data.cityofnewyork.us/h9gi-nx95

As it will help us for pagination and other stuff to reconcile data with our system. 3-Do error, success and reports notification using slack channel so we will be up to date on daily basis about ETL process.

Let me know what you think.

Kind regards Fahad

On Sat, Mar 7, 2020 at 3:44 AM Dan Rademacher notifications@github.com wrote:

Added those to my local `/env' and unsurprisingly got the same error: [image: image] https://user-images.githubusercontent.com/1423200/76128305-cc093c80-5fb8-11ea-8e9c-c679d15c2edc.png

I imagine we must need to add something to requirements.txt that actually installs sendgrid -- though even better would be a way to ignore sendgrid when one is running manually from CLI, since we do that mostly when the thing is broken and we already know it.

— You are receiving this because you were assigned. Reply to this email directly, view it on GitHub https://github.com/GreenInfo-Network/nyc-crash-mapper-etl-script/issues/26?email_source=notifications&email_token=AGOBOC6I5TAVTCL2PACHCHDRGF4D7A5CNFSM4LCSMGUKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEODC5IQ#issuecomment-595996322, or unsubscribe https://github.com/notifications/unsubscribe-auth/AGOBOC5LPT5TSVJV6H2G7FDRGF4D7ANCNFSM4LCSMGUA .

danrademacher commented 4 years ago

Hmm. I just pulled the latest and reran pip install but got same results:

Traceback (most recent call last):
  File "main.py", line 15, in <module>
    from sendgrid import SendGridAPIClient
ImportError: No module named sendgrid

Upgrading to Python 3 seems wise, but shouldn't be required to solve this urgent data gap issue. As for Pandas, I don't see how that would help us since pagination on the SODA side is not our issue. Having too much on the CARTO side is. Probably we need to more fully refactor the CARTO importer to use the BATCH API, but that would be under #25

Tomorrow, I'll have @gregallensworth on my team try to run the import script locally to see if he has any better luck.

danrademacher commented 4 years ago

Actually, confirmed I can run this locally again. I forgot to export the new env variables after rerunning the install