GreenInfo-Network / nyc-crash-mapper-etl-script

Extract, Transform, and Load script for fetching new data from the NYC Open Data Portal's vehicle collision data and loading into the NYC Crash Mapper table on CARTO.
3 stars 0 forks source link

Puzzling through getting a static IP for the ETL script process on heroku #42

Closed danrademacher closed 11 months ago

danrademacher commented 11 months ago

Christine asked me a couple of weeks ago about getting a static IP assigned to the ETL script process on Heroku, since they have changed the Walkmapper database and need to safelist specific IPs for database access.

This is for the scripts here: https://github.com/GreenInfo-Network/nyc-crash-mapper-etl-script/tree/master/walkmapper

Not surprisingly, Heroku does not provide static IPs. Christine found a Heroku addon that does so, accessbile form the app on Heroku here: image

I updated the ENV vars on that app to be the correct ones for the new database, and also updated the record in 1P

Then I asked the client to also safelist our office IP address. Loading the ENV vars locally and running python walkmapper/mysql2carto.py works as expected. I can access the DB and data is updated in CARTO.

However, the QuotaGuard proxy is not configured properly.

When I run from Heroku, I get this:

heroku run bin/qgtunnel python walkmapper/mysql2carto.py
Running bin/qgtunnel python walkmapper/mysql2carto.py on ⬢ nyc-crash-mapper-etl... up, run.5677 (Basic)
Traceback (most recent call last):
  File "/app/walkmapper/mysql2carto.py", line 404, in <module>
    ObstructionMyqlToCartoLoader().run()
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/walkmapper/mysql2carto.py", line 64, in __init__
    self.db = MySQLdb.connect(host=DB_HOST, port=int(DB_PORT), user=DB_USER, password=DB_PASS, database=DB_NAME, cursorclass=MySQLdb.cursors.DictCursor)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.heroku/python/lib/python3.11/site-packages/MySQLdb/__init__.py", line 123, in Connect
    return Connection(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.heroku/python/lib/python3.11/site-packages/MySQLdb/connections.py", line 185, in __init__
    super().__init__(*args, **kwargs2)
MySQLdb.OperationalError: (1130, "Host 'ec2-34-207-80-161.compute-1.amazonaws.com' is not allowed to connect to this MySQL server")

Seems clear the IP is just not configured.

I tried this test they recommend and it seems to return an expected result:

heroku run 'curl -x http://94hgudyixsxiqy:9ftkojqjj578892bno9dxo2icxltj@us-east-static-05.quotaguard.com:9293 -L ip.quotaguard.com'
Running curl -x http://94hgudyixsxiqy:9ftkojqjj578892bno9dxo2icxltj@us-east-static-05.quotaguard.com:9293 -L ip.quotaguard.com on ⬢ nyc-crash-mapper-etl... up, run.4035 (Basic)
{"ip":"54.88.136.216"}%

So I filed this suport request:

I need some help setting up QuotaGuard for a simple app on Heroku that runs Pythons scripts to harvest data from a few client databases.

I can access those databases locally with my own IP that is safelisted, but I can't get the Heroku app to access the databses, I believe because I have not properly configured QuotaGuard.

I tried following the instructions here, https://www.quotaguard.com/docs/language-platform/heroku/heroku-shield/, but I am not able to find the value for QUOTAGUARDSHIELD_URL in my Heroku config.

When I attempt to connect to the remote datbase, I get "MySQLdb.OperationalError: (1130, "Host 'ec2-18-212-22-136.compute-1.amazonaws.com' is not allowed to connect to this MySQL server")" -- which suggests to me the IP connection to QuotaGuard is not actually happening.

┆Issue is synchronized with this Asana task

danrademacher commented 11 months ago

I filed a support ticket with the qgtunnel folks and we figured it out.

the qgtunnel file ()in this repo) has to look like this:

[qgtunnel.Walkmapper_MySQL]
accept = "127.0.0.1:3306"
connect = "142.11.243.132:3306"
encrypted = false
transparent = false

And the big thing was that the settings for Heroku for WALKOBSTRUCTION_MYSQL_HOST have to be 127.0.0.1 because the configu above essentially tunnels from there to 142.11.243.132 via one of the two static IPs they provide as a service.

Makes sense.