ResonantGeoData / RD-OpenGeo


Production Worker Notes #5

Closed: banesullivan closed this issue 3 years ago

banesullivan commented 3 years ago

Some notes about the status of the production worker from a meeting I had with @brianhelba last week.

Related PRs: ResonantGeoData/ResonantGeoData#231, ResonantGeoData/ResonantGeoData#230, ResonantGeoData/ResonantGeoData#122, ResonantGeoData/RD-OpenGeo#4

Current Issue with the production worker: it is running out of memory because our dependencies are too large and consume most of the worker's memory budget. As soon as we kick off a task to do anything, we either come dangerously close to the limit or exceed it and get shut down by Heroku.

Long Term Solution: Perform an audit on our dependency graph, cut out anything not directly used, and replace dependencies that require non-wheel (apt-installed) system libraries with lighter variants (e.g. opencv's headless package instead of its default). Aside from GDAL, which is relatively well handled, the big dependencies come from kwimage/kwcoco's graph, so most of the audit will likely need to happen in those projects. If we want to spin up bigger workers, that is easy: we just have to edit the Heroku worker parameters in terraform/main.tf. Brian and I discussed using the standard-2x option for the worker, as it seems reasonably priced (see ResonantGeoData/ResonantGeoData#122).
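For reference, bumping the worker size in Terraform would look roughly like the sketch below. This is not the actual contents of terraform/main.tf: the resource name, the heroku_app.rgd reference, and the exact argument names are assumptions and vary with the Heroku Terraform provider version we pin.

    # Sketch only: older provider versions use `app`, newer ones use `app_id`.
    resource "heroku_formation" "celery_worker" {
      app      = heroku_app.rgd.name  # hypothetical reference to the app resource
      type     = "worker"
      quantity = 1                    # set to 0 to turn the Heroku worker off
      size     = "standard-2x"        # the dyno size we discussed
    }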

Short Term Solution: Turn off the worker on Heroku (#231) and set up a local worker on one of our machines. This is quite easy in practice and gives us a powerful worker for demos or for running expensive operations like populating the DB with example data (#221), while sidestepping issues like running Docker images within the worker (#159).


Connecting to Production from a Local Machine

  1. Export the necessary environment variables in shell format: heroku config -a resonantgeodata -s.
  2. Place those variables in the dev/.env.docker-compose file (see the snippet after this list).
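For example, both steps can be done in one command; this is a sketch, assuming the file lives at dev/.env.docker-compose relative to the repo root as noted above:

    # Append the production config vars, in shell format, to the compose env file
    heroku config -a resonantgeodata -s >> ./dev/.env.docker-compose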

Running a Worker Locally

This connects to the production database/instance, so tasks triggered there will run on the local machine. Make sure the number of Heroku workers is set to 0 before doing this (see the combined commands below).

  1. Run docker-compose: docker-compose up celery.
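Putting the two steps together (a sketch, assuming the Heroku dyno type is named worker and the compose service is celery, as above):

    # Make sure Heroku is not also running a worker
    heroku ps:scale worker=0 -a resonantgeodata

    # Start the Celery worker locally against the production environment
    docker-compose up celery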

Populating DB with example data

Run the demo_data command from ResonantGeoData/ResonantGeoData#221 to populate the production database with example data. This can take a while and should be run from a local machine; it saves us from needing to spin up a worker on Heroku ($$$).

  1. Run docker-compose run --rm django ./manage.py demo_data

Resetting the Production Database

django-extensions has a reset_db command: https://django-extensions.readthedocs.io/en/latest/reset_db.html

  1. Run: heroku run -a resonantgeodata ./manage.py reset_db
  2. Manually re-deploy the app to have it run the migrations: from the Heroku web interface, under "Deploy" -> "Manual deploy".

Be sure to pass the database user and password as arguments (reset_db accepts --user and --password options); see the sketch below.
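A hedged example of the full command; the flag names come from the django-extensions reset_db docs, and $DB_USER / $DB_PASSWORD are placeholders for the credentials from heroku config:

    heroku run -a resonantgeodata ./manage.py reset_db --noinput --user $DB_USER --password $DB_PASSWORD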

Accessing Production Logs

Go to the "Resources" tab on the Heroku portal. See the "Papertrail" resource for all of the logs.

https://my.papertrailapp.com/systems/resonantgeodata/events

brianhelba commented 3 years ago

It's looking like resetting the production database via ./manage.py reset_db may not be possible, given the intrinsic access limitations that Heroku Postgres imposes on us. I'll continue to investigate, as this is something that nearly all projects will face.
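One alternative worth investigating (not discussed in this thread, so treat it as a suggestion rather than the agreed approach) is Heroku's own database reset command, which drops and recreates the database at the Heroku level rather than through Django:

    # Destructive: wipes the production database attached as DATABASE_URL
    heroku pg:reset DATABASE_URL -a resonantgeodata --confirm resonantgeodata

Afterwards, re-deploy (or run ./manage.py migrate) to recreate the schema, as described above.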

banesullivan commented 3 years ago

Closing this issue because it is not actionable, but it is great reference material.