CityOfLosAngeles / aqueduct

A shared pipeline for building ETLs and batch jobs that we run at the City of LA for Data Science Projects. Built on Apache Airflow & Civis Platform
Apache License 2.0

Fix initial path for voila service #338

Open ian-r-rose opened 4 years ago

ian-r-rose commented 4 years ago

Right now, setting the notebook path as the initial path for a voila service doesn't work well: requests time out and the service continually restarts. Instead, you have to start at the root path and navigate to the notebook.

I'm not sure why it works if you navigate there but not if you try to load it initially. Perhaps it is failing the Civis health check and restarting? Is there some way to make the check more tolerant of slow start-up times?

ian-r-rose commented 4 years ago

EOD notes:

I think the issue here is that the Civis proxy is doing aggressive health checks on the service (every few seconds). If the default URL is the voila render endpoint, each health check spins up a new kernel (expensive for a workhorse analysis!), and the kernels pile up until the pod crashes. At least, this is my going hypothesis.

Two options:

  1. Figure out how to point the status check at a different endpoint. There are some docs that describe this, but it is not clear that they are up to date.
  2. Be smarter about constructing the share URL, using the civis_service_token parameter rather than the civis-platform-auth endpoint I had been using. I think this is probably the easier way forward. If share URLs are constructed infrequently, then it is not too onerous to also pass a notebook render path into it.
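
To make the hypothesis and option 1 concrete, here is a toy simulation (not the real service, and not the Civis or voila API). `FakeVoilaService`, the `/healthz` path, and `probe` are all hypothetical names; the only thing it assumes is that every request to a render path starts a kernel, while a dedicated status endpoint would not.

```python
class FakeVoilaService:
    """Toy stand-in for the service; counts kernels instead of starting them."""

    def __init__(self, health_path="/healthz"):
        # Hypothetical lightweight endpoint for the status check.
        self.health_path = health_path
        self.kernels_started = 0

    def handle(self, path):
        """Return an HTTP-like status code for a request to `path`."""
        if path == self.health_path:
            return 200  # cheap: no kernel involved
        if path.startswith("/voila/render/"):
            # Assumption: every render request launches a fresh kernel.
            self.kernels_started += 1
            return 200
        return 404


def probe(service, path, times):
    """Simulate `times` health checks against `path`; return kernels started."""
    for _ in range(times):
        service.handle(path)
    return service.kernels_started
```

With the status check aimed at the render path, 20 probes leave 20 kernels behind; aimed at the separate endpoint, they leave none. That matches the crash pattern, if the hypothesis is right.
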

ian-r-rose commented 4 years ago

Another update: Using the civis-platform-auth endpoint sets a cookie in the client which allows subsequent requests (e.g., websockets or static files) to occur without error. Using the civis_service_token approach doesn't have the same feature, unfortunately.

This leaves either (1) pointing the status check at a different endpoint, or (2) adding another URL parameter to the auth endpoint so that it redirects to a subpath after authentication.

I prefer (2)
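
A rough sketch of what (2) might look like, using only the standard library. Nothing here exists yet: the `redirect` parameter name is hypothetical, the auth endpoint is assumed to live at the path `/civis-platform-auth`, and the notebook is assumed to be served under `/voila/render/`. The second function is the check I'd want on the server side so the new parameter can't be used to bounce people to another site.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit

AUTH_PATH = "/civis-platform-auth"  # assumed location of the auth endpoint
REDIRECT_PARAM = "redirect"         # hypothetical; this parameter does not exist yet


def build_share_url(service_url, notebook_path):
    """Build an auth URL that asks to land on a notebook after authentication."""
    subpath = "/voila/render/" + notebook_path.lstrip("/")
    parts = urlsplit(service_url)
    query = urlencode({REDIRECT_PARAM: subpath})
    return urlunsplit((parts.scheme, parts.netloc, AUTH_PATH, query, ""))


def safe_redirect_target(subpath):
    """Accept only path-only targets on the same origin; fall back to the root."""
    parts = urlsplit(subpath)
    if parts.scheme or parts.netloc or not subpath.startswith("/"):
        return "/"
    return subpath
```

The auth endpoint would still set its cookie as it does today, then redirect to `safe_redirect_target(...)` instead of the root, so the health check can keep hitting a cheap default path.
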