bluesky / tiled

API to structured data
https://blueskyproject.io/tiled
BSD 3-Clause "New" or "Revised" License
59 stars 50 forks source link

Revisit Dockerfile #282

Closed danielballan closed 1 year ago

danielballan commented 2 years ago

We cannot simply use the FastAPI image because it lacks Tiled and Tiled’s dependencies, but would it make sense to build on it instead of building the gunicorn stack from scratch?

https://github.com/bluesky/tiled/blob/18f94198eb6f7cf77da636b7152c25e5d47b3ad7/docker/Dockerfile#L1

dylanmcreynolds commented 2 years ago

The official fastapi image is pretty easy to use, is fairly well supported and has an auto-tuning mechanism built in for scaling workers up to number of cores available. I had originally pushed to put this item in the v0.1.0 milestone. I think this is certainly open to some conversation.

danielballan commented 2 years ago

I don't think we have to tie this to any milestone, but it still would be good to look at.

danielballan commented 1 year ago

Nice documentation here on why one should or shouldn't use the FastAPI image: https://github.com/tiangolo/uvicorn-gunicorn-fastapi-docker#-warning-you-probably-dont-need-this-docker-image

In short, if there is cluster scaling happening externally (e.g. through Kubernetes) then it's best to write your own image from scratch. But if you are using docker-compose or other single-server deployments, the image is a good fit.

I conclude from this that for most (all?) current deployments the image is a good fit. In the future we may want to maintain both options:

dylanmcreynolds commented 1 year ago

Some notes from a review of the current dockerfile:

dylanmcreynolds commented 1 year ago

After discussion with @danielballan , we're rethinking this strategy entirely.

gunicorn/uvicorn is an attractive solution because it provides easy horizontal scaling of the uvicornprocesses. However, it adds complication to configuration and to deployment. Logging gets intertwined, and dealing with with other things like prometheus is tricky. It also doesn't play well with orchestration environments like k8s or docker-compose, hiding the scaling within single containers rather than letting the orchestration framework deal with it. For this reason, the official (image)[https://github.com/tiangolo/uvicorn-gunicorn-fastapi-docker#-warning-you-probably-dont-need-this-docker-image] clearly describes why you might not want to use it.

One of the reasons we've been hesitant to rip out gunicorn is that we couldn't think of a better scaling option for simple cases. We would like tiled to be easily deployable and scalalable for simple

What was holding us up with the docker-compose case. But it turns out that docker compose has a solution: you can scale docker compose services wit the --scale flag, and docker DNS load balances the requests. https://pspdfkit.com/blog/2018/how-to-use-docker-compose-to-run-multiple-instances-of-a-service-in-development/#adding-a-load-balancer

With this we'd like to out making the official image just a single uvicorn process, ripping out tons of complexity, and deliver a docker-compose.yml in documentation that shows how.

danielballan commented 1 year ago

This may be useful for the Prometheus aspect: https://stackoverflow.com/a/70804269