TobikoData / sqlmesh

Efficient data transformation and modeling framework that is backwards compatible with dbt.
https://sqlmesh.com
Apache License 2.0

Docker example #1218

Open gunar opened 1 year ago

gunar commented 1 year ago

I, and probably others, would greatly appreciate documentation and an example Dockerfile for running sqlmesh on container-based cloud hosts such as Google Cloud Run. It would be even more convenient if you could maintain a public Docker Hub image that we can reference directly.

eakmanrq commented 1 year ago

Yeah, I agree there is value in offering this. We will look into it in the coming weeks.

jmarch commented 4 months ago

@eakmanrq / @gunar, I created a branch that provides a Dockerfile, plus basic usage documentation in the README.

See here: https://github.com/jmarch/sqlmesh/tree/add-docker-support

My branch also has a base docker-compose.yml file that can be customized by the end user (or enhanced in the branch) to include additional databases for testing. I only put postgres and mssql example databases in the config to show how this is done.

The additional database services are assigned Compose profiles, so they are optional when running docker-compose up. By default, Compose creates a local sqlmesh Docker network containing only the base sqlmesh container; you can then start whichever databases you want to test against.
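
For anyone who hasn't used Compose profiles, here is a minimal sketch of that layout. It is not the file from my branch: the service names, image tags, and placeholder passwords are assumptions for illustration only.

```yaml
# docker-compose.yml (illustrative sketch, not the file from the branch)
services:
  # No profile assigned, so this service always starts with `docker-compose up`.
  sqlmesh:
    build: .                 # assumes a Dockerfile next to this file
    stdin_open: true         # keep the container attachable for CLI use
    tty: true
    networks: [sqlmesh]

  # Optional: only started when the "postgres" profile is enabled.
  postgres:
    image: postgres:16       # assumed tag
    profiles: ["postgres"]
    environment:
      POSTGRES_PASSWORD: example            # placeholder, change before use
    networks: [sqlmesh]

  # Optional: only started when the "mssql" profile is enabled.
  mssql:
    image: mcr.microsoft.com/mssql/server:2022-latest   # assumed tag
    profiles: ["mssql"]
    environment:
      ACCEPT_EULA: "Y"
      MSSQL_SA_PASSWORD: "Example_Passw0rd!"  # placeholder, change before use
    networks: [sqlmesh]

networks:
  sqlmesh: {}
```

With a layout like this, `docker-compose up` starts only the sqlmesh service, while `docker-compose --profile postgres up` also starts Postgres on the same network.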

After I created these assets, I found the existing Dockerfile and docker-compose.yml files within the web directory of the main TobikoData/sqlmesh project. Note: there's no documentation around these at all. I found them after combing through the Makefile.

These existing web docker files seem primarily for testing the web UI. They leave a lot to be desired.

For example, the base image has no editor installed (no vim or nano, at least), so testing out the sqlmesh CLI is just about impossible (unless you want to echo text and redirect it into your config file). Furthermore, it has no sudo, so the end user can't install missing packages. It's also missing some of the sqlmesh extras (like sqlmesh[mssql]). So even if the end user does all their editing in the web UI, they are out of luck if they stray from what the base image includes: without sudo, they can't resolve any issues by attaching to their container.

My base image solves these problems, but enabling sudo access by default may make it less secure. That's something to consider if this were adopted.
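
To make the trade-off concrete, here is a rough sketch of an image that covers those three gaps (an editor, sudo, and extra dialects). It is not the Dockerfile from my branch. The base image, user name, and choice of extras are assumptions, and it uses the Compose `dockerfile_inline` build option, which assumes a reasonably recent Docker Compose v2.

```yaml
# Illustrative sketch only, not the Dockerfile from the branch.
services:
  sqlmesh:
    build:
      context: .
      dockerfile_inline: |
        # Assumed base image
        FROM python:3.11-slim

        # Editors and sudo, so the CLI is usable from inside the container
        RUN apt-get update \
         && apt-get install -y --no-install-recommends sudo vim nano \
         && rm -rf /var/lib/apt/lists/*

        # Hypothetical non-root user with passwordless sudo.
        # Convenient for debugging, but this is the security trade-off
        # mentioned above.
        RUN useradd --create-home sqlmesh \
         && echo "sqlmesh ALL=(ALL) NOPASSWD:ALL" > /etc/sudoers.d/sqlmesh \
         && chmod 0440 /etc/sudoers.d/sqlmesh

        # Only the extras needed here; installing every dialect is what
        # makes the image large.
        RUN pip install --no-cache-dir "sqlmesh[mssql,postgres,web]"

        USER sqlmesh
        WORKDIR /home/sqlmesh
    stdin_open: true
    tty: true
```

A more locked-down variant would drop the sudoers line and the sudo package, at the cost of needing an image rebuild to add anything that is missing.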

Since I install all the dialects, the image balloons to roughly 1.74GB, I believe. I tested all this last week but have since moved on to other things, so forgive me if that's a bit off.

Anyway... no pull request yet, as I was told none of this work was needed.