getodk / aggregate

ODK Aggregate is a Java server that stores, analyzes, and presents survey data collected using ODK Collect. Contribute and make the world a better place! ✨🗄✨
https://docs.opendatakit.org/aggregate-intro/
Other
74 stars 227 forks source link

Allow mounting configuration files via Docker Volumes #442

Closed brettneese closed 5 years ago

brettneese commented 5 years ago

I’m working on deploying an Aggregate 2.0 instance to Kubernetes in the cleanest possible way, and documenting my progress in these lab notes. The latest post describes the motivation behind this change in even more detail.

This PR allows optionally passing files via Docker volumes into a /etc/config directory in the Docker image, which gets symlinked via changes to docker-entrypoint.sh to the proper place in Aggregate's config directory. This allows complete control over the files (for instance, to use a different DB engine than pgsql.) Currently this covers:

You can supply any combination of these files, the script will only make symlinks if they exist in the /etc/config directory volume.

This will mostly be useful for Kubernetes, as you can mount the values in k8s ConfigMaps as files in k8s Pods. But you can also pass the files from a local filesystem using bind mounts using docker run command like so:

docker run -d -p 8080:8080 -it \
--mount type=bind,source="$(pwd)"/src/main/resources/odk-settings.xml,target=/etc/config/odk-settings.xml,readonly \
--mount type=bind,source="$(pwd)"/src/main/resources/jdbc.properties,target=/etc/config/jdbc.properties,readonly \
--mount type=bind,source="$(pwd)"/src/main/resources/security.properties,target=/etc/config/security.properties,readonly \
--name=aggregate aggregate:v2.0.1-10-g9c6afd90-dirty

You can still pass environmental variables to override any values set in those passed files, though I'm not sure why you would considering they are only read at startup.

What has been done to verify that this works as intended?

I have run the above command and inspected the output. I have not yet tried to spin up an ODK aggregate instance full-stack using k8s yet; I will update this PR when I can confirm that works and then we can merge it (assuming there are no objections or suggested improvements - I'm totally open to input here.)

Why is this the best possible solution? Were any other approaches considered?

This provides a more flexible solution in addition to the current env-var based solution, allowing complete control over any values in these files. I considered continuing to expand on the env-var based solution already outlined (for instance, making more replacement rules for using MySQL instead of pgsql) but that seemed rather fragile and it creates an even large disconnect between how one usually configures Aggregate using files and how one configures Aggregate in Docker using environmental variables (which ultimately get slipped into the files anyway.)

This makes the setup process super confusing, as it's not clear which values in the files documented in aggregate-config.md and database-configurations.md one can actually control via env vars in Docker, and frustrating if the proper env var mapping in the docker-entrypoint.sh script to change what you want to change hasn't been built yet.

This is much cleaner, particularly when using a Docker orchestration system that allows the injection of values into files in the running container, such as with k8s ConfigMaps. You can just include the entire contents of, for instance, securities.properties in your ConfigMap property.

Are there any risks to merging this code? If so, what are they?

Because I preserve the existing env var replacement code, I don't imagine there are, other than a bit of added docs complexity, which I tried to make as clear as possible.

Does this change require updates to documentation? If so, please file an issue at https://github.com/opendatakit/docs/issues/new and include the link below.

This PR already includes docs updates here, but when we stabilize the Docker config a bit maybe we should add a "Docker" section to the main docs.

brettneese commented 5 years ago

Can confirm this does indeed work in K8S:

image

🎉

brettneese commented 5 years ago

@ggalmazor No worries. It's totally fine if you want to leave these open for a bit - I just believe in getting feedback early and often, and working in small bits, particularly when approaching a project I'm mostly unfamiliar with (and also, this sort of came out of little tweaks I noticed as I came along - I could've done all of this without these tweaks, but the goal was to do it in the cleanest possible way.)

I'm not sure what else is necessary. I will be adding a sample K8S file, when I'm finished with that. I think there's some thing we/I could do in terms of CI - it'd be nice if the Docker image was published on Docker Hub (I noticed there was some talk about this and it never got done, yet. I'd be happy to take the lead there if I get a chance - that was part of the point of the previous PR.) This definitely seems the purview of a different PR.

It might also be nice to wrap some of the more obscure Gradle commands into a Makefile... and also, I'm not sure I like how the Docker Compose/Docker file structure currently... but that's just my OCD speaking.

brettneese commented 5 years ago

Don't merge this yet - running into some weird issues with odk-settings.xml I'm debugging right now.

brettneese commented 5 years ago

Apparently Java really does not like those symlinks and is returning FileNotFound - I'll begrudgingly update them to cps instead - that works on my machine (it's on a different branch or I'd have pushed it already.)

ggalmazor commented 5 years ago

OK, this makes sense.

I think it's safe to merge this changes. When I hava chance, I'll try the whole Docker/Docker Compose/k8s workflow. We can discuss next steps in a new issue.

ggalmazor commented 5 years ago

OK, issue created: https://github.com/opendatakit/aggregate/issues/446