pretalx / pretalx-docker

Docker setup for a complete pretalx installation. Community-sourced, not officially supported.

Feat: rework Container image, Compose manifest and CI #64

Open almereyda opened 6 months ago

almereyda commented 6 months ago

This change request presents a major opinionated rewrite of this repository.

closes #21 closes #56 closes #59 ? closes #62

almereyda commented 6 months ago

This PR presents the result of 16 hours of work to reshape the repository into a more "production-ready" form. This is achieved by dropping support for local source builds.

Additionally, following the 12factor.net approach to designing web applications, it is useful to consider the parts of the application self-contained and independent from external systems. This is why relying on a cron scheduler may defeat the purpose of separating concerns in containers.

A common convention for running periodic jobs in distributed systems is to use a worker queue triggered by timers. Fortunately, Celery is already used for dispatching asynchronous tasks. Its Celery Beat feature can also schedule recurring jobs, removing the need for a separate cron container.
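
As a rough, hypothetical sketch of what that could look like in a Compose manifest — the image reference and the Celery application path (`pretalx.celery_app`) are assumptions here, not taken from the actual pretalx images:

```yaml
# Hypothetical sketch: image reference and Celery app path are assumed,
# not taken from the actual pretalx images.
services:
  worker:
    image: pretalx/pretalx:latest
    command: celery -A pretalx.celery_app worker -l info

  beat:
    image: pretalx/pretalx:latest
    # Beat reads the periodic task schedule and enqueues jobs for the worker,
    # taking over what a separate cron container would otherwise do.
    command: celery -A pretalx.celery_app beat -l info
```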

I'm especially invested in this, since the cron implementation took up a quarter of the overall time and is prone to fail in a container.

What could possibly go wrong?

- [Running Cron in Docker](https://blog.thesparktree.com/cron-in-docker)
- [How to Run a Cron Job Inside a Docker Container? | Baeldung on Ops](https://www.baeldung.com/ops/docker-cron-job)
- [Definitive guide on how to setup up and running cron jobs in docker containers – Yannick Pereira-Reis](https://ypereirareis.github.io/blog/2020/04/09/running-cron-jobs-docker-container-definitive-guide/)

To complete the move towards "cloud-native", distributed application design, it is also advisable to implement the database URL pattern, which was introduced by Heroku and is convenient and familiar to DevOps people.
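
To illustrate the pattern (whether pretalx reads a single DATABASE_URL variable today is an assumption here, not a given), a minimal sketch:

```yaml
# Sketch of the Heroku-style database URL pattern; treat the variable name
# and image reference as illustrative.
services:
  app:
    image: pretalx/pretalx:latest
    environment:
      # one variable carries engine, credentials, host, port and database name
      DATABASE_URL: postgres://pretalx:${POSTGRES_PASSWORD}@db:5432/pretalx
```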

Further on, now that the application setup is more decoupled and the images are more lightweight and deterministic, thanks to switching to pinned release versions of pretalx, it becomes possible to imagine further follow-ups.

I'm thinking of the way these two repositories react to their upstream, adapted from Node to Python:

In case you agree with the follow-ups as outlined above, I'd go ahead and open respective issues.

almereyda commented 6 months ago

More inspiration could be taken from allmende/docker-fiduswriter.

DASPRiD commented 6 months ago

This looks generally good to me.

While in a Kubernetes environment I'm used to having CronJobs run periodic tasks in containers, I don't see too much harm in having crontab run within the application container. That is, unless someone wants to create replicas of the application container and run the worker tasks isolated on their own, but I doubt that this would be possible with Pretalx, so this is likely a non-issue.

For static file serving I think that nginx as a separate container (or any other simple file server like halverneus/static-file-server) is a good choice, instead of the main container doing the serving.
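
A minimal Compose sketch of that split, assuming the static assets are collected into a shared volume (the path inside the application image is an assumption):

```yaml
# Sketch only: volume name and static files path inside the image are assumptions.
services:
  app:
    image: pretalx/pretalx:latest
    volumes:
      - static:/pretalx/static.dist   # assumed location of collected static files

  static:
    image: nginx:alpine
    volumes:
      - static:/usr/share/nginx/html/static:ro   # serve the same files read-only
    ports:
      - "8080:80"

volumes:
  static:
```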

👍 Overall, I'm happy with these changes.

Could we maybe consider having a GitHub workflow which automates new releases to DockerHub (or GitHub Packages)? This would avoid having bug-fixes merged into main but then not released for several months.

almereyda commented 6 months ago

Today has seen two additions to this branch:

The workflows build and push happily into the GitHub Container registry (good for retrieving images in CI, e.g. for the extended plugins build) and into Docker Hub.

Incidentally, allmende and pretalx each own their namespace in both registries under the same name. This way the GitHub organisation is used for the username part of the image tags.

The workflows would work right away with the existing login secrets in this repository.

This also introduces an integration branch called next, where builds can be tested before being pushed to the stable main branch.

Test them by approving the pull request workflows attached to this pull request.

We could then look into further parametrising the workflows, especially with regard to build arguments and the image tags. They accept workflow_dispatch triggers.

Instead of plain tags, a release trigger is now used, which implies tags but extends them with a description. Releases are easy enough to create through the GitHub UI, but can also be automated. Release triggers also apply the latest tag.
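
A hedged sketch of what such triggers and tagging could look like in a GitHub Actions workflow; the input, image and tag names are illustrative, not the exact ones used in this repository:

```yaml
# Illustrative sketch; names of inputs, images and tags are assumptions.
on:
  release:
    types: [published]
  workflow_dispatch:
    inputs:
      push_to_dockerhub:
        description: Also push the resulting image to Docker Hub
        type: boolean
        default: false

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: docker/metadata-action@v5
        id: meta
        with:
          images: ghcr.io/${{ github.repository }}
          # releases additionally receive the `latest` tag
          tags: |
            type=semver,pattern={{version}}
            type=raw,value=latest,enable=${{ github.event_name == 'release' }}
```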

almereyda commented 6 months ago

Producing the legacy environment and the CI updates took another 6 hours each. In total this PR sums up to 28 hours of work on the repository.

With a build and execution environment for distribution-ready OCI containers of pretalx now generally available, what remained was point two from above:

(2) create an exemplary distribution image with a blend of plugins preloaded.

After it had been shown to work locally, it remained to validate the changed manifests in the automated CI environment. As it turns out, this comes with a lot of syntactic overhead in the declarative language of the given CI system. It was nevertheless useful to implement, as it prepares easy switching between different build sources for Alpine or Debian and different build targets, such as linux/amd64 and eventually linux/arm64 again. It also allowed separating the triggers for regular and extended builds.

Building the regular image and the extended image in the same pipeline might be possible by using multi-stage builds and pushing individual stages. This in turn may help reduce the overhead produced by supporting parametric builds.
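
For instance, with docker/build-push-action one can point two pushes at different stages of the same Dockerfile; the stage names and image references below are illustrative:

```yaml
# Illustrative sketch; stage names and image references are assumptions.
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: docker/build-push-action@v5
        with:
          context: .
          target: default      # build and push only the regular stage
          tags: ghcr.io/pretalx/pretalx:latest
          push: true
      - uses: docker/build-push-action@v5
        with:
          context: .
          target: extended     # reuse the cached regular stage, add plugins on top
          tags: ghcr.io/pretalx/pretalx-extended:latest
          push: true
```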

Another area of concern is the image size. ghcr.io/allmende/pretalx-extended:latest currently weighs in at 1.5 GB. The suggestion to move towards Alpine has been acknowledged by introducing a distribution-specific suffix to the Dockerfiles.

The CI workflow now always pushes to both registries when Docker credentials are available. Since a single registry was preferred before, it remains open to discuss when to push to Docker Hub and when to GHCR. This behaviour can be toggled with a boolean input. Eventually one would want to push to all registries only for releases, while all intermediary, rolling builds go to GHCR only. Other options are possible.

As the linux/arm64 build is fifteen times slower, we may also choose to build it only for releases.

almereyda commented 6 months ago

There is another well-documented Django Docker setup in:

Looking at https://github.com/kaleidos-ventures/taiga/blob/53733d6f92f724962d457b00f9bd1442968dc0b5/docker/images/Dockerfile.backend#L65 it appears worth considering to move the rebuild step from the entrypoint.sh into the image, since the assets will not update at runtime.

almereyda commented 6 months ago

Sorry for cd06a8070ac7822dd056ecbafe7df3b6b3696d23 and 68537d46e9b9d94d9c1983e182fc7ed237cb7d06, plus the additional pull requests from the allmende namespace mentioned in the merge commits:

They happened when working in multiple environments for the original Compose, the legacy Compose and the CI adaptations, which always get a bit "verbose" in commits, since their side effects are only produced, and can only be tested, by pushing and inspecting the response of the CI system.

rixx commented 6 months ago

@almereyda Thank you for all this work and time invested in this repository! Given your interest and time investment, I have to ask – would you be interested in becoming a maintainer for this repository?

almereyda commented 6 months ago

Hi @rixx, thanks for the kind question. Actually, this was a one-shot effort, but given that others like @DASPRiD would like to share the effort, I can imagine occasionally reviewing and merging PRs. The current maintainers are the people listed as organisation members, am I correct?

For now I see a few obstacles on the path ahead for the breaking changes in the rework as presented:

Then there are technical questions, which would be good to settle as the pretalx-docker community:

Each of the multiple layers involved incurs opinionated choices on how to do things and which use cases to support.

If the question of becoming a maintainer means being in a position to make these choices for the community without enquiry, then I am not the one we are looking for. If we can keep a small overhead of conversation about the whats, hows and whys, please count me in.

rixx commented 6 months ago

> The current maintainers are the people listed as organisation members, am I correct?

No, this repository is "community maintained", which means that I chuck the permissions at everybody who wants them after contributing to the repository, as I am neither interested nor qualified to maintain anything Docker related. (The other two members of the pretalx org are largely there as bus factor alleviation, fwiw, and aren't involved in the day-to-day development of pretalx.) The current maintainers are @Lukas2112 and @shoetten.

> Will pretalx/pretalx upstream accept a PR that implements a development container together with a Compose manifest for the auxiliary containers for Postgres and Redis?

Definitely not, no. I can imagine no argument in favour – as I don't know Docker, don't use Docker, and have no interest in it either, this development container would become broken/outdated almost immediately, and would increase the maintenance burden for no real gain. All Docker related things are meant to be contained in this repository.

> Which kind of release and build procedure is wanted? Do we automate all the things as much as possible, including daily jobs to check for new versions upstream and automatically generating new release artifacts, if applicable? How much room for manual intervention do we leave?

I'm very much in favour of as much automation as possible. Currently, when I release a new pretalx version, I update the submodule, tag the commit, and push to kick off the build, which is fine (I have to do a bunch of manual stuff on release anyway, like releasing a blog post, so this doesn't hurt). Definitely don't want anything more manual than this – I'd be fine with an even more automated process that just checks for new pretalx releases.

> If the question of becoming a maintainer means being in a position to make these choices for the community without enquiry, then I am not the one we are looking for. If we can keep a small overhead of conversation about the whats, hows and whys, please count me in.

Completely understandable – and like I said, I appreciate all the work you've done, and the maintainership offer was just that: an offer of write access to this repository. Let me know if you want it, and no hard feelings if not.

The problem is kinda that somebody has to make those decisions, and I'm not in a position to do so, either – I will say that I'm generally in favour of few breaking changes, but on the other hand, we do advertise that this repository is community maintained and not officially recommended, so there is a bit of leeway. I think at minimum, we should have an upgrade/migration guide in the README that explains what has changed, and why, and what people ought to do. Everything else (I know I skipped several of your questions), I'd leave up to you and everybody who uses this repo or is invested in it.

almereyda commented 6 months ago

Okay, I will write up a migration guide, which seems to be the only obstacle to introducing these breaking changes, take the privilege, and we'll take it from there.

I will separate the question of developing from immutable artifacts in development containers into a pretalx discussion thread later down the road. There is something to gain from the immutable, container-native approach which I would like to outline in a bit more detail.

With write access to the repository, I would then attempt to further automate the build and release procedure. Renaming the repository to pretalx-oci would maybe also help to not discriminate against Podman and other container schedulers (Nomad, Kubernetes, Swarm) and allow for a single place to collect artifacts for many possible applications of the OCI images.

almereyda commented 6 months ago

Of course https://github.com/pretalx/pretalx/commit/a7a8f2244fb7cc04a49e8524451164f65d158c33, raised from pretalx/pretalx#1773, now creates the need to produce and run a source build from main, which should be added to the chores. The current setup only expects to install pretalx from versioned releases. Are there maybe intermediary wheels from CI testing that one could repurpose?

Continuing support for the standalone image in versioned and source-build variants also remains within reach.

If there are other wishes for this reorganisation, please add them here and I will try to include them within the next cycle of activity until say end of the month.

almereyda commented 6 months ago

The last two days have brought new image variants, local build scripts, a build pipeline and the proposed migration hints to the README.

[Screenshot from 2024-05-23 03-33-24]

For now, the pipeline can be triggered manually and by releases.

The other push and PR triggers have been removed, until this has been settled a little.

The image variants reintroduce the standalone image for easing the migration in existing setups.

The extended and cron variants of the "stock" image have also been applied to the standalone one for the sake of completeness. This means people can potentially remove the external dependency on cron from their existing containerised application setup.

There are also new example contexts to build the application from source, of which one mimics the current state of affairs, namely context/source/standalone/Dockerfile.debian.local, which is included in the overlay compose/build/source/standalone.local.yml.

The one interesting for me to be able to run from main, including the fix for https://github.com/pretalx/pretalx/issues/1773, is compose/build/source/extended.cron.remote.yml.

The README tries to be complete and concise at the same time. Due to the multiple ways to approach the repository, its images, the Compose manifests and also the CI pipeline(s) we cannot be exhaustive here. Maybe the setup is complex enough to consider adding separate documentation to docs/.


This is now ready for merge, approaching the 50th hour of work. I would leave the click to another maintainer, as I wouldn't want to misuse the newly gained write permission to the repository on my own first contribution.

What's left for future cycles is roughly:

Also new usages for the repository appear within reach:

Ultimately, I will also raise conversations upstream about implementing Celery Beat, which would allow getting rid of the cron dependency entirely, and about a containerised development environment and why one might like that.

almereyda commented 6 months ago

Thank you for taking the time to look into this. The two points you mention reflect very well the course this has taken:

I agree that the primary use case of this repository should be to show how to run pretalx in a real-world environment, as opposed to its earlier use for building an opinionated standalone image from a tightly coupled git submodule source. Trying to cater to all possible use cases makes the individual choices stand out less clearly.

If upstream were to support it, we might do well to consider moving the container image + CI manifests for base/ and default/ into the main pretalx repository, in order to build local development containers and release artifacts where development and releases happen. This repository could then focus on the Compose setup and the image variants.

I agree with decoupling the image development instructions into a separate docs/ and making the README much more accessible to casual visitors.

Allow me to respond to the suggestions in the next two weeks. We are not in a rush here and can take the time to introduce these breaking changes in a well-documented and -tested fashion.

rixx commented 2 months ago

@almereyda Just a ping: are you still planning to work on this?

almereyda commented 2 months ago

Yes, I got sick back then. I'm currently refactoring and documenting a few Django Compose repositories, and I plan to backport some of their patterns here, including simpler documentation, as suggested.

I'd say at least another two months, given life circumstances.

TriplEight commented 1 month ago

Thanks a lot for the PR. IMO it's overcomplicated for a docker compose repo. It took me quite a while to understand what goes where.

I made it work: an extended version with Traefik configured: https://github.com/TriplEight/pretalx-docker/pull/3. I can create a PR here after this one is merged.

almereyda commented 1 month ago

Great suggestions and ideas in that branch, especially around the auxiliary services cron and Traefik. I'm also considering using more of Compose's newer include: and extends: statements to organise and layer these setups more clearly (a sketch follows after the footnote). Recent work on Funkwhale ¹ has shown these might come in handy for a Django application with many varying dependencies.

¹ fix(DX): Docker mac compatibility, dynamic DNS + Debian image (!2799) · Merge requests · funkwhale / funkwhale · GitLab
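
A hypothetical sketch of such layering in a top-level compose.yml; the file, service and command names are illustrative, not the actual layout of this repository:

```yaml
# Hypothetical sketch; file, service and command names are illustrative.
include:
  - compose/postgres.yml   # auxiliary services kept in their own manifests
  - compose/redis.yml

services:
  app:
    image: pretalx/pretalx:latest

  worker:
    extends:
      service: app           # reuse the app definition from this file
    command: celery -A pretalx.celery_app worker -l info
```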

I especially like how you shortened the healthcheck: commands, the added x-pretalx-depends-on anchor for further deduplication, and how you merge multiple YAML anchors with <<:.
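
A sketch of that anchor pattern, assuming the Compose implementation in use supports the YAML merge key; x-pretalx-depends-on is the anchor named above, the second anchor and the service contents are illustrative:

```yaml
# Sketch only; the second anchor and the service contents are illustrative.
x-pretalx-depends-on: &pretalx-depends-on
  depends_on:
    db:
      condition: service_healthy
    redis:
      condition: service_started

x-restart-policy: &restart-policy
  restart: unless-stopped

services:
  app:
    image: pretalx/pretalx:latest
    # merge both shared blocks into the service via the YAML merge key
    <<: [*pretalx-depends-on, *restart-policy]

  worker:
    image: pretalx/pretalx:latest
    <<: [*pretalx-depends-on, *restart-policy]
```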

As a style note to https://github.com/TriplEight/pretalx-docker/pull/3, I would like to suggest refraining from the env_file: pattern, which maps the whole environment into each service's container, including all variables present in .env. By listing the variables under the environment: key individually, we only provide the selected blend each service needs, pulled from the .env file. ¹ By avoiding custom configuration and resorting to conventional defaults, we also keep the command to run everything shorter.
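
To contrast the two patterns in a minimal sketch (the variable names are illustrative, not necessarily the ones pretalx reads):

```yaml
# Sketch only; variable names are illustrative.
services:
  broad:
    image: pretalx/pretalx:latest
    env_file: .env                # every variable in .env ends up in the container

  selective:
    image: pretalx/pretalx:latest
    environment:
      # only the variables this service actually needs, interpolated from .env
      PRETALX_SITE_URL: ${PRETALX_SITE_URL}
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}
```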

Also, defining logging: settings for short-lived development environments appears redundant.

The FQDNs for docker.io images were intentionally provided for compatibility with Podman.

I'll be able to return to work on this from mid-November on, aiming for a merge in the first two weeks of December.