plone / demo.plone.org

https://demo.plone.org
MIT License

Instances are no longer reset properly when deploying #46

Open fredvd opened 6 months ago

fredvd commented 6 months ago

While working this week on #40 and fixing demo.plone.org because we forgot to pin plone.distribution and other packages, I noticed the instances don't get reset properly when doing a deploy to the cluster with docker-stack-deploy. I had to remove the stack before deploy to see my plone.distribution fixes getting deployed.

@ericof did something clever in the past by generating a SEED variable, passing it into the Dockerfiles, and then generating new images every day. But this no longer seems to work reliably. It also costs a lot of CI/CD energy, and we fill our organisational Plone GHCR account with new images.

Can we do something smarter here? Maybe use a Docker cron job solution like https://github.com/crazy-max/swarm-cronjob to restart our backend demo containers. If we then let the container create the site on startup in docker-entrypoint.sh before starting the site, that could solve our problem.
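To make the idea concrete, here is a minimal sketch of what such a docker-entrypoint.sh could look like. It is not the current entrypoint of the demo images: `DATA_DIR`, `CREATE_SITE_CMD` and `START_CMD` are hypothetical names, and their defaults are placeholders for whatever the real create-site and start commands turn out to be. It also assumes a `find` that supports `-mindepth` and `-delete`.

```shell
#!/bin/sh
# Sketch: recreate the demo site on every container start, then run the server.
# All variable names and default commands below are hypothetical placeholders.

DATA_DIR="${DATA_DIR:-/data}"                           # assumed location of the site state
CREATE_SITE_CMD="${CREATE_SITE_CMD:-echo create-site}"  # stand-in for the real create-site step
START_CMD="${START_CMD:-echo start-backend}"            # stand-in for the real server start

reset_site() {
    # Throw away whatever state a previous run of this container left behind.
    mkdir -p "$DATA_DIR"
    find "$DATA_DIR" -mindepth 1 -delete
    # Build a fresh site before the server starts accepting requests.
    $CREATE_SITE_CMD
}

start_backend() {
    # exec so the server replaces the shell and receives signals directly.
    exec $START_CMD
}

main() {
    reset_site && start_backend
}
```

A real entrypoint would end with a call to `main`. Because the reset runs at startup, any restart of the container (scheduled or not) yields a clean site, and the image itself no longer has to be rebuilt to reset the content.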

[Side note: I also don't like the current 12-hour period between restarts from a security perspective. And the reset is not reliable at the moment, so created 'demo' content could survive for days.]

What happens now is that we call create-site at the end of the Dockerfile, so the site state is baked into the generated container image.

The downside of regenerating the site (with plone.distribution) on every container startup is that the instance is also reset if, for whatever reason, the container moves to another worker. So people testing the site with some demo content would lose their content at an unpredictable time if the cluster decides to restart the container. But that should be the exception rather than the rule.

Thoughts? Then I'll start implementing this.

davisagli commented 6 months ago

I like the idea of resetting the content when the container starts. It'd be nice to trigger a new docker stack deploy so that the new container fully starts before the service routes traffic there (to avoid downtime). We'll need to tweak the startup time to be sufficient for generating the content. I wouldn't worry too much about people losing content at unpredictable times; it's a demo site.
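One way this could look, as a rough sketch rather than a tested setup: instead of a full docker stack deploy, it forces a rolling update of a single service with `docker service update`, asking Swarm to start the new task before stopping the old one. The service name `demo_backend` and the 120s start period are assumptions, and start-first only waits for the content to be generated if the service defines a health check. In a stack file the equivalent settings would be `deploy.update_config.order` and `healthcheck.start_period`.

```shell
#!/bin/sh
# Sketch: force a rolling restart in which the new task starts (and, with a
# health check, becomes healthy) before the old task is stopped.
# SERVICE and START_PERIOD are hypothetical values, not the real cluster config.

SERVICE="${SERVICE:-demo_backend}"     # assumed Swarm service name
START_PERIOD="${START_PERIOD:-120s}"   # assumed time needed to generate the content
DOCKER="${DOCKER:-docker}"             # set DOCKER=echo to preview the command

restart_service() {
    $DOCKER service update \
        --force \
        --update-order start-first \
        --health-start-period "$START_PERIOD" \
        "$SERVICE"
}
```

Running `DOCKER=echo restart_service` prints the command without touching the cluster, which is handy for checking the flags before wiring this into a scheduled job.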