Open rossjones opened 9 years ago
For AWS, Elastic Beanstalk supports this user story, and even supports additional features like using a managed database (RDS) and auto-scaling.
It also supports deploying from Docker containers, which might be the way to abstract PaaS dependencies.
I find Florian Mayer's description of his Docker-based deployment, posted on ckan-dev, compelling. He writes:
we're deploying our CKAN using Docker linux containers. In our docker image build process we copy out storage folders and the database from the (non persistent) container into persistent directories within a BTRFS snapshotting file system. That simplifies a few things for us:
- All read-only files (software, config, dependencies) are located within the Docker image, which also contains all installed extensions,
- All read/write files, the installed / set up / populated database, plus uploaded attachments are located within the persistent folder, making migration a "build the image and copy the persist folder" job,
- the snapshotting file system allows us to roll back the CKAN instance to a sane state, should bad things happen, instead of having to migrate/install.
In particular, I like the notion of keeping files that should be read-only as actual read-only files. That's better for security, it simplifies caching (read-only files are not going to change), and it simplifies backup (only the persistent folder needs backing up).
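To make the read-only/read-write split concrete, here is a minimal sketch of how such a container might be started. It only builds the `docker run` command line and does not run it. The image name and both paths are hypothetical placeholders, not Florian's actual setup; `--read-only` and `--volume` are standard `docker run` options.

```python
from pathlib import PurePosixPath


def docker_run_args(image, persist_dir, container_data_dir):
    """Build a `docker run` command line: read-only image, one writable volume."""
    host = PurePosixPath(persist_dir)
    target = PurePosixPath(container_data_dir)
    if not host.is_absolute() or not target.is_absolute():
        raise ValueError("bind mounts need absolute paths")
    return [
        "docker", "run", "--detach",
        "--read-only",                    # container root filesystem is read-only
        "--volume", f"{host}:{target}",   # the only writable location
        image,
    ]


if __name__ == "__main__":
    # Hypothetical image name and paths, for illustration only.
    args = docker_run_args("example/ckan:2.4", "/srv/ckan/persist", "/var/lib/ckan")
    print(" ".join(args))
```

A real image would probably need a few more writable mounts (temporary files, for instance) before it runs cleanly with a read-only root filesystem.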
Hi @waldoj,
we actually moved away from Docker containers towards a dedicated AWS VM.
Our docker setup will probably be useful for a stable, only occasionally updated CKAN version. I guess we'll docker CKAN 2.4.
The AWS VM in contrast is perfect for tinkering with the latest master branches of various plugins, and we also snapshot our file system (btrfs ftw!) so we can recover from git mess-ups. Just as in the Docker setup, we separated out valuables (postgres datadir and storage dir) into a dedicated (also snapshotted) folder. I would probably not run things this way with software in charge of finances or emergency calls, but CKAN? Absolutely fine. Never more than an ssh session or git checkout or, worst case, a filesystem restore away from sanity.
Our main reason for moving to AWS was that the latest CKAN master with a few customised extensions fixes some critical bugs (resources disappearing was a big one) and gives us some required custom features. However, rolling that into a Docker image would take an order of magnitude more effort, and the image would be outdated too quickly.
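The snapshot-before-tinkering routine could be scripted roughly like this. This is only a sketch: the subvolume and snapshot paths are hypothetical, and the function merely builds a `btrfs subvolume snapshot -r` (read-only snapshot) command line rather than running it.

```python
from datetime import datetime


def snapshot_command(subvolume, snapshot_dir, now=None):
    """Build a command that takes a read-only, timestamped btrfs snapshot."""
    now = now or datetime.now()
    name = now.strftime("%Y%m%d-%H%M%S")
    destination = f"{snapshot_dir.rstrip('/')}/{name}"
    # -r makes the snapshot read-only, so it cannot be modified by accident.
    return ["btrfs", "subvolume", "snapshot", "-r", subvolume, destination]


if __name__ == "__main__":
    # Hypothetical paths; running the command would typically need root,
    # e.g. via subprocess.run(cmd, check=True).
    cmd = snapshot_command("/srv/ckan/persist", "/srv/snapshots")
    print(" ".join(cmd))
```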
> Our docker setup will probably be useful for a stable, only occasionally updated CKAN version. I guess we'll docker CKAN 2.4. The AWS VM in contrast is perfect for tinkering with the latest master branches of various plugins, and we also snapshot our file system (btrfs ftw!) so we can recover from git mess-ups.
This was a really helpful distinction, @florianm—thank you for breaking it down like this!
ckan-multisite will be set up to run on any bare metal server or VPS that allows you to run Docker. If you want to mount your databases and files with a snapshotting file system, you're free to do that, because they're all stored in predictable locations on disk. Backups are easy: all the user data is in one place on the host filesystem (mounted as volumes by datacats) and the code is in another.
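As a rough illustration of why a single user-data location makes backups easy, the sketch below archives one directory into a compressed tarball using only the Python standard library. The directory is passed in as a parameter; the paths in the example are hypothetical and are not the actual datacats locations.

```python
import tarfile
from pathlib import Path


def backup_data_dir(data_dir, archive_path):
    """Archive everything under data_dir into a gzip-compressed tarball."""
    data_dir = Path(data_dir)
    if not data_dir.is_dir():
        raise FileNotFoundError(f"not a directory: {data_dir}")
    with tarfile.open(str(archive_path), "w:gz") as tar:
        # arcname keeps paths in the archive relative to the directory name.
        tar.add(str(data_dir), arcname=data_dir.name)
    return Path(archive_path)


if __name__ == "__main__":
    # Hypothetical locations, for illustration only.
    backup_data_dir("/srv/ckan/userdata", "/srv/backups/userdata.tar.gz")
```

A plain file copy of a running database's data directory is not guaranteed to be consistent, so the database should be stopped or the file system snapshotted before archiving.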
Creating and deploying instances will be done from a web interface. Creating and removing a "farm environment" means installing or removing ckan-multisite, which will have simple instructions and few dependencies (docker, nginx, pip, virtualenv...).
I've created an issue for the install procedure, https://github.com/boxkite/ckan-multisite/issues/5, and another for documenting the file locations on the server, https://github.com/boxkite/ckan-multisite/issues/6