kitodo / kitodo-production

Kitodo.Production is a workflow management tool for mass digitization and is part of the Kitodo Digital Library Suite.
http://www.kitodo.org/software/kitodoproduction/
GNU General Public License v3.0

Provide and Support an official Docker Image #4313

Open Kathrin-Huber opened 3 years ago

Kathrin-Huber commented 3 years ago

Description

Currently, the situation regarding Docker images for Kitodo isn't clear for integrators and developers wishing to run Kitodo in a containerized environment. The "official" Docker image provided by the association on Docker Hub isn't tagged as official, and the corresponding GitHub repository is not managed by the association. There is also no mention of the Docker image in the documentation of Kitodo.Production.

To further strengthen the use of Kitodo in a containerized environment, we propose that the association fully adopt the current Docker image created by the Mannheim University Library (or create a new one if necessary), include it in the official documentation for Kitodo.Production, and coordinate its development through the release management of the application.

To-do

Estimated Cost and Complexity

This is a low-range project of less than 6 PT (person-days).

stefanCCS commented 3 years ago

If we bring KITODO.PRODUCTION to Docker (which I would like very much), in my opinion the first task is to write a Docker architecture concept paper. This concept should take into account that services which now run on the same KITODO machine should be de-coupled when using Docker. These are, for example, ElasticSearch, the database, and LDAP. Additionally, a clear file share concept is needed. The concept also needs to consider that every kind of configuration (e.g. kitodo-config.properties) must live outside the Docker container. ==> To summarize my comment: in my opinion, this concept needs to be done first, before any kind of Docker setup can be done (and I believe that this will take more effort than estimated above).
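The requirement that configuration lives outside the container can be illustrated with a minimal Python sketch. It does not show how Kitodo.Production actually reads its settings; the environment variable names (`KITODO_DB_HOST`, `KITODO_SEARCH_HOST`, `KITODO_LDAP_HOST`) and the `localhost` defaults are hypothetical and only demonstrate de-coupled services being addressed through externally supplied values.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceEndpoints:
    """Host names of the de-coupled services (field names are hypothetical)."""
    db_host: str
    search_host: str
    ldap_host: str


def load_endpoints(env=None):
    """Read service hosts from the environment instead of baking them into the image."""
    env = os.environ if env is None else env
    return ServiceEndpoints(
        # Hypothetical variable names; each falls back to "localhost" when unset.
        db_host=env.get("KITODO_DB_HOST", "localhost"),
        search_host=env.get("KITODO_SEARCH_HOST", "localhost"),
        ldap_host=env.get("KITODO_LDAP_HOST", "localhost"),
    )


if __name__ == "__main__":
    print(load_endpoints())
```

With this pattern the same image could run as a single all-in-one test instance (all defaults) or against separate service containers (variables set from outside), without rebuilding the image.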

matthias-ronge commented 3 years ago

But that would mean that you would have to run many Docker images at the same time, and they would have to be able to find each other. To me, the suggestion makes little sense, as many of these components interact so closely. None of the components are (in my experience) so resource-consuming that it makes sense to move them to their own machine, not even in productive operation. (This would probably apply to generating image derivatives, which you do not mention.) For a test system to play around with a bit, it doesn't make any sense.

For extracting file management, which is something we have discussed again and again in the past, I created a new ticket.

henning-gerhardt commented 3 years ago

Even though I neither need nor use Docker, we should consider the different usage scenarios, such as:

  1. local deployment for local testing
  2. running a (test) instance with little data (e.g. under 100,000 processes)
  3. running a (productive) instance with a lot of data (e.g. with more than 250,000 processes)

For 1. and 2., one Docker instance could be enough. For 3., I could imagine running separate services in separate Docker instances.

At SLUB we have been running our Kitodo.Production instances (productive, test, experimental) since 2005 on virtual machines that host different services such as LDAP, database, data file access (WebDAV, Samba, ...), ElasticSearch, ... for each instance. This way we can add resources to the machines that need them and monitor each machine individually.

stweil commented 1 year ago

https://hub.docker.com/u/kitodo already exists (the current owner is UB Mannheim), so any Docker images could be provided there.

stweil commented 1 year ago

Should this issue be tagged for the development fund 2023?

solth commented 1 year ago

@stweil please feel free to assign the "development fund 2023" label to issues where you think it is appropriate. If, for some reason, you do not have the permission to do so, let me know (the current GitHub configuration of your role in the Kitodo organisation suggests that you are already allowed to assign labels, but perhaps there is some other setting that prevents you from doing so).

stweil commented 1 year ago

Thanks for the hint. I was not aware that I had the necessary rights (for Kitodo.Presentation I don't have such rights).

stweil commented 1 year ago

> https://hub.docker.com/u/kitodo already exists (the current owner is UB Mannheim), so any Docker images could be provided there.

Today Docker announced that they are "sunsetting Free Team organizations". Access "will be suspended on April 14, 2023 (11:59 pm UTC)." After that date https://hub.docker.com/u/kitodo will no longer be available unless we upgrade to a non-free subscription (which I don't want to do, and I also don't suggest that Kitodo e. V. should do that).

Therefore there is now an additional task: find another container registry for the Kitodo container images. I think using the service offered by GitHub might be the simplest solution, but the images could also be hosted by any university or library that has its own installation of GitLab or Gitea. There are also many other alternative container registries by now.
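As a rough illustration of what moving to another registry means for users, the following Python sketch rewrites an image reference so that it points at a different registry host. The image name `kitodo/production:latest` is a hypothetical placeholder, not a confirmed published image; `ghcr.io` is the host name of the GitHub container registry, and any other registry host could be passed instead.

```python
def retarget_image(reference, new_registry):
    """Prefix a 'namespace/name[:tag]' image reference with another registry host."""
    # Docker Hub references may carry an explicit "docker.io/" prefix; drop it first.
    prefix = "docker.io/"
    if reference.startswith(prefix):
        reference = reference[len(prefix):]
    return f"{new_registry}/{reference}"


if __name__ == "__main__":
    # Hypothetical image name, used only for illustration.
    print(retarget_image("kitodo/production:latest", "ghcr.io"))
```

The sketch only changes the name; the images themselves would still have to be pushed to the new registry.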

stweil commented 1 year ago

As described in the initial proposal from @Kathrin-Huber and the comment from @stefanCCS, there are different options for implementing the "official Docker image".

  1. A single container with all required components, easy to run, suitable for demonstrations and testing of new features (low to medium costs)
  2. A modular approach with a small set of containers which handle specialized tasks and can be configured from outside, more difficult to run (might need configuration, requires docker compose), maybe even suitable for use in production environments (medium to high costs)

So it is necessary to decide whether option 1, 2 or both should be implemented.

Erikmitk commented 1 year ago

We already have a repo at the SLUB org where we are building a dockerized version of Kitodo.Production (+ its infrastructure components if you need them). That might be a viable starting point for this issue.

https://github.com/slub/kitodo-production-docker/