IQSS / dataverse

Open source research data repository software
http://dataverse.org

Epic: small footprint container usable for development, testing and production purposes #5292

Closed poikilotherm closed 3 years ago

poikilotherm commented 5 years ago

IMHO this is an epic, not a single story.

This issue is successor to #5187 and closes it.

It is intended to serve as a base for solutions or make life easier in:

This is blocked by some stuff and relies on some prior art to be done:

Things to consider:


Vision / Proposal

Currently, when running integration tests or deploying Dataverse to Docker/Kubernetes, only rather heavyweight solutions exist: DockerAIO for integration tests, and most (all?) Docker/Kubernetes/OpenShift approaches, which rely on the installer script.

I encourage the following vision:

  1. Build a new image directly from a Maven target, needing only a dev env plus Docker installed and running (obsolete once img support is in place...)
  2. Make this image as small as possible: an application server only, plus the dependencies and the application.
  3. Anything else lives in other containers, following the microservices credo.
  4. Make the application container itself stateless, also following the microservices credo. (This does not affect the use of volumes/...)
  5. Make the configuration a breeze - don't use the install script inside the container. Instead provide options to get the configuration inside from external sources.
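As a rough illustration of item 5, configuration could be resolved from outside the container at startup instead of being baked in by the install script. The following is only a sketch: the function name `resolve_setting`, the `CONFIG_DIR` variable, the `/run/secrets` default and the lookup order are all assumptions made for this example, not anything that exists in Dataverse or in the feature branch.

```shell
#!/bin/sh
# Hypothetical helper (not part of Dataverse): resolve one setting from
# external sources, trying in order:
#   1. an environment variable called NAME
#   2. a file called NAME inside a mounted directory (e.g. a secret volume)
#   3. a built-in default
resolve_setting() {
    name="$1"
    default="$2"
    # CONFIG_DIR and its default path are assumptions for this sketch.
    dir="${CONFIG_DIR:-/run/secrets}"

    # Indirect lookup of the variable named by $name (POSIX sh has no ${!name}).
    # Only call this with trusted, hard-coded setting names.
    eval "value=\${$name:-}"

    if [ -n "$value" ]; then
        printf '%s\n' "$value"
    elif [ -r "$dir/$name" ]; then
        cat "$dir/$name"
    else
        printf '%s\n' "$default"
    fi
}
```

An entrypoint script could then call something like `db_host=$(resolve_setting EXAMPLE_DB_HOST localhost)` before starting the application server, which keeps the image itself free of site-specific values.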

To get there I suggest using:

Things to keep in mind:

  1. Ideally this is based on Payara 5, not 4.
  2. Let users still use the "old" WAR file approach in parallel! Somebody might rely on that. (That's why I killed #5187)
  3. Let the configuration methods currently known to all users still work. Somebody might rely on that!

Give it a shot! (Testing)

To test, just have Docker, Maven, Git and Java installed. Then do:

git clone https://github.com/poikilotherm/dataverse -b 5292-small-container
cd dataverse
mvn -Pcontainer clean package docker:build docker:run -DskipTests

Please keep in mind that this is a feature branch. If you already have a cloned dataverse repo, you might be better off using:

git remote add poikilotherm https://github.com/poikilotherm/dataverse
git fetch poikilotherm 5292-small-container
git checkout -b 5292-small-container poikilotherm/5292-small-container

I regularly update this feature branch to be based on the latest develop. This involves rebasing, which will cause your local branch to diverge from the remote one. In that case, simply use git reset --hard poikilotherm/5292-small-container after a fetch.
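The fetch-then-reset step can be wrapped in a small helper. This is a minimal sketch, and the function name `sync_rebased_branch` is made up for illustration. Be aware that `git reset --hard` throws away local commits and uncommitted changes on the branch that is currently checked out, so only use it if you have nothing local to keep.

```shell
#!/bin/sh
# Hypothetical helper: re-sync the currently checked-out branch with a
# remote branch whose history was rewritten (e.g. by a rebase).
# WARNING: discards local commits and uncommitted changes on this branch.
sync_rebased_branch() {
    remote="$1"   # e.g. poikilotherm
    branch="$2"   # e.g. 5292-small-container

    # Update the remote-tracking refs first; stop if the fetch fails.
    git fetch "$remote" || return 1

    # Move the current branch and working tree to the remote state.
    git reset --hard "$remote/$branch"
}
```

For the branch discussed here, that would be `sync_rebased_branch poikilotherm 5292-small-container`, run from inside the cloned repository with the feature branch checked out.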

poikilotherm commented 5 years ago

@pdurbin and others: initial work on the building part has been pushed to my feature branch.

Right now this will (of course) not work: the Postgres driver is still missing, and the config part (see #5293) has to be addressed first.

As I wrote in the commit message, the upstream container project is in some parts not very responsive/active (see here, here, here and here).

Will try my best to get things upstream, but it may be better to fork and try to get this upstream later:

Also wondering if automated builds and security scans from quay.io could be interesting for this.

poikilotherm commented 5 years ago

I am currently working on an updated payara image, see https://github.com/poikilotherm/docker-payaraserver-full/tree/refactor . Will try to get my work upstream, just a few hours ago they merged stuff :-D

Since the upstream merged a lot of stuff cleaning up most of the issues (see here), I switched back to use those. Some issues still exist with the init system.

poikilotherm commented 5 years ago

I just opened PR payara/docker-payaraserver-full#61 and hope things will get merged.

poikilotherm commented 3 years ago

Current niceness blockers:

But maybe we can just deal with it for now. Jenkinsfiles missing. Solr image missing.

poikilotherm commented 3 years ago

There is still plenty of stuff to do. Yet this issue is quite dated and it's starting to become messy. Closing this for now.

Note that there is traction at https://github.com/gdcc/dataverse/tree/develop+ct, https://github.com/gdcc/dataverse-kubernetes and much MPCONFIG stuff going on.