jboss-dockerfiles / wildfly

Docker image for WildFly project
https://quay.io/repository/wildfly/wildfly
MIT License
278 stars 236 forks source link

Mark /opt/jboss/wildfly/standalone/{data,tmp} as data volumes #22

Open pszemus opened 9 years ago

pszemus commented 9 years ago

Wouldn't it be a good idea to mark /opt/jboss/wildfly/standalone/{data,tmp} as data volumes (1) ? They would bypass the Union File System causing a direct write, smaller diffs (2) and smaller container dir size.

(1) https://docs.docker.com/reference/builder/#volume (2):

$ docker diff 0a
...
C /opt/jboss/wildfly/standalone/tmp
A /opt/jboss/wildfly/standalone/tmp/vfs
A /opt/jboss/wildfly/standalone/tmp/vfs/deployment
A /opt/jboss/wildfly/standalone/tmp/vfs/deployment/deploymentfe48288eca471f39
A /opt/jboss/wildfly/standalone/tmp/vfs/deployment/deploymentfe48288eca471f39/s3-1.8.0.jar-2878891b64f5f965
A /opt/jboss/wildfly/standalone/tmp/vfs/deployment/deploymentfe48288eca471f39/s3-1.8.0.jar-2878891b64f5f965/contents
A /opt/jboss/wildfly/standalone/tmp/vfs/deployment/deploymentfe48288eca471f39/s3-1.8.0.jar-2878891b64f5f965/s3-1.8.0.jar
A /opt/jboss/wildfly/standalone/tmp/vfs/deployment/deploymentfe48288eca471f39/sts-1.8.0.jar-11bf88e6628dc597
A /opt/jboss/wildfly/standalone/tmp/vfs/deployment/deploymentfe48288eca471f39/sts-1.8.0.jar-11bf88e6628dc597/contents
A /opt/jboss/wildfly/standalone/tmp/vfs/deployment/deploymentfe48288eca471f39/sts-1.8.0.jar-11bf88e6628dc597/sts-1.8.0.jar
....
goldmann commented 9 years ago

Yes, that's an idea. But everyone needs to be aware that if change something like this - these files will be stored on the disk always, even when the container was already removed. It's kinda sad that there is no easy way to list/remove these files.

rwngwn commented 9 years ago

Personally, I don't like that idea. 1) With volumes you will have to introduce some mechanism to remove that files from host (eap sucks at cleaning its tmp dir - especially when killed by SIGKILL :) ) 2) These data are related to container and should just be removed when containers dies. 3) I think that spawning image to container should be considered "pure function" so have as low observable side effects ad possible - so volumes for me are just often miss-used. 4) If you really need that you can just base on our image and add volumes, or just mount volumes in run-time with -v of docker run switch (your orchestration tool should be able to do that) 5) I don't know what is UnionFS issue but if I use btrfs storage driver i cant see difference, and i think that image shouldn't introduce any storage driver specific features without very strong reasoning

smarterclayton commented 9 years ago

I was going to ask to add the volume dir because then OpenShift will automatically provision empty volumes through Kube when the image runs. It is a shame Docker can't/doesn't properly manage the volumes.

goldmann commented 9 years ago

This: https://github.com/docker/docker/pull/14242 may be relevant to the discussion too.

lucasbasquerotto commented 2 years ago

This issue is old but is still opened, tough I consider that it can be closed (together with the PR https://github.com/jboss-dockerfiles/wildfly/pull/52), because defining volumes in the base image could break things for the consumers (and there is no way to turn back that change in the child images).

If someone wants to add a volume, they can do so by specifying the volume when running the container.

As for the reasons against VOLUME in the base image, I will quote what I wrote in the issue https://github.com/jboss-dockerfiles/wildfly/issues/39:

Regarding introducing VOLUME in the Dockerfile, I'm not entirely clear if you are just asking a question about declaring the volumes in your images, or declaring it in the base image.

If it's to declare VOLUME in the base image (generated by this repository), it would be extremely bad because it would force everyone to use it. It could potentially create a lot of anonymous volumes without the user knowing it, and it could also break things because it's not clear for the user that a volume is created (unless the user inspects the container).

I had already a good deal of trouble because of base images with volumes.

In your example, VOLUME /opt/jboss/wildfly/standalone would already break the image for us, because we include the WAR in the deployments folder in our CI environment, providing the same base image for our staging and production environments (the environment specific configuration ir read from an external file, present in the environment). If made into a VOLUME, everything included at that path will be discarded (not included in the resulting image).

Even if the path is defined in a location that does not affect us directly, it could still cause problems about creating lots of anonymous volumes (unless we define it as a non-anonymous volume beforehand, which would make the declaration in the base image unnecessary). It could also make the container behave different than expected (for example, if I run the container without defining volumes explicitly, I would expect that removing and recreating it would clear the state, but anonymous volumes would break it, making bugs and and the cause for wrong behaviour of the application harder to spot.

Just to be clear, I'm not against using volumes in those paths. If someone wants volumes, they can easily add them in their docker run or docker-compose files, or in their k8s configuration. Some documentation about best practices when running the image, with instructions about using volumes in thos paths would be ok too.

But, please, don't include VOLUME in the base image.

Update

I included this post here mainly for reference (regarding including VOLUME in the base image), after all, this issue is more than 5 years old, but I considered good to include it here because the issue is still open.