OxalisCommunity / oxalis

Oxalis - PEPPOL Access Point open source implementation - Core component
Other
124 stars 90 forks source link

Dockerfile generalizations (Expose, Volume) #440

Closed darklajid closed 4 years ago

darklajid commented 4 years ago

Hi.

Setting up Oxalis for a cloud environment here. I have two requirements: I persist to a database (readonly filesystem) and listen to a "random" port. I'm aware that I can do that already today by my own¹, but I want to make a case for adapting the Dockerfile upstream to be more generic nevertheless: That way a shared binary artifact can be used globally.

1) The EXPOSE line

As discussed in #426: right now the image has a baked in hard-coded port. With closing #437 (thanks again!) we can now pass in environment variables into the oxalis.conf, so a dynamic oxalis.jetty.port set so something like ${PORT} is now possible - I can set it/control it from outside. But the exposed 8080 for the image is static.

I'd argue that EXPOSE isn't required in general². I would expect users of the image to use docker run -p to map the jetty port to a specific external port and that doesn't need the EXPOSE.

More discussions about different combinations of docker run invocations / EXPOSE settings here for example: https://stackoverflow.com/a/22150099/59492

The docker ecosystem understands that this is not nice and has a specific issue for EXPOSE being impossible to unset³ which was closed in favor or an issue tracking a more generic issue talking about changing/overriding/unsetting various parent properties⁴ (directly related to my next point).

2) The VOLUME line

Adding a VOLUME definition to the image is irreversible - using that image as base you won't be able to remove them afterwards. There are multiple discussions online for this elsewhere (Oracle: https://github.com/oracle/docker-images/issues/640 linking to https://forums.docker.com/t/why-use-the-volume-instruction/55420 for example). As above in 1) the docker ecosystem is aware of this problem⁴, but there is currently no solution other than 'ask upstream to change it' or 'fork the Dockerfile and build it yourself'.

The question here is if that HAS to be in the base image or could be either a runtime choice or be part of a derived image?

--

My proposal, if you allow: In an Ideal World² there would be an image on https://hub.docker.com/r/difi/oxalis that contains neither EXPOSE nor VOLUME definitions. This hypothetical image could be called oxalis-base maybe? The current images could still work as-is if they'd derive from that theoretical oxalis-base and literally just need the EXPOSE and VOLUME lines if those are actually required elsewhere. This way this issue would be fixed in an entirely backwards compatible and transparent way.

Again: The benefit I see would be that - for a hopefully minor change on the difi / oxalis side - downstream users would be able use the very same binary image everywhere. Today, if your infrastructure doesn't work well with EXPORT or VOLUME, you have to fork the Dockerfile, remove those definitions and build your own Oxalis.

I understand that this puts some (small?) amount of extra effort on difi. If you are at all open to the idea of splitting the build, I'd offer to work on a proposal / pull request to make this as painless as possible.

① So, I clearly understand a "WorksForMe" response :-p ② Obviously I'm biased / looking at this from my specific perspective. If you disagree I'd be glad to understand your use case / reasoning though ③ https://github.com/moby/moby/issues/2210https://github.com/moby/moby/issues/3465

darklajid commented 4 years ago

I created a pull request for discussion/consideration. With that Dockerfile and a docker-compose.yml (not included in the PR because excluded by the .gitignore - I assume you have one that is private already) like shown below, I can docker-compose build and end up with both an oxalis-base image and an oxalis image. The former is just the Oxalis distribution without EXPOSE, VOLUME and ENTRYPOINT - a 'library' or building block if you will for other docker images. The latter, 'oxalis', is functionally equivalent to the image distributed today.

docker-compose:

version: '3.7'
services:
  base:
    image: oxalis-base
    build:
      context: .
      target: oxalis-base
  default:
    image: oxalis
    build:
      context: .
      target: oxalis
klakegg commented 4 years ago

This is copied from the official Dockerfile reference (my highlighting):

The EXPOSE instruction informs Docker that the container listens on the specified network ports at runtime. You can specify whether the port listens on TCP or UDP, and the default is TCP if the protocol is not specified.

The EXPOSE instruction does not actually publish the port. It functions as a type of documentation between the person who builds the image and the person who runs the container, about which ports are intended to be published. To actually publish the port when running the container, use the -p flag on docker run to publish and map one or more ports, or the -P flag to publish all exposed ports and map them to high-order ports.

It is not clear to me why it is a problem to use EXPOSE with the default port used by Oxalis in the Dockerfile as recommended by the documentation as long as use of EXPOSE actually doesn't force the user of the Docker image to do anything in particular, and the user is free to use (almost) any port s/he wishes. Please clarify this part of your argument.

I'll look at VOLUME tomorrow.

darklajid commented 4 years ago

First of all: Thanks a lot for getting back! Let me explain myself some more then.

Sticking to EXPOSE first: The documentation states that you declare/document what your software is running on (1) and that it changes certain docker commands based on that information (2).

Arguably, EXPOSE 8080 is (1) flat out lying if Oxalis is configured to run on another port and (2) confusing docker and the infrastructure around it (docker run -P would expect something on 8080, the container metadata now declares something is running on 8080 .. and that is just not the case).

Now, you can add this metadata dynamically to the container at runtime: I can use docker run --expose 8081 if I wanted to - but I cannot 'unexpose' ports that you add to the base image. The Dockerfile is hard-coding something that is configurable, and if I configure this setting the Dockerfile is at least misleading or straight wrong - a container run from this base image will ALWAYS declare that something is listening on 8080.

To make this slightly less vague and add one random real world reason for my suggestion: If I want to run Oxalis on Google Cloud Run, then the Container Runtime Contract¹ requires me to listen on a dynamic port that the environment specifies as environment variable PORT - and it will not be (or: is not guaranteed/unlikely to be) 8080.

As for VOLUME: Two reasons.

The bigger one is: By declaring those, you basically change the behavior of docker for those folders and I cannot modify/change them in later stages anymore (trying to find a concise example I came across https://gist.github.com/cerisier/b14c0f79d0a6ba2332dc - showing what happens if base declares /var/www/html as a VOLUME). A Dockerfile starting with FROM difi/oxalis will silently break in various ways if you're not explicitly aware of these declarations - those folders are now 'magic' and weird.

The second one is a little more opinionated: VOLUME doesn't do that much (useful). Without extra runtime configuration (docker run -v ..) this creates anonymous and confusing implicit volumes (docker volumes list). If you require the user to configure these volumes properly when starting the container, then it seems of questionable value to have the technical issues (above): In the end the consumer of the image has to define volumes explicitly and .. there's nothing stopping someone from creating a volume with docker run -v .. for ANY folder, whether it is explicitly declared as VOLUME or not. The links in my original ticket are going into more depths, but I'd be glad to discuss this further here

It's similar to the EXPOSE thing, but worse. While the EXPOSE thing hard-codes something wrong-but-harmless, VOLUME hard-codes something confusing/surprising. Both of these are technically unnecessary, the former does very little on its own already and both can be instead specified during execution of the container.

I would humbly ask you to leave that to the user/consumer/host - either completely or, as in the pull request, as an option. Both of these things seem unneeded and can be easily defined at runtime, but cannot be undefined anymore when included in the base image.

Sorry for the wall of text. Summarizing my arguments:

Expose

Volume

https://cloud.google.com/run/docs/reference/container-contract

klakegg commented 4 years ago

I guess the problem is actually that it is possible to make the EXPOSE declaration lie then. The problem is related to this:

Server server = new Server(settings.getInt(JettyConf.PORT));

Changing it to this will solve the problem:

Server server = new Server(8080);

Oxalis don't need to know the port it is bound to, and removing EXPOSE will break the contact, as seen in the documentation.

darklajid commented 4 years ago

I'm not sure I understand. Removing the custom port functionality would make my argument about the EXPOSE irrelevant.. but for a big price in flexibility and potential customization.

Taking my Google Cloud Run example: Running Oxalis in that environment today means forking the Dockerfile and building it myself. If you'd remove the functionality to actually configure the jetty port, then I'd need to fork the Dockerfile AND Oxalis to undo that exact change locally: In that Cloud Run environment it is mandatory to be able to listen on some dynamic port that I don't know at docker build time.

But that is on me, of course: I can certainly maintain that on my part. I really am curious why you would rather keep the EXPOSE line than allowing a changing port (or - the split into two images as in the PR, which would leave difi/oxalis absolutely 100% the same).

Another theoretical use-case I could come up with would be adding nginx (or something similar) to the image as a reverse proxy, maybe I want to terminate SSL inside of the container for some reason. EXPOSE 8080 and hard-coding Oxalis/jetty on 8080 means that I cannot 'shadow' the service anymore. I can listen on 8443 or something with nginx, but the container will forever claim that it has a service listening on 8080 - even if my proxy design requires that this port won't be used: It's just an artifact of the base image.

Personally I'd hope that you don't remove that flexibility. I would've shot my own foot with this ticket (I got the environment variable support into Oxalis upstream, I actually can pass in a dynamic port and by making this ticket I .. made you remove that functionality in that case).

My hope was to remove the need for Oxalis customization by adding more flexibility to the upstream project. Quoting the installation documentation: "Oxalis is meant to be extended rather than changing the Oxalis source code." - this is what I am hoping for. With changes like the one in the PR, I can use the official artifacts from difi (be it the source as-is or even better: The binary container image) and just dump extensions/plugins into the folders Oxalis offers for this purpose.

darklajid commented 4 years ago

removing EXPOSE will break the contact, as seen in the documentation.

Apologies for another reply - I somehow read past this the first time. It seems you're saying that having no EXPOSE is wrong/bad? If I understand you correctly, then I don't believe that is the case.

EXPOSE, for all I can tell, has two reasons to exist:

1) You 'document' your port in the Dockerfile (that's my take on the docker documentation quote above). I assume there can be arguments for and against it - I'd say the Oxalis documentation is a better place for that and the local configuration file is the real source of truth.

2) Support docker run -P .. which takes all ports declared via EXPOSE and maps them to a random port on the host.

The second one is a use case for some cluster/swarm configurations as far as I can tell and was used for - say - the legacy/obsolete docker run --link¹: You'd EXPOSE ports that are then mapped to random ports on the host and consumed again via the magic of --link. But .. that use case is marked as to be removed in the future¹.

The very link you posted seems to mention the successor for these cases in the last sentences of the EXPOSE documentation:

The docker network command supports creating networks for communication among containers without the need to expose or publish specific ports, because the containers connected to the network can communicate with each other over any port. For detailed information, see the overview of this feature).

FWIW: This is what I get when I run the official oxalis image (including EXPOSE) with docker run -p 3333:8080 .. locally:

docker inspect -f "{{ .HostConfig.PortBindings}} - {{ .Config.ExposedPorts }}" oxalis_test
map[8080/tcp:[{ 3333}]] - map[8080/tcp:{}]

Running my own image based on the suggested oxalis-base (no EXPOSE in the oxalis base image nor in mine) with docker run -p 8181:8080 .. locally:

docker inspect -f "{{ .HostConfig.PortBindings}} - {{ .Config.ExposedPorts }}" angry_liskov
map[8080/tcp:[{ 8181}]] - map[8080/tcp:{}]

In other words: Both have explicit (my -p flag) port bindings as expected of course, but both ALSO have a Config.ExposedPorts section. It seemingly is auto-populated by EXPOSE in combination with the port bindings. Let's say I configure the official Oxalis image to run on 8181 and change my original invocation to docker run -p 3333:8181 ... Now I get this result in the metadata:

docker inspect -f "{{ .HostConfig.PortBindings}} - {{ .Config.ExposedPorts }}" oxalis_test
map[8181/tcp:[{ 3333}]] - map[8080/tcp:{} 8181/tcp:{}]

My binding implicitly adds 8181 to the exposed ports, but 8080 is still there because upstream declared it.

With all this being said: EXPOSE is by far the lesser issue. If you are set on keeping it in the image, please don't remove the dynamic port configuration. EXPOSE is a harmless nuisance for use cases like mine (and arguably others, given issue 437). The result is "wrong", but doesn't do anything other than being something like a digital appendix in the container metadata. The suggestion to drop this line is more a suggestion to make things simpler without (arguably) losing anything.

VOLUME on the other hand is actually a bigger deal and I need to remove those lines for my use, because those lines actually change the behavior of docker during build and runtime.

①: https://docs.docker.com/network/links/

darklajid commented 4 years ago

We're running into this again recently.

For now I'm still hoping that the PR #441 (splitting the container into two, one being just the binaries and the other being the Docker 'defaults' for VOLUME and EXPOSE) might be considered - but I still have to fork Oxalis and build my own images today. Is there any news about the potential of landing support for a docker image without VOLUME and EXPOSE by now? Thanks a lot for your consideration.