fabric8io-images / s2i

OpenShift S2I images for Java and Karaf applications
Apache License 2.0

Take advantage of multi-stage builds and modular JDK images #180

Closed gunnarmorling closed 5 years ago

gunnarmorling commented 5 years ago

Hey @rhuss, over the weekend I've been musing about what the "ideal" Dockerfile for a Java app would look like. My requirements essentially are these:

You can find the Dockerfile I came up with here: https://github.com/debezium/debezium-examples/pull/42.

It's doing three things:

The first and second stages do "heavy" work only if needed, i.e. if the POM has changed (for the dependencies that's done automatically by virtue of leveraging mvn dependency:go-offline, for the latter it's a manual process). It's using the Fedora minimal image as base for the resulting image. So the size and rebuilding penalties are quite neat:

So my question is: can we achieve something comparable with S2I? If so, how? If not, could we make it happen? Let me know what you think, looking forward to hearing from you :)
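
For readers without the linked PR at hand, the layering described above can be sketched roughly as follows. This is not the Dockerfile from the PR, only a minimal illustration of the pattern under stated assumptions: the image tags, the jlink module list and the artifact name `app.jar` are all hypothetical, and it assumes the jlink-built runtime is compatible with the libraries in the final base image.

```dockerfile
# Stage 1: build the application with Maven.
# Assumption: the image tag is illustrative; any Maven + JDK image works.
FROM maven:3-eclipse-temurin-17 AS build
WORKDIR /build
# Copy only the POM first. This layer and the next one stay cached until
# pom.xml changes, so dependencies are not downloaded on every build.
COPY pom.xml .
RUN mvn -B dependency:go-offline
# Source changes invalidate only the layers from here on.
COPY src ./src
RUN mvn -B package

# Stage 2: assemble a trimmed Java runtime with jlink.
# Assumption: the module list is hypothetical and must be maintained by hand
# to match what the application actually needs.
FROM eclipse-temurin:17-jdk AS runtime
RUN $JAVA_HOME/bin/jlink --add-modules java.base,java.logging \
        --no-header-files --no-man-pages \
        --output /opt/runtime

# Stage 3: the final image contains only the runtime and the application.
FROM registry.fedoraproject.org/fedora-minimal
COPY --from=runtime /opt/runtime /opt/runtime
# Assumption: the build produces target/app.jar (hypothetical artifact name).
COPY --from=build /build/target/app.jar /opt/app/app.jar
CMD ["/opt/runtime/bin/java", "-jar", "/opt/app/app.jar"]
```

Neither the Maven repository nor the full JDK ends up in the final image, which is the separation of build and runtime artifacts this issue is about.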

vorburger commented 5 years ago

@gunnarmorling I think you're raising a number of (interesting) points here! :smile: Re. modularity, #181 ?

rhuss commented 5 years ago

Yeah, multi-stage Docker builds are very helpful, but as you guessed, they are not really suitable for the way S2I works. The same is true for Jib, which also uses a layered approach for resources, dependencies and project classes. This works even completely without a Docker daemon and has very good performance characteristics.

S2I's answer for reusing already downloaded dependencies is 'incremental' builds. This works by copying over some parts from a previous build into the current image. The S2I images here take care that the .m2 directory is reused in an incremental build. However, in early tests (maybe two years ago?) it turned out that copying over the Maven repo took nearly as long as downloading the deps afresh. Maybe it's worth revisiting incremental builds?

Finally, @nicolaferraro and @lburgazzoli are working on sophisticated caching of various builder images with deps included, which they use in Camel K for super fast build times for similar builds. I don't know the details in depth, but maybe we could leverage that here as well for a generic S2I approach?

Also, an interesting article is https://blog.sonatype.com/improving-build-time-of-java-builds-on-openshift where Jorge shows some other strategies to increase the build performance of Java S2I builds.

gunnarmorling commented 5 years ago

Yeah, multi-stage Docker builds are very helpful, but as you guessed, they are not really suitable for the way S2I works.

Admittedly that makes me wonder whether S2I isn't then getting in the way more than it helps ;) Multi-stage builds are super-useful, so if a tool is blocking us from using them, it's a bit of a pity.

S2I's answer for reusing already downloaded dependencies is 'incremental' builds.

I had looked into that, but the problem is that incremental builds drastically increase the resulting image size (the Maven repo gets added). That's bothersome, esp. as the Maven repo is a build-only artifact which isn't required at application runtime. That's exactly the beauty of multi-stage builds: they allow us to keep apart the artifacts of the different lifecycle phases (build vs. runtime).

Finally, @nicolaferraro and @lburgazzoli are working on sophisticated caching of various builder images with deps included

But what is there to work on, if multi-stage builds already provide a very practical solution? Or is it about integrating multi-stage builds into S2I?

Also, an interesting article...

I had a quick look, but instead of throwing more tools (Nexus) at the issue I'm wondering whether we can't come up with something simpler.

In fact, I'm curious why I should even use S2I instead of that rather simple Dockerfile. Which btw. I can also use for local testing with Docker, Compose etc. One reason surely is that the Docker strategy isn't available with OS Online, hence I'm so eager to find out whether we can adjust the S2I process to take advantage of all this.

Sorry if I sound a bit negative, it's just that I think there's so much potential to make things smoother here for folks out there, so let's do it :)

rhuss commented 5 years ago

Admittedly that makes me wonder whether S2I isn't then getting in the way more than it helps ;) Multi-stage builds are super-useful, so if a tool is blocking us from using them, it's a bit of a pity.

Tbh, I'm not sure whether multi-stage builds are included as a kind of standard, so I'm not sure that other build systems using Dockerfiles as their definition format also support multi-stage builds. It would be interesting to check out whether multi-stage builds are usable outside the Docker universe.

rhuss commented 5 years ago

But what is there to work on, if multi-stage builds already provide a very practical solution? Or is it about integrating multi-stage builds into S2I?

As mentioned above, I suspect that multi-stage builds require a Docker daemon and can't be created with other OCI-compliant build systems. As the Kubernetes ecosystem is moving away from Docker, introducing a Docker-only feature would be a blocker IMO. Also, e.g. Minishift and the latest supported Docker daemon for OpenShift don't even support multi-stage builds yet.

rhuss commented 5 years ago

In fact, I'm curious why I should even use S2I instead of that rather simple Dockerfile. Which btw. I can also use for local testing with Docker, Compose etc. One reason surely is that the Docker strategy isn't available with OS Online, hence I'm so eager to find out whether we can adjust the S2I process to take advantage of all this.

If you want to use OpenShift ImageStreams and build with OpenShift you have to use S2I for now, but depending on the momentum of knative-build, there might soon be alternatives.

rhuss commented 5 years ago

Sorry if I sound a bit negative, it's just that I think there's so much potential to make things smoother here for folks out there, so let's do it :)

No problem ;-) I don't think Docker multi-stage builds are technically a bad thing, it's just that I see Docker support in platforms like Kubernetes or OpenShift on the decline (and as mentioned, multi-stage support never even made it into the Docker daemon used by OpenShift, and I suspect it will arrive in OpenShift land).

gunnarmorling commented 5 years ago

Gasp, I wasn't aware that the Docker daemon in OS doesn't support Dockerfiles with multi-stage builds. Thanks for pointing it out, @rhuss.

Regarding multi-stage builds themselves, I don't think they are inherently bound to the Docker daemon; at least in theory you could apply the same pattern with other image builders, too. After all, you're just taking the output created by one build as input for another. In fact, the other day I learned about OpenShift's chained builds, which look like that. So this might actually be the answer, I'll try and see whether I can build a complete example with the Java S2I builder.

On knative-build, I'm monitoring that, too. Might indeed be a viable alternative some time soon.

vorburger commented 5 years ago

@gunnarmorling @rhuss just FYI re this old discussion here, I learnt today from @siamaksade over in https://github.com/quarkusio/quarkus/issues/304 about OpenShift chained builds. You may know about this already (I somehow missed this previously), but if you don't, have a look; it would let you do something quite like a Docker multi-stage build today.

As for things a little bit more in the future: I believe we are moving from the old version of Docker used in today's OpenShift to podman with buildah, which supports multi-stage builds in a Dockerfile; I've used this standalone already, and it will eventually find its way into OpenShift.

And Knative build has build steps. I may or may not be able to get more into that in the coming weeks.

rhuss commented 5 years ago

@vorburger yeah, I know the chained S2I builds (and it's even described in the "Image Builder" pattern in our "Kubernetes Patterns" book, see k8spatterns.io ;-)

You can find the example from the book here: https://github.com/k8spatterns/examples/tree/master/advanced/ImageBuilder/openshift

vorburger commented 5 years ago

This issue was more of a conceptual discussion about something which this project can't really "deliver" or "fix".

https://quarkus.io/guides/openshift-s2i-guide documents an example of how to set up a chained build.

@gunnarmorling @rhuss let me therefore close this old issue, to clean up this project a little bit - hope OK for you.