containerbuildsystem / cachi2

Cachi2 is a CLI tool that pre-fetches your project's dependencies to aid in making your build process network-isolated.
GNU General Public License v3.0

Support caching java dependencies for maven projects #664

Open pierDipi opened 4 days ago

pierDipi commented 4 days ago

I'm trying to set up hermetic builds for a Java project that uses Maven in Konflux. However, since cachi2 doesn't support caching Maven dependencies, the only reasonable way I found was to create an intermediate image that downloads the dependencies [1]. That approach doesn't pass the default Enterprise Contract, because using intermediate images causes the following error: Base image "xyz" is from a disallowed registry.

Here is the PR with a proof of concept using the intermediate image: https://github.com/openshift-knative/eventing-kafka-broker/pull/1273

[1] https://github.com/openshift-knative/eventing-kafka-broker/blob/e1355b833093404b5e5e13f5a7bba1fc241cf49c/openshift/ci-operator/static-images/dispatcher/konflux/Dockerfile.deps


Potential Solution

To cache Maven dependencies in a way that also supports multi-module projects, we need to:

pierDipi commented 4 days ago

I see a different approach in https://github.com/containerbuildsystem/cachi2/pull/663 that would work for Gradle projects too.

kosciCZ commented 3 days ago

Hey @pierDipi, how well would you say #663 covers your use case? Is there anything that could be changed to make it work better for your use case (e.g. not to be too much hassle to set up when fetching entire projects)?

(disclaimer: not a cachi2 maintainer, just trying to see if #663 could be improved upon)

pierDipi commented 3 days ago

My only problem with it is that, as it stands, I don't know whether we have existing tooling, or will provide any, to create that custom "lock file".

The project we're trying to build has about 230 dependencies (including transitive ones), and it's a relatively small-to-medium-sized project. Without a companion tool, creating and maintaining that lock file over time becomes very tedious and, at the same time, I wouldn't want every team to create its own bespoke tool.

kosciCZ commented 3 days ago

> My only problem with it is that, as it stands, I don't know whether we have existing tooling, or will provide any, to create that custom "lock file".

The short answer is no, not currently, and most likely not in the future.

> The project we're trying to build has about 230 dependencies (including transitive ones), and it's a relatively small-to-medium-sized project. Without a companion tool, creating and maintaining that lock file over time becomes very tedious and, at the same time, I wouldn't want every team to create its own bespoke tool.

I definitely agree with this sentiment.

While I can see how #663 could be used for your use case, I don't think it is a 100% match. The feature, as I understand it, is meant more for fetching one-off artifacts from Maven, when fetching an entire build (or all of its dependencies) is inefficient or costly.

Again, I'm not a cachi2 maintainer or a Java expert in any way, but I can see your use case as a separate cachi2 package manager, if that's a typical way for a Java project to be structured and set up for a hermetic build.

aloubyansky commented 2 days ago

For Java, there is no reliable way to capture all the dependencies a build needs other than running the build. So running the build is the first step in prefetching.
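As a rough illustration of "running the build" as the prefetch step, here is a minimal Python sketch that assembles Maven command lines pointing at a dedicated local repository. It is only a sketch under assumptions: the function names and the choice of goals are hypothetical and not part of cachi2. The only Maven features it relies on are the `maven.repo.local` user property and the `--batch-mode` and `--offline` CLI options.

```python
import subprocess
from pathlib import Path
from typing import List


def build_prefetch_command(local_repo: Path, goals: List[str]) -> List[str]:
    """Online build that fills `local_repo` with everything Maven resolves."""
    return ["mvn", "--batch-mode", f"-Dmaven.repo.local={local_repo}", *goals]


def build_offline_command(local_repo: Path, goals: List[str]) -> List[str]:
    """Same build, but offline: it fails if anything is missing from `local_repo`."""
    return ["mvn", "--batch-mode", "--offline", f"-Dmaven.repo.local={local_repo}", *goals]


def run(command: List[str], project_dir: Path) -> None:
    """Run Maven in the project directory; raises CalledProcessError on failure."""
    subprocess.run(command, cwd=project_dir, check=True)
```

Running the online command once and then the offline command against the same repository directory is one way to check that the first run really captured everything the build needs.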

We could generate a lockfile from the contents of a Maven repo on disk, except that in some edge cases it will be challenging to determine the proper classifier, version, and type values from filenames alone. An alternative approach could be to create a Maven extension that registers a repository listener to listen for artifact resolution events and record each resolved artifact. However, there could be plugins that initialize their own resolvers and register their own repository listeners, which such an extension might not observe. An extra check may need to be done to make sure all the artifacts in the local repo have been properly captured in the lockfile.
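To make the filename problem and the extra check more concrete, here is a minimal Python sketch. It assumes the default local repository layout (`<groupId as directories>/<artifactId>/<version>/<artifactId>-<version>[-<classifier>].<extension>`). All names in it (`Artifact`, `parse_repo_path`, `scan_local_repo`, `missing_from_lockfile`) are hypothetical, the list of bookkeeping files is an assumption and not exhaustive, and it does not reflect any actual cachi2 lockfile format.

```python
from pathlib import Path, PurePosixPath
from typing import List, NamedTuple, Optional, Set, Tuple

# Bookkeeping files Maven leaves next to artifacts (assumed, non-exhaustive list).
BOOKKEEPING_SUFFIXES = (".sha1", ".md5", ".sha256", ".sha512", ".lastUpdated", ".repositories")
BOOKKEEPING_PREFIXES = ("maven-metadata", "resolver-status")


class Artifact(NamedTuple):
    group_id: str
    artifact_id: str
    version: str
    classifier: Optional[str]
    extension: str


def is_bookkeeping(filename: str) -> bool:
    return filename.endswith(BOOKKEEPING_SUFFIXES) or filename.startswith(BOOKKEEPING_PREFIXES)


def parse_repo_path(relative_path: str) -> Optional[Artifact]:
    """Best-effort parse of a path relative to the repo root; None if unsure."""
    parts = PurePosixPath(relative_path).parts
    if len(parts) < 4:
        return None
    *group_parts, artifact_id, version, filename = parts
    prefix = f"{artifact_id}-{version}"
    # Timestamped snapshot files do not start with "<artifactId>-<dir version>",
    # so they fall through here and are reported as unparsed by the caller.
    if is_bookkeeping(filename) or not filename.startswith(prefix):
        return None
    rest = filename[len(prefix):]
    if rest.startswith("-"):
        # Heuristic: the classifier ends at the first dot. This is wrong when a
        # classifier itself contains dots, which is one of the edge cases.
        classifier, _, extension = rest[1:].partition(".")
    elif rest.startswith("."):
        classifier, extension = "", rest[1:]
    else:
        return None
    if not extension:
        return None
    return Artifact(".".join(group_parts), artifact_id, version, classifier or None, extension)


def scan_local_repo(repo_root: Path) -> Tuple[Set[Artifact], List[str]]:
    """Return (parsed artifacts, relative paths that need manual review)."""
    found: Set[Artifact] = set()
    unparsed: List[str] = []
    for path in sorted(repo_root.rglob("*")):
        if not path.is_file() or is_bookkeeping(path.name):
            continue
        relative = path.relative_to(repo_root).as_posix()
        artifact = parse_repo_path(relative)
        if artifact is None:
            unparsed.append(relative)
        else:
            found.add(artifact)
    return found, unparsed


def missing_from_lockfile(found: Set[Artifact], recorded: Set[Artifact]) -> Set[Artifact]:
    """The extra check: artifacts present on disk but absent from the lockfile."""
    return found - recorded
```

Here `recorded` stands for whatever a listener-based extension managed to capture. Anything returned by `missing_from_lockfile`, or listed as unparsed, would point to resolutions the listener did not see or to filenames the heuristic cannot interpret.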