grpc / grpc-java

The Java gRPC implementation. HTTP/2 based RPC
https://grpc.io/docs/languages/java/
Apache License 2.0
11.48k stars 3.85k forks source link

fix: Do not use default maven repo and stop propagating deps #11614

Closed honnix closed 1 month ago

honnix commented 1 month ago

Note that this is a breaking change because users now need to pull in those deps to their java_library, java_binary, etc.

A similar change happened in protobuf recently: https://github.com/protocolbuffers/protobuf/pull/18641

ejona86 commented 1 month ago

I can look over this more, but this seems very wrong. neverlink just opens people to runtime breakages, because those dependencies are needed. Neverlink is for optional dependencies and compile-only dependencies. It isn't for this.

There's no way it can work if all dependencies aren't merged together. -android vs -jre is handled exactly the same as with Maven, and those dependencies are exposed transitively. There's nothing special about Bazel here.

The linked comment mostly talks about dependencies for rules, not runtime-time dependencies. All the dependencies involved here are runtime dependencies. As the comment says:

The only reason to add to it is if you require users to use your dependencies at runtime for their own code

The change to Protobuf is mostly okay, as it is used for tests (still opens the door to test problems as duplicates can show up in the class path, but it doesn't impact you). And the main protobuf library doesn't have dependencies. But protobuf-util is broken. It can't use @protobuf_maven because that's going to duplicate different versions of shared deps at runtime. Specifically this one target; the others don't seem to matter: https://github.com/protocolbuffers/protobuf/blob/00f108c07807e73188f0baf51ba90ad527463f84/java/util/BUILD.bazel#L16

ejona86 commented 1 month ago

gRPC is quite different in this respect to Protobuf, as we don't have any tests running with Bazel. So all the dependencies are transitive dependencies, shy a rare neverlink=True dependency.

honnix commented 1 month ago

I understand the concern and I hope this work could make sharing a maven repo usable, which as it is today is very much not the case.

ejona86 commented 1 month ago

@honnix, in what way is it broken today? The useless "Found duplicate artifact versions" warning?

honnix commented 1 month ago

@honnix, in what way is it broken today? The useless "Found duplicate artifact versions" warning?

Yeah that's one of them, and that might not be just a warning if users set the level to error which is often desired.

Besides the warning, this approach forces users of grpc-java to always use the default maven repo (maven) that could also be accidentally chosen by other completely unrelated bazel modules, which results in unnecessarily complicated dependency resolving and could even give unintended/surprising resolution.

ejona86 commented 1 month ago

and that might not be just a warning if users set the level to error which is often desired.

If that happens then either a) the user is using an older version and it is being upgraded (and they could upgrade the version the specify) or b) the user is using a newer version. For (b), the only available solution I've seen with Bazel will produce the warning. And neverlink is not a solution as it breaks transitive dependencies.

The warning has always struck me as bizarre. It warns you when it is doing its job, which is to select the proper version. It has to do that already with transitive dependencies.

this approach forces users of grpc-java to always use the default maven repo (maven) that could also be accidentally chosen by other completely unrelated bazel modules

Really? If someone doesn't use the default name, then that's the same as if gRPC used a different name.

that could also be accidentally chosen by other completely unrelated bazel modules

So you want to purposefully break the common case (gRPC deps must be runtime) for a theoretical case? Do you think everyone should standardize on a non-default name like "maven_runtime" just to avoid the default?

I think the problem is that the warning is a warning, and duplicate_version_warning = error is fundamentally broken. I do accept debugging is harder with bzlmod, but the system is doing exactly what it needs to do.

honnix commented 1 month ago

Let's park the warning vs error question, since it is not critical.

Really? If someone doesn't use the default name, then that's the same as if gRPC used a different name.

Yeah, that is exactly the problem I mentioned in the PR description: the dependencies are forced into user space, regardless.

In old WORKSPACE setup, dependencies are declared but not forced, which means users could choose to take the dep list as is, or skip it and maintain their own to only include deps they need, and they could choose which maven repo to install them to. To me, it worked more like provided scope in maven, neither runtime nor compile: java_grpc_library claims that some deps are needed, but does not rely on grpc-java to install them automatically, and instead requires users to do so. This is the reason I think RJE extension design is not great and has room to improve.

So you want to purposefully break the common case (gRPC deps must be runtime) for a theoretical case?

It's not theoretical as I explained, but more like resembling how grpc-java WORKSPACE way worked. If I'm not mistaken, I think in maven case, the dependencies are also left for users to declare.

Do you think everyone should standardize on a non-default name like "maven_runtime" just to avoid the default?

Very good question and I guess it depends on use cases and also largely depends on how RJE could evolve. For pure internal/dev deps, they should absolutely be isolated in a non-default maven repo; for user facing deps, it should be clearly stated that the default maven repo must be used on user side (and this must be a standard followed by all modules because if anyone breaks it, there will likely be a classpath pollution). I hope RJE could come up with a better semantics.

ejona86 commented 1 month ago

which means users could choose to take the dep list as is, or skip it and maintain their own to only include deps they need, and they could choose which maven repo to install them to.

We do not support that. We don't support downgrades and we can add/change dependencies between versions. That is asking for bugs when upgrading (java.lang.Errors only found at runtime!).

To me, it worked more like provided scope in maven, neither runtime nor compile

Which we aren't using/supporting.

java_grpc_library claims that some deps are needed, but does not rely on grpc-java to install them automatically, and instead requires users to do so.

Because java_grpc_library both generates the code and creates the java_library, there's no way for the user to specify their own deps. And even if there was, that wouldn't work for transitive dependencies (java_proto_library in dependency Bazel repositories).

If I'm not mistaken, I think in maven case, the dependencies are also left for users to declare.

No. Maven and Gradle handle transitive deps. Maven does do it in a broken way that trivially downgrades dependencies, but we tell all users to use requireUpperBoundDeps which detects when Maven makes poor decisions.

For pure internal/dev deps, they should absolutely be isolated in a non-default maven repo

I agree with that. Although Java testing deps will probably need to leak through, because the upstream project you depend on needs to avoid multiple versions of deps in tests.

it should be clearly stated that the default maven repo must be used on user side (and this must be a standard followed by all modules because if anyone breaks it, there will likely be a classpath pollution)

Pollution? There's no pollution. The worst that happens is you get a newer version at runtime. But you need that anyway, because downgrading dependencies is generally unsupported and fragile.

If you don't use a particular dependency (the target that uses it isn't part of your binary), then you will pay a cost as its dependency chain is looked up in Maven Central. But it doesn't change your binary other than version bumps of deps you actually depend on.

honnix commented 1 month ago

We do not support that. We don't support downgrades and we can add/change dependencies between versions. That is asking for bugs when upgrading (java.lang.Errors only found at runtime!).

It could be upgrade. For example, users may choose to use a later version of Guava, instead of the one introduced by grpc-java.

No. Maven and Gradle handle transitive deps. Maven does do it in a broken way that trivially downgrades dependencies, but we tell all users to use requireUpperBoundDeps which detects when Maven makes poor decisions.

I meant the protoc compiler and grpc-java plugin do not introduce deps to user space, but for sure that is due to pure code generation than compiling, and I agree it works differently in the bazel rule.

Pollution? There's no pollution. The worst that happens is you get a newer version at runtime. But you need that anyway, because downgrading dependencies is generally unsupported and fragile.

It's actually pretty bad. Let me give an example. Suppose I have project depending on bazel module A and B, where A defines a maven repo maven with maven artifact foo-1.0, and B defines a maven repo repo_b with maven artifact foo-2.0, and obviously foo-1.0 and foo-2.0 are not compatible. In my project, I also use maven repo. After module resolution, I will have @@rules_jvm_external++maven++maven which includes both my deps and deps from A, and @@rules_jvm_external++maven++repo_b including deps from B. If I use a rule from B that introduces runtime deps foo-2.0, and at the same time I have all my deps and deps from A also included, I will end up with having @@rules_jvm_external++maven++maven//foo and @@rules_jvm_external++maven++repo_b//foo in my classpath, because maven and repo_b are resolved separately. This is exactly the trouble I had when I did not use maven and at the same time using java_grpc_library.

I'm fine if this is not a supported case and I can patch the rule, but what I want to bring to the table is that, the way RJE module works is not great and it requires in depth knowledge for users to use it correctly.

ejona86 commented 1 month ago

It could be upgrade. For example, users may choose to use a later version of Guava, instead of the one introduced by grpc-java.

That can be done without issue using the shared @maven repo. The user just adds the version they want and if it is greater than the version used by gRPC or a transitive dependency then it is used.

I meant the protoc compiler and grpc-java plugin do not introduce deps to user space

Not directly. But both require using a runtime that is at least as new as the generator. (protoc X is only supported with protobuf-java X or later.) There is a dependency, even if it is less typical. (This would still work fine if we used @maven for protobuf/grpc java code instead of building directly with Bazel.)

where A defines a maven repo maven with maven artifact foo-1.0, and B defines a maven repo repo_b with maven artifact foo-2.0, and obviously foo-1.0 and foo-2.0 are not compatible

If they are completely incompatible then they should have different Maven names and java packages (e.g., junit 4 vs junit 5 don't have the same name in Maven Central).

and B defines a maven repo repo_b

... you seem to be arguing my point. You need a single resolution of dependencies, otherwise you'll get duplicates. All deps going into a classpath need to use a single maven repo.

And note this is even worse if foo-1.0 depends on Guava X and foo-2.0 depends on Guava Y, as then you'll get both versions of Guava in your classpath.

Now it is certainly a problem if foo-1.0 and foo-2.0 are completely incompatible and have the same name in Maven. But that's a problem no matter what; you can't combine them together no matter what. If foo-2.0 is at least partly compatible with foo-1.0, then the resolution will pick 2.0 and that's the best anyone can do.

honnix commented 1 month ago

Fair enough. I think we can park here and see how RJE iterates. Thank you for the great discussion.

BTW there is another thing I want to mention, these overrides forces users of java_grpc_library to have source code dependencies on grpc-java and protobuf, instead of maven artifacts, if they also use maven repo (actually RJE has a bug merging all overrides from all modules, regardless of maven repo, but even merging overrides from the same maven repo could cause this). I guess this could be by design as well.

ejona86 commented 1 month ago

The protobuf overrides are needed "Because java_proto_library uses @@protobuf+//:java_toolchain which has a runtime dependency on //java/core". So you have to use the Bazel-built jars and any references in Maven have to get mapped.

It's basically the same for the gRPC targets: java_grpc_library uses "" which has a runtime dependency on //compiler:java_grpc_library_deps__do_not_reference which brings in the gRPC dependencies.

I've been mostly following Bazel's lead for protobuf because even if it has problems, doing my own thing just adds more problems. We could have //compiler:java_grpc_library_deps__do_not_reference depend on the proper @maven target and then let users do their own overrides. That's a poor choice at the moment because it means you can't then really use Bazel-built jars. In theory if we moved java_grpc_library out to its own repo then grpc-java could retain the overrides and you only depend on grpc-java if you want to build grpc-java. But the plugin complication breaks that assumption.

honnix commented 1 month ago

@@protobuf+//:java_toolchain

Do you mean @com_google_protobuf//:protoc? Yeah that is a whole set of different problem, which I think https://github.com/aspect-build/toolchains_protoc/ might help.

ejona86 commented 1 month ago

No, I meant the toolchain. It is an alias to //java/core:toolchain, which brings in :core to the runtime. So when you use java_proto_library you have a transitive runtime dependency on protobuf's //java/core. So we add the override to make sure that all Maven usages of protobuf also use the same Bazel-built jar.