Closed davido closed 7 years ago
This incompatibility was introduced in a47cc063d39b3958f31b4c5a992741020fd9ee3e. Workaround is to consume the external dependencies with rules_closure
specific java_import_external
rule, in projects that depend on rules_closure
.
This workarond seems to be quite intrusive, as it forces projects that depend on rules_closure
to rewrite some parts of the build tool chain from "@foo//jar" to "@foo". Moreover, for non trivial projects, like Gerrit Code Review, that consume hundreds of external dependencies, and that has hundreds of own plugins that consumes the Gerrit plugin API, and thus transitively depends on Gerrit dependencies, it could be very challenging, when some of the hundreds of external dependencies (like rules_closure
in this specific case) would force Gerrit Code Review project itself to replace substantial parts of the build tool chain, by replacing the consumption of maven_jar
with java_import_external
.
Moreover, in case of custom maven_jar
rule implementations, some feature are lost, like aggresive caching introduced by Gerrit Code Review maven_jar
bazlet: [1], that is a backport from Buck's maven_jar
bucklet implementation: [2].
IOW, it would be great, if a47cc063d39b3958f31b4c5a992741020fd9ee3e could be rewritten, so that it remains compatible to maven_jar directory layout.
java_import_external
isn't rules_closure specific. The only thing it does is it generates a BUILD file that has a java_import
rule. That rule has the same name as the repository. @foo
is the shortened version of @foo//:foo
. Except it has the deps
attribute set, so you don't need a third party directory where java_library
rules express the dependency relationships using exports
and runtime_deps
.
It's not possible for us to rewrite rules_closure, Nomulus, TensorFlow, etc. to have Bazel invoke the Maven command from within Bazel. That would require everyone to install Maven before building. It would most likely be quadratic across repositories. It would not be able to take advantage of the Bazel downloader which is necessary in order to meet the reliability requirements of projects like TensorFlow.
It's not possible for us to rewrite rules_closure, Nomulus, TensorFlow, etc. to have Bazel invoke the Maven command from within Bazel.
I'm not asking to re-write java_import_external
using Maven. What I'm saying is that it could generate BUILD file in jar
directory, like maven_jar
is doing. So that @foo//:foo
would become @foo//jar
. In fact it was very recently changed in rules_closure
from maven_jar
to java_import_external
in https://github.com/bazelbuild/rules_closure/commit/a47cc063d39b3958f31b4c5a992741020fd9ee3e , to support multiple URLs. Before this commit the native maven_jar
rule was used and it was compatible with gerrit way to consume external dependencies.
BTW, the README.md is stale, and still claiming that it could be overridden with:
load("@io_bazel_rules_closure//closure:defs.bzl", "closure_repositories")
closure_repositories(omit_guava=True)
maven_jar(
name = "guava",
artifact = "...",
sha1 = "...",
)
For one, guava
was replaced with canonical artifact name, for another, and this is what this issue is all about, it wouldn't be compatibe with native or custom maven_jar
rule. Liek the code orgnized now, it cannot be overridden with maven_jar, but only with rules_closure own java_import_external
.
Real world rules_closure
integration example: [1].
load("@io_bazel_rules_closure//closure/private:java_import_external.bzl",
"java_import_external")
#maven_jar(
# name = "javax_inject",
# artifact = "javax.inject:javax.inject:1",
# sha1 = "6975da39a7040257bd51d21a231b76c915872d38",
#)
closure_repositories(
omit_javax_inject=True
)
java_import_external(
name = "javax_inject",
jar_urls = [
"http://bazel-mirror.storage.googleapis.com/repo1.maven.org/maven2/javax/inject/javax.inject/1/javax.inject-1.jar",
"http://repo1.maven.org/maven2/javax/inject/javax.inject/1/javax.inject-1.jar",
"http://maven.ibiblio.org/maven2/javax/inject/javax.inject/1/javax.inject-1.jar",
],
jar_sha256 = "91c77044a50c481636c32d916fd89c9118a72195390452c81065080f957de7ff",
licenses = ["notice"], # Apache 2.0
)
I would expect, that I would be able to say just that:
maven_jar(
name = "javax_inject",
artifact = "javax.inject:javax.inject:1",
sha1 = "6975da39a7040257bd51d21a231b76c915872d38",
)
closure_repositories(
omit_javax_inject=True
)
and this would just work.
[1] https://gerrit-review.googlesource.com/#/c/105310/4/WORKSPACE@14
I like your thinking on the backwards compatibility. I can modify java_import_external to generate a jar/BUILD file with an alias() rule pointing to the new target. I'll do that right now. However, please be advised that the alias target is not going to have the same behavior, because it will have things like dependency relationship information. I'm not sure if that will cause problems. But just take it into consideration.
As for the README, right now it's written for the 0.4 release series and not HEAD. But we're not using HEAD because a Google project like Gerrit is going to want to stay in sync with the internal repo.
Change now mailed out internally. It will be approved by your friend Kasper and then pushed to this repository within a few days probably.
@jart Thank you.
Change now submitted. Here's the config for the latest Closure Rules:
http_archive(
name = "io_bazel_rules_closure",
sha256 = "af1f5a31b8306faed9d09a38c8e2c1d6afc4c4a2dada3b5de11cceae8c7f4596",
strip_prefix = "rules_closure-f68d4b5a55c04ee50a3196590dce1ca8e7dbf438",
urls = [
"http://bazel-mirror.storage.googleapis.com/github.com/bazelbuild/rules_closure/archive/f68d4b5a55c04ee50a3196590dce1ca8e7dbf438.tar.gz", # 2017-05-05
"https://github.com/bazelbuild/rules_closure/archive/f68d4b5a55c04ee50a3196590dce1ca8e7dbf438.tar.gz",
],
)
Let me know if there's any other adjustments you need me to make.
If I read f68d4b5a55c04ee50a3196590dce1ca8e7dbf438 correctly, it would allow clients to consume external dependency through java_import_external
without adjusting the path from @foo//jar
to @foo
.
However, f68d4b5a55c04ee50a3196590dce1ca8e7dbf438 would still not fix the incompatibility between maven_jar
and java_import_external
, because when the external dependencies are consumed with maven_jar
, then they are exposed only (!) under the @foo//jar
path, whereas rules_closure
is still expecting them to show up in its build process under the @foo
path.
I see two ways to resolve this incomatibility:
maven_jar
would need to define an alias under the root folder, to make rules_closure
happy, so that the path @foo
is definedrules_closure
would need to revert this part of a47cc063d39b3958f31b4c5a992741020fd9ee3e and consume all external dependencies under the path: @foo//jar
, instead of @foo
.I went ahead and extended our custom maven_jar
rule with this first suggested approach above. I've sent this CL: [1] to Kasper and other Gerrit build tool chain reviewers.
The [1] allows gerrit project to avoid using java_import_external
but still reuse common dependencies with rules_closure
rules:
[...]
closure_repositories(
omit_aopalliance = True,
omit_args4j = True,
omit_javax_inject = True,
)
[...]
maven_jar(
name = "javax_inject",
artifact = "javax.inject:javax.inject:1",
sha1 = "6975da39a7040257bd51d21a231b76c915872d38",
)
[...]
Needless to say, that other maven_jar
incarnation, most notably the native and the Skylark ones, not affected by this fix and would still fail to interact with rules_closure
.
It worth noting though, that the transitive dependencies feature of java_import_external
, by that I mean passing deps
list to it, is another serious problem that would make solving the interoperability problem between maven_jar
, that is not aware of such depedencies and java_import_external
really hard problem to solve.
That why, even with [1] in place gerrit project can currently only reuse 3 external dependencies:
Another complication is different naming convention between gerrit an rules_closure
: simple names vs. canonical artifact name. For example jsr305
cannot be re-used:
You had me super confused when you were talking about maven_jar. Because I think of native.maven_jar, but it seems Gerrit has a Skylark maven_jar.
It's impossible to be fully compatible with native.maven_jar for the precise reason you concluded. Definitions written using java_import_external only specify direct dependencies, just like java_library and java_import do. Even if the naming issues were worked out, the dependency graph still wouldn't exist.
Since you're adding a root directory alias to Gerrit's maven_jar, the only difference now between that and java_import_external is you're using a Python downloader script from your Buck days, because Buck never had a downloader.
Here's my sales pitch for java_import_external. You get to use the most advanced downloader that exists. The cost is a slightly more verbose config. In exchange you gain the following advantages:
Those are all things that curl currently isn't doing for you. Out of curiosity, are you using something like Squid, and if not, how often does your CI system flake due to download errors? You can trade that in for lightning fast performance and >99.9% reliability.
Literally nothing is better at downloading files than Bazel.
the custom maven_jar rule was added to have a cross-project cache for downloaded files. It's possible this is no longer needed.
Actually, it's more than that:
[...]
maven_jar = repository_rule(
attrs = {
"artifact": attr.string(mandatory = True),
"sha1": attr.string(),
"src_sha1": attr.string(),
"_download_script": attr.label(default = Label("//tools:download_file.py")),
"repository": attr.string(default = MAVEN_CENTRAL),
"attach_source": attr.bool(default = True),
"unsign": attr.bool(default = False),
"deps": attr.string_list(),
"exports": attr.string_list(),
"exclude": attr.string_list(),
},
local = True,
implementation = _maven_jar_impl,
)
Beside cross-project and cross-clone cache, it also offers the following feature:
$HOME/.m2
# TODO(davido): Remove exlusion of file system provider, when this issue is fixed:
# https://issues.apache.org/jira/browse/SSHD-736
maven_jar(
name = "sshd",
artifact = "org.apache.sshd:sshd-core:1.4.0",
exclude = ["META-INF/services/java.nio.file.spi.FileSystemProvider"],
sha1 = "c8f3d7457fc9979d1b9ec319f0229b89793c8e56",
)
Still @jart main point in last comment was to replace curl
based downloading with Bazel own download
function. I think it should be doable. We would still need to preserve the current features that our custom version of maven_jar
is offering.
That is correct. Using redundant URLs with a GCS bucket you control means 99.9% reliability both for yourselves and your users with zero Squid config. So long as you don't mind paying for the bandwidth of your CI hitting the GCS bucket. It's a very tiny cost compared to SWE hours when builds break.
@davido java_import_external supports neverlink
and srcjar
, just like java_import. Our internal tool (soon to be external) also generates configs for it. As for $HOME/.m2
I'm assuming that's being used as a cache. Consider the new downloader to a zero-configuration alternative.
So long as you don't mind paying for the bandwidth of your CI hitting the GCS bucket. It's a very tiny cost compared to SWE hours when builds break.
The most often CI breakages that we have seen in the past were due to outage of GH. Our Polymer tool chain is depending on GH 100% availability. AFAICT, there is no mirror NPM repository for that on GCS bucket.
Our internal tool (soon to be external) also generates configs for it.
Cool and thanks for pointing that out. Can you ping me, when you release that tool? I would like to have a look.
As for $HOME/.m2 I'm assuming that's being used as a cache.
Actually we are using it for short circuit upgrading dependent libraries during Gerrit development process. Gerrit has more than 150 dependent libraries. The upgrade process is often not trivial. As example, I am regularly building the libraries from head, like gwtorm
, JGit
, Gitiles
, Prolog Caffee
, SSHD
, GWT
, to name a few, to prepare the upgrade. The process is, fetch the library from head, build (and patch it if needed), then publish the built/patched artifact to $HOME/.m2
by using mvn install
and tweak the Gerrit build process to consume patched library from local maven repository, by saying:
maven_jar(
name = 'gwtorm',
artifact = 'gwtorm:gwtorm:42',
repository = MAVEN_LOCAL,
)
This process is even documented in dev-bazel documentation upstream.
It worth noting through, that at least for JGIt
that is a project under Eclipse foundation and thus must be built and released using Maven toolchain, we have added Bazel driven build and support alternatively consuming the locally built JGIt
artifacts from the source by using Bazel's local_repository
rule. In ideal world, all dependencies would be built with Bazel that we could consume them from the source and avoid local Maven polishing step in the first place.
For non trivial project some of the
rules_cosure
project dependencies can already be consumed with native or custommaven_jar
rule.However, this leads to the incompatibility in how the directory structures are generated:
compare to directory structure generated by
maven_jar
rule:As the consequence, the dependencies cannot be found, as there is no
BUILD
file generated by themaven_jar
rule in the package root location, e.g.:For example I need to apply this diff: [1] to convince bazel 0.4.5 to build gerrit code review project against rules_closure from master.