bazelbuild / bazel

a fast, scalable, multi-language and extensible build system
https://bazel.build
Apache License 2.0

Replace dependencies in WORKSPACE file #5252

Open meteorcloudy opened 6 years ago

meteorcloudy commented 6 years ago

Related doc: TensorFlow Bazel Build Issues

Bazel users sometimes want to replace an external dependency in WORKSPACE with a local one.

For example: I want to use Eigen installed at /opt/local/eigen-3.3.4 to replace @eigen_archive. This is achievable in Bazel, but not easy, for two reasons:

Updates:

/cc @aehlig @dkelmer @dslomov

davido commented 6 years ago

Lack of documentation of --override_repository option

Interesting, I wasn't aware of this option. We have a similar requirement in Gerrit: consuming JGit from a development tree instead of from Maven Central.

Here is the instruction: https://github.com/GerritCodeReview/gerrit/blob/master/Documentation/dev-bazel.txt#L360-L365

And the corresponding code is here: https://github.com/GerritCodeReview/gerrit/blob/master/lib/jgit/jgit.bzl#L11-L13

It would be great if we could simplify this setting and pass that option on the command line instead of editing the source build file in-tree.

meteorcloudy commented 6 years ago

@davido Yes, this is exactly the use case. Can you give it a try? :)
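For reference, the flag is passed straight on the command line; a sketch (the target name here is hypothetical, the path is the one from the Gerrit setup below):

```
bazel build \
  --override_repository="jgit_lib=/home/davido/projects/jgit/org.eclipse.jgit" \
  //:release
```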

davido commented 6 years ago

@meteorcloudy Yes, I will look into replacing it and comment on how it goes.

davido commented 6 years ago

I'm seeing that passing --override_repository="jgit_lib=/home/davido/projects/jgit/org.eclipse.jgit" does have some effect, but I'm afraid it's not that trivial, because what we currently do, depending on whether we build against Maven artifacts or a local repository, is this mapping:

# Uncomment to build against local repository
LOCAL_JGIT_REPO = "/home/davido/projects/jgit"

def jgit_repos():
    if LOCAL_JGIT_REPO:
        native.local_repository(
            name = "jgit",
            path = LOCAL_JGIT_REPO,
        )
    else:
        jgit_maven_repos()

def jgit_maven_repos():
    maven_jar(
        name = "jgit_lib",
        artifact = "org.eclipse.jgit:org.eclipse.jgit:" + _JGIT_VERS,
        repository = _JGIT_REPO,
        sha1 = "265a39c017ecfeed7e992b6aaa336e515bf6e157",
        src_sha1 = "e9d801e17afe71cdd5ade84ab41ff0110c3f28fd",
        unsign = True,
    )
    maven_jar(
        name = "jgit_servlet",
        artifact = "org.eclipse.jgit:org.eclipse.jgit.http.server:" + _JGIT_VERS,
        repository = _JGIT_REPO,
        sha1 = "0d68f62286b5db759fdbeb122c789db1f833a06a",
        unsign = True,
    )
    maven_jar(
        name = "jgit_archive",
        artifact = "org.eclipse.jgit:org.eclipse.jgit.archive:" + _JGIT_VERS,
        repository = _JGIT_REPO,
        sha1 = "4cc3ed2c42ee63593fd1b16215fcf13eeefb833e",
    )
    maven_jar(
        name = "jgit_junit",
        artifact = "org.eclipse.jgit:org.eclipse.jgit.junit:" + _JGIT_VERS,
        repository = _JGIT_REPO,
        sha1 = "6f1bcc9ac22b31b5a6e1e68c08283850108b900c",
        unsign = True,
    )

def jgit_dep(name):
    mapping = {
        "@jgit_junit//jar": "@jgit//org.eclipse.jgit.junit:junit",
        "@jgit_lib//jar:src": "@jgit//org.eclipse.jgit:libjgit-src.jar",
        "@jgit_lib//jar": "@jgit//org.eclipse.jgit:jgit",
        "@jgit_servlet//jar": "@jgit//org.eclipse.jgit.http.server:jgit-servlet",
        "@jgit_archive//jar": "@jgit//org.eclipse.jgit.archive:jgit-archive",
    }

    if LOCAL_JGIT_REPO:
        return mapping[name]
    else:
        return name

And the wrapper of the JGit libraries uses the jgit_dep mapping function like this:

load("//lib/jgit:jgit.bzl", "jgit_dep")

java_library(
    name = "jgit",
    data = ["//lib:LICENSE-jgit"],
    visibility = ["//visibility:public"],
    exports = [jgit_dep("@jgit_lib//jar")],
    runtime_deps = [":javaewah"],
)

So, I think it would be really tricky to make it work without this mapping layer and with only the --override_repository option, if it is possible at all.

Previously I tried to use config_setting for this purpose: #2707.

tgamblin commented 6 years ago

I may be a bit confused here, as I am not too familiar with Bazel, but as someone who just wants to integrate TensorFlow into a distro, I want to tell the TF build where my existing installation of Eigen (or another library) lives -- and I just want to provide the Unix prefix where it is installed. That is essentially the result of cmake -D CMAKE_INSTALL_PREFIX=$prefix && make && make install.

I am not clear on what the semantics of --override_repository are, but it sounds like that expects a source repository, or maybe a build output directory, not an installation prefix. Is that accurate? The layout for those can be quite different.

Also, as a user, writing a BUILD file sounds quite tedious, especially if I do not know Bazel. Shouldn't the BUILD file be provided by the project (e.g., TensorFlow), so that whoever runs the build just has to supply a location for the dependency?

meteorcloudy commented 6 years ago

@davido I see. If I understand correctly, the problem is that the layouts of jgit_maven_repos and the local jgit repository are different; therefore we not only need to use the --override_repository option, but also have to change the target names. Is that correct?

meteorcloudy commented 6 years ago

@tgamblin The semantics is --override_repository=<repo_name>=<repo_path>. It expects a directory containing a BUILD file that defines the same targets as the original external repository.

I agree ideally users should not be asked to provide a BUILD file for their installations like /opt/local/eigen-3.3.4. But currently there's no tool to auto-generate a BUILD file for that.

And yes, maybe TensorFlow could provide the BUILD file, but if the override repository has a different layout from the original repo, we not only have to provide a BUILD file, but also need to change the target paths (or names). I think this is the problem @davido encountered.
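To make those semantics concrete, a hypothetical override directory for the Eigen example would itself have to look like a repository (file contents illustrative):

```
/opt/local/eigen-3.3.4/
    WORKSPACE      # may be empty
    BUILD          # must define the same targets as @eigen_archive
```

and would be selected with something like:

```
bazel build --override_repository=eigen_archive=/opt/local/eigen-3.3.4 //tensorflow/...
```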

davido commented 6 years ago

@meteorcloudy

If I understand correctly, the problem is that the layouts of jgit_maven_repos and local jgit repository are different, therefore we not only need to use --override_repository option, but also have to change the target name. Is that correct?

Yes, somehow we need to implement that mapping:

mapping = {
    "@jgit_junit//jar": "@jgit//org.eclipse.jgit.junit:junit",
    "@jgit_lib//jar:src": "@jgit//org.eclipse.jgit:libjgit-src.jar",
    "@jgit_lib//jar": "@jgit//org.eclipse.jgit:jgit",
    "@jgit_servlet//jar": "@jgit//org.eclipse.jgit.http.server:jgit-servlet",
    "@jgit_archive//jar": "@jgit//org.eclipse.jgit.archive:jgit-archive",
}

The reason is obvious: a cardinality mismatch between the repositories used in WORKSPACE for JGit (N) and the local JGit repository (1). The JGit project has multiple Maven artifacts, and each artifact is represented as one external repository in Bazel (and one maven_jar rule in the WORKSPACE file). The Bazel build implementation in JGit corresponds to the Maven artifacts, see https://git.eclipse.org/r/#/c/84527.

To add to this complexity, we need the source artifact as well, because we patch some JGit code for the diff code paths; that's why you see both jar and source classifiers in the map for the jgit_lib library:

mapping = {
[...]
      "@jgit_lib//jar:src": "@jgit//org.eclipse.jgit:libjgit-src.jar",
      "@jgit_lib//jar": "@jgit//org.eclipse.jgit:jgit",
[...]
tgamblin commented 6 years ago

@meteorcloudy

I agree ideally users should not be asked to provide a BUILD file for their installations like /opt/local/eigen-3.3.4. But currently there's no tool to auto-generate a BUILD file for that.

I think you need to either a) create a way to auto-generate a BUILD file for an external project (which seems really hard to me), or b) provide a way for project implementors like TensorFlow to ship BUILD files describing what external dependencies should look like. The overhead of creating the BUILD file is going to be way too high for most distro packagers to consider maintaining Bazel-built stuff.

Requiring everything to have a BUILD file is also a huge burden for project maintainers, as they have to know how to build all their dependencies with Bazel. I don't see how this is scalable. TF basically owns all of its dependencies and their builds, and it depends on very low level aspects of them (specific artifacts and internal headers).

And yes maybe TensorFlow could provide the BUILD file, but if the override repository has a different layout of the original repo, we not only have to provide a BUILD file, but also need to change the target path (or name). I think this is the problem @davido encountered.

I talked to @martinwicke about this, and he seemed to think it would be fine if the TF build switched the way they handle dependencies to be more compatible with external installs. Specifically, a TF build directory might look like this:

project/
    src/
    dependencies/
        repo1/
            BUILD
        repo2/
            BUILD
        repo3/
            BUILD

AFAICT, the TF build currently builds each dependency repo in-place, and the BUILD files expose artifacts from the dependency to the project.

I think you want something more like this:

project/
    src/
    dependencies/
        repo1/
        repo2/
        repo3/
        install1/
            BUILD
        install2/
            BUILD
        install3/
            BUILD

Now, TF builds repoN and installs it to installN using its native build system (or Bazel, I suppose). Each BUILD file describes the layout of installN. Conveniently, the layout of installN is also what you would find in an external installation like /opt/local/eigen-3.3.4. The BUILD file should be usable BOTH for an internal, Bazel-controlled build AND for an external installation of the dependency. The TF team would need to maintain the BUILD file and make sure it's compatible with their own dependency builds and external installations.

You could add a --override_install option so that users can indicate that they are providing an installation prefix and not a repository.

@martinwicke seemed to think it wouldn't be overly cumbersome for TF's Bazel build to first build, then install each external dependency like this. Maybe he can comment.

FWIW, we do something similar to this in Spack, which you might be interested in. See the External Packages section of our docs. Basically, users can specify a prefix in a YAML file, and they can tell Spack as much as they know about it (version of package, compiler, compiler version, etc.). Users can choose whether they want to build certain packages themselves or provide an existing installation. It gets a little messy for system prefixes (which are shared and may contain other libraries that can pollute the build) and for packages where the system layout may be different depending on how they were installed, but it generally works.

meteorcloudy commented 6 years ago

@tgamblin Thanks for the recommended solution, but let me clarify a bit how bazel build works. Bazel doesn't build everything in-place, but generates binaries under <workspace_root>/bazel-bin directory, which is a symlink to some directory under your user output root. For example, when you build //example/cpp:hello-world, the output binary is ./bazel-bin/example/cpp/hello-world.exe. You can run bazel info bazel-bin to see where exactly bazel-bin is located.

This means Bazel doesn't touch your source tree, so there's no way to install the build for repoN back to src/dependencies/installN. Doing this outside of Bazel would make the build complicated and non-hermetic.

Instead, how about we provide two BUILD files for each repo:

project/
    src/
    dependencies/
        repo1/
            source.BUILD
            installed.BUILD
        repo2/
            source.BUILD
            installed.BUILD
        repo3/
            ...

Both BUILD files should define the same set of targets, so that when switching the dependencies, all targets are still available.

We can then replace the dependencies in the following two ways:

  1. Use new_local_repository in workspace.bzl to replace the original external repo. For example:

     native.new_local_repository(
         name = "eigen_archive",
         path = "/usr/include/eigen3",
         build_file = "//third_party/eigen/installed.BUILD",
     )

  2. Introduce a --override_repository_with_build_file=<repo_name>;<repo_path>;<build_file_path> option. Basically it does the same thing as new_local_repository, but on the command line, so it's easier for users.

I actually tried to use new_local_repository to replace some dependencies of TF with local installations and custom BUILD files. I ran into the following problems:

  1. License files are missing under the installation directory. They are referenced by https://github.com/tensorflow/tensorflow/blob/d09304fca478b8db5c651eec6fb2559844a73373/tensorflow/tools/pip_package/BUILD#L110. This problem could be solved by providing some dummy targets in installed.BUILD.

  2. The artifacts of an installation are not under the same directory:

    • zlib: zlib.h is under /usr/include, libz.a is under /usr/lib/x86_64-linux-gnu
    • swig: the swig binary is under /usr/bin, template files are under /usr/share/swig3.0

    This means we have to point the root of the repository at their common parent directory, in this case /usr. As it basically contains everything, we have to be very careful when using glob in its BUILD file, and it might cause other problems.
  3. Installed versions differ from those of the original repositories TF uses. I got the following errors because a different version was used:

    • eigen: Missing TensorTrace.h
    • zlib: # error ZLIB_VERNUM != PNG_ZLIB_VERNUM \
    • protobuf: 'GeneratedClassName' was not declared in this scope

    Some kind of version check should also be implemented to address this problem.

FYI: @dslomov @aehlig

meteorcloudy commented 6 years ago

@dslomov @aehlig Do we have a plan to address this issue?

dslomov commented 6 years ago

@aehlig and I looked at this for some time and here is a potential general solution for replacing source dependencies with pre-built ones. Let us know what you think.

First of all, we will need a repository rule (maybe local_cc_dependency or some such) that can reference a locally installed library. Example usage:

local_cc_dependency(
    name = "zlib",
    # a map between names in the repository directory and absolute paths
    headers = { "zlib.h": "/usr/include/zlib.h" },
    libs = { "libzlib.a": "/usr/lib/libzlib.a" },
)

This rule will populate the repository directory with symlinks to the prebuilt files and generate a suitable BUILD file with cc_import rules so that the prebuilt library can be used from Bazel.

Then TF will structure their WORKSPACE file as follows. There will be an almost empty file local_overrides.bzl in TF sources:

def local_overrides(): 
   pass

The WORKSPACE file is then structured like so:

...
load(":local_overrides.bzl", "local_overrides")
local_overrides()

_maybe(http_archive, name = "zlib", urls = [....])
....

where _maybe is

def _maybe(repo_rule, name, **kwargs):
  if name not in native.existing_rules():
    repo_rule(name=name, **kwargs)

During the normal build, when local_overrides.bzl is empty, TF will just use the source versions.

Distro maintainers can replace the local_overrides.bzl file with references to prebuilt libraries:

def local_overrides():
    local_cc_dependency(
        name = "zlib",
        headers = { "zlib.h": "/usr/include/zlib.h" },
        libs = { "libzlib.a": "/usr/lib/libzlib.a" },
    )

and TF build will use those instead of source dependencies. (Of course, that file should be autogenerated from package descriptions).

WDYT? (None of this requires any changes in Bazel, but we can help with writing the local_cc_dependency rule; of course, the TF build will need to be massaged as well.)
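A minimal sketch of how such a rule could be written (hypothetical; this rule does not exist in Bazel, and the attribute names simply follow the example above). It symlinks the prebuilt files into the repository directory and emits a cc_library wrapping them:

```
def _local_cc_dependency_impl(repository_ctx):
    # Symlink each prebuilt file into the repository directory under
    # the name given as the dict key.
    for name, path in repository_ctx.attr.headers.items():
        repository_ctx.symlink(path, name)
    for name, path in repository_ctx.attr.libs.items():
        repository_ctx.symlink(path, name)

    # Generate a BUILD file exposing the prebuilt library.
    build = "cc_library(\n"
    build += '    name = "%s",\n' % repository_ctx.name
    build += "    hdrs = %s,\n" % repr(repository_ctx.attr.headers.keys())
    build += "    srcs = %s,\n" % repr(repository_ctx.attr.libs.keys())
    build += '    visibility = ["//visibility:public"],\n'
    build += ")\n"
    repository_ctx.file("BUILD", build)

local_cc_dependency = repository_rule(
    implementation = _local_cc_dependency_impl,
    attrs = {
        "headers": attr.string_dict(),
        "libs": attr.string_dict(),
    },
    local = True,  # depends on machine-local paths
)
```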

dslomov commented 6 years ago

@irengrig fyi

meteorcloudy commented 6 years ago

@perfinion FYI, since you are working on this. https://groups.google.com/a/tensorflow.org/forum/#!topic/build/f_w_xmpk-w8

perfinion commented 6 years ago

Huh, cool. So I've implemented something that is similar in parts to how @meteorcloudy and @dslomov did things. My stuff is here: https://github.com/perfinion/tensorflow/commits/unbundle

I used a regular cc_library instead. Ideally I'd like some kind of pkg_config_library that would automatically query pkg-config, but until then we can just hardcode the linker flags manually.
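The cc_library approach boils down to BUILD files along these lines (a sketch; the real files live in the branch linked above):

```
# zlib.BUILD -- link against the distro-provided libz.so; headers come
# from the compiler's default /usr/include search path.
cc_library(
    name = "zlib",
    linkopts = ["-lz"],
    visibility = ["//visibility:public"],
)
```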

I originally tried doing something similar to the snippet below, but realized the headers= part is not necessary: the compiler will always search /usr/include by default, since that's one of the system paths. It's not exactly hermetic, but the downside of hardcoding the paths is a ton of extra maintenance work tracking paths across different distros. Not only that, /usr/include is only correct if you are building for the host; cross-compiling frequently uses a different directory that the compiler will look in correctly (e.g. /usr/aarch64-unknown-linux-gnu/usr/include/). Also, things should be linking against the .so file, not the static lib.

local_cc_dependency(
    name = "zlib",
    # a map between names in the repository directory and absolute paths
    headers = { "zlib.h": "/usr/include/zlib.h" },
    libs = { "libzlib.a": "/usr/lib/libzlib.a" },
)

swig was one of the more complicated BUILD files, since it needs to execute the binary. I ended up just making a rule to copy it into bazel-bin and run it from there: https://github.com/perfinion/tensorflow/blob/unbundle/third_party/systemlibs/swig.BUILD That is pretty suboptimal; it should just be able to run it from /usr/bin/swig. Copying worked in this case, but some binaries use other resources relative to where they are (e.g. Python virtualenv setups), or SELinux labels may be needed, in which case copying would simply fail to run at all.

About configuring the overrides: I'd rather not have just a blank local_overrides() function. The tooling for system-linked builds definitely needs to live in the main tensorflow repo itself. Otherwise distro maintainers will have to maintain a ton of stuff separately from tensorflow, which makes coordination complicated, and new people packaging tensorflow may not find the work from other distros and will end up duplicating all the effort.

I quite like something like this; it's definitely very easy to use: --override_repository_with_build_file=<repo_name>;<repo_path>;<build_file_path>. If we have generic enough BUILD files for the system-lib versions, they should mostly just work across most distros.

One problem I ran into is that I don't know how to make an if_system so I can no-op out the license files. Or is there a way to make a blank rule, so that the pip_package BUILD file can still query for it but just gets nothing?

meteorcloudy commented 6 years ago

@perfinion Thanks for the feedback!

Excluding the license files should be easy. You can consider the following way:

  1. Add a config_setting in tensorflow/BUILD:

     config_setting(
         name = "use_system_libs",
         values = {"define": "use_system_libs=true"},
         visibility = ["//visibility:public"],
     )

  2. Add a macro in tensorflow/tensorflow.bzl:

     def if_not_use_system_libs(a):
         return select({
             clean_dep("//tensorflow:use_system_libs"): [],
             "//conditions:default": a,
         })

  3. Use if_not_use_system_libs in tensorflow/tools/pip_package/BUILD:

     load("//tensorflow:tensorflow.bzl", "if_not_use_system_libs")
     ...
     COMMON_PIP_DEPS = if_not_use_system_libs([":licenses"]) +
                   ...

  4. When building with system-installed libs, pass the --define=use_system_libs=true option to enable the config_setting.

You can also use the select statement directly if you don't want to write the macro. More details are available here: https://docs.bazel.build/versions/master/be/general.html#config_setting
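For comparison, using select() directly in the BUILD file (assuming the same config_setting, no macro) would look like:

```
COMMON_PIP_DEPS = select({
    "//tensorflow:use_system_libs": [],
    "//conditions:default": [":licenses"],
}) + [
    # ... remaining deps
]
```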

perfinion commented 6 years ago

@meteorcloudy Thanks, that intro helped me finally understand the conditionals; I'd seen them scattered around, but it didn't quite click. The system-libs stuff can't be an all-or-nothing toggle, since there are still a ton more libraries to unbundle on top of the ones I've done so far. I'm having trouble reconciling the two, though.

I currently use an ENV var like this (https://github.com/perfinion/tensorflow/commit/3a667f7bcd739dd95ae093d041aef2eb82f74bbf#diff-2115a4210ba6ff522adacf81a857d671R40)

# Checks if we should use the system lib instead of the bundled one
def _use_system_lib(ctx, name):
  syslibenv = _get_env_var(ctx, "TF_SYSTEM_LIBS")
  if syslibenv:
    syslibs = list(syslibenv.split(","))
    if name in syslibs:
      return True
  return False

Then .tf_configure_bazelrc can contain: build --action_env TF_SYSTEM_LIBS="com_googlesource_code_re2,nasm,jpeg,png_archive,org_sqlite,gif_archive,six_archive,astor_archive,termcolor_archive,pcre,swig,curl,grpc,lmdb,zlib_archive,snappy,flatbuffers,cython,jemalloc"
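Since Starlark string handling matches Python here, the membership check can be sketched and sanity-checked outside Bazel (syslibenv stands in for the value read from the TF_SYSTEM_LIBS action env):

```python
def use_system_lib(syslibenv, name):
    # TF_SYSTEM_LIBS is a comma-separated list of repository names
    # that should come from the system instead of being downloaded.
    if syslibenv:
        return name in syslibenv.split(",")
    return False

print(use_system_lib("swig,curl,zlib_archive", "curl"))  # True
print(use_system_lib("swig,curl,zlib_archive", "grpc"))  # False
```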

That worked really easily in _tf_http_archive: I just put the download inside an if block. But the above needs ctx, so this won't work:

def if_not_use_system_libs(name, a):
  if _use_system_lib(ctx, name):
    return []
  else:
    return a

Is there a way to change my original _use_system_lib to use --define instead of --action_env? Then I wouldn't need the context, I think. I could of course make a million different --define=use_syslib_pcre=true --define=use_syslib_swig=true etc., but that seems tedious.

I found a cleaner way to deal with most of the license files. A blank filegroup works to define a target that just doesn't do anything. This covers most of the targets, since they are e.g. "@gif_archive//:COPYING". But it fails for "@grpc//third_party/nanopb:LICENSE.txt", since the swapping only creates a single top-level BUILD file, so I can't match the lower directories inside @grpc//third_party/nanopb. So I still need the if_not_use_system_lib() to handle those more awkward ones.

filegroup(
    name = "COPYING",
    visibility = ["//visibility:public"],
)
gonzojive commented 1 year ago

I asked about this on https://stackoverflow.com/questions/75127243/how-can-i-optionally-override-repository-dependencies-in-bazel/75127299#75127299 and was directed here.

What about the possibility of an "optional load" function that can be used in a WORKSPACE.bazel file. Pseudo code:

if load_if_exists(":local_workspace.bzl", "install_overrides"):
  install_overrides()
github-actions[bot] commented 5 months ago

Thank you for contributing to the Bazel repository! This issue has been marked as stale since it has not had any activity in the last 1+ years. It will be closed in the next 90 days unless any other activity occurs. If you think this issue is still relevant and should stay open, please post any comment here and the issue will no longer be marked as stale.

meisterT commented 2 months ago

@meteorcloudy, what is the status of this with bzlmod?

meteorcloudy commented 2 months ago

This is about switching between system dependencies and fetched dependencies at build time; it is still an issue with Bzlmod.