bazelbuild / bazel

a fast, scalable, multi-language and extensible build system
https://bazel.build
Apache License 2.0
22.95k stars 4.02k forks source link

Runfiles: support paths with spaces #4327

Open laszlocsomor opened 6 years ago

laszlocsomor commented 6 years ago

Update: this bug no longer claims to be Windows-specific, see https://github.com/bazelbuild/bazel/issues/4327#issuecomment-468274037


Fix this TODO: https://github.com/bazelbuild/bazel/blob/7b423ccd9506c6fb500b5c4998e1f26aebf28912/src/main/java/com/google/devtools/build/lib/windows/runfiles/WindowsRunfiles.java#L54-L55

izzyleung commented 6 years ago

Hmm, this problem also applies to other platforms as well, say macOS, when running bazel build //... on a workspace living under a directory with name containing space(s), bael will tell me:

ERROR: bazel does not currently work properly from paths containing spaces (com.google.devtools.build.lib.runtime.BlazeWorkspace@5dc63968).
laszlocsomor commented 6 years ago

@izzyleung thanks for the info! In that case I think P2 or even P3 is safe. I thought this worked on other platforms.

laszlocsomor commented 6 years ago

Update: I haven't looked into fixing this issue and I'm most likely not going to for the next couple months.

laszlocsomor commented 5 years ago

Dropping priority. Spaces aren't well supported on Linux either. It'd be nice if they were fully supported, but this doesn't seem like a critical feature.

Demo time:

Directory structure:

  $ find .
.
./WORKSPACE
./sub
./sub/a b
./sub/a b/bar
./sub/a b/bar/x.txt
./bin.sh
./BUILD

BUILD file:

sh_test(
    name = "bin",
    srcs = ["bin.sh"],
    data = glob(["sub/**"]),
)

genrule(
    name = "gen",
    outs = ["gen.txt"],
    srcs = glob(["sub/**"]),
    cmd = "( for f in $(SRCS); do echo $$f; done ; ) > $@",
)

bin.sh:

#!/bin/bash
echo "Hello test"
find .
false

Space aren't supported in data dependencies:

  $ bazel build :bin
INFO: Analysed target //:bin (0 packages loaded, 0 targets configured).
INFO: Found 1 target...
ERROR: /tmp/aaaaaaaa.aaa/runfile_spaces/BUILD:1:1: Creating runfiles tree bazel-out/k8-fastbuild/bin/bin.runfiles failed: build-runfiles failed: error executing command 
  (cd /usr/local/google/home/laszlocsomor/.cache/bazel/_bazel_laszlocsomor/b8ecdd60a1b4498304771172bdb140cf/execroot/__main__ && \
  exec env - \
    PATH=/usr/local/google/home/laszlocsomor/bazel-releases/current/bin:/usr/local/google/home/laszlocsomor/bazel-releases/current/bin:/usr/lib/google-golang/bin:/usr/local/buildtools/java/jdk/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin \
  /usr/local/google/home/laszlocsomor/.cache/bazel/_bazel_laszlocsomor/install/4cfcf40fe067e89c8f5c38e156f8d8ca/_embedded_binaries/build-runfiles bazel-out/k8-fastbuild/bin/bin.runfiles_manifest bazel-out/k8-fastbuild/bin/bin.runfiles): Process exited with status 1: Process exited with status 1
/usr/local/google/home/laszlocsomor/.cache/bazel/_bazel_laszlocsomor/install/4cfcf40fe067e89c8f5c38e156f8d8ca/_embedded_binaries/build-runfiles (args bazel-out/k8-fastbuild/bin/bin.runfiles_manifest bazel-out/k8-fastbuild/bin/bin.runfiles): link or target filename contains space on line 3: '__main__/sub/a b/bar/x.txt /tmp/aaaaaaaa.aaa/runfile_spaces/sub/a b/bar/x.txt'

Target //:bin failed to build
Use --verbose_failures to see the command lines of failed build steps.
INFO: Elapsed time: 0.187s, Critical Path: 0.01s
INFO: 0 processes.
FAILED: Build did NOT complete successfully

Spaces don't play well with genrule's $(SRCS).

  $ bazel build :gen && cat bazel-genfiles/gen.txt
...
INFO: Build completed successfully, 1 total action
sub/a
b/bar/x.txt
dslomov commented 5 years ago

Adding team-Core as it is crosscutting through Skyframe-related things

flaub commented 5 years ago

This issue is cropping up when I try to use an in-build py_runtime. Assuming I have a way of acquiring a python installation (I'm using conda in my case), it so happens that it will have filenames with spaces in them (for pretty core packages like setuptools).

Here's an example:

BUILD

py_runtime(
    name = "py3_runtime",
    files = ["@conda//:all_files"],
    interpreter = "@conda//:python",
    python_version = "PY3",
)

py_runtime_pair(
    name = "py_runtime_pair",
    py3_runtime = ":py3_runtime",
)

toolchain(
    name = "py_toolchain",
    toolchain = ":py_runtime_pair",
    toolchain_type = "@rules_python//python:toolchain_type",
)

@conda//BUILD

package(default_visibility = ["//visibility:public"])

exports_files(["env"])

filegroup(
    name = "all_files",
    srcs = glob(["env/**"]),
)

filegroup(
    name = "python",
    srcs = ["env/bin/python"],
)

I have a conda-based repository_rule which fetches the conda environment into the env directory of the @conda repository.

OK, now let's run something, we get the following error:

ERROR: /***/tools/analysis/BUILD:25:1: Creating runfiles tree bazel-out/darwin_x86_64-opt/bin/tools/analysis/analysis_test.runfiles failed: build-runfiles failed: error executing command
...
/var/tmp/***/install/8772b695a4a7b498a4dd3eff54505d9c/_embedded_binaries/build-runfiles: link or target filename contains space on line 20923: '***/external/conda/env/lib/python3.7/site-packages/setuptools/script (dev).tmpl /private/var/tmp/***/79803ee528d2751e2a32d6192f74c66d/external/conda/env/lib/python3.7/site-packages/setuptools/script (dev).tmpl'

My workaround at the moment is to exclude filenames with spaces in them in the all_files filegroup, but this means I'm getting some partial python environment.

ohler commented 5 years ago

I am running into this problem, and in my case, an easy fix (I think) is to change the runfiles manifest file format to use some character other than space as a separator. I briefly tested this (using TAB as separator) and it seems to be working, but I didn't test in much depth since I wanted to ask here first if such a change would be acceptable. (The main challenge was bootstrapping Bazel with a change to the manifest file format, but I got that to work, too.)

Would it be OK to change the separator in manifest files to another character? If so, what character? A single-byte character other than NUL or \n would be the smallest change.

Keeping a character reserved would remain a limitation, and would be a regression for anyone who depends on that character to work; but in practice, it would be a step forward since spaces in filenames are much more common than TAB or 0xff or whatever separator we choose.

laszlocsomor commented 5 years ago

We could use a different character than space, and adjust code that reads manifest files to be backwards-compatible, i.e. check for the new character first, then fall back to splitting on space.

laszlocsomor commented 5 years ago

More things rely on the current manifest format than I anticipated:

If we updated the manifest format, the change should be as little disruptive as possible. Preferably not break anything, i.e. code that expects the current format should still work.

I think we could keep space as the separator and use another character only when an entry has space in it. We could then update manifest consumers one by one. Old consumers would still work with new Bazel as long as no file has spaces in them (and so the delimiter is space), but they would break with files names with spaces because they'd split on the first space and not the alternative delimiter, but that's OK because they fail today already anyway.

ohler commented 5 years ago

Thanks for working on this! That compatibility feature of leaving manifest entries without spaces in the paths unchanged seems like the easiest way to get Bazel bootstrapping to work (when I was testing, I got the impression that, during bootstrapping, Bazel produces manifests with one version and then reads them with another; but I could be mistaken).

An alternative approach that is also compatible would be to replace embedded spaces with the new reserved character. That way, existing code unprepared for the new format would still find the right split points (since the delimiter would remain a space), and merely have the wrong paths. Making sure the top-level structure doesn’t get misinterpreted would make things a little easier to reason about. (A concrete benefit is that error messages from code that doesn’t understand the new format will perhaps be less confusing; although perhaps it’s also obvious enough if paths get truncated at the first space.)

One advantage of introducing a new separator (as in your PR) is that we can eventually switch over to using that exclusively, whereas replacing spaces would remain more complex. However, if we don’t plan to do phase out spaces as separators (and instead want to preserve compatibility indefinitely), I would lean towards replacing spaces.

But also: Have you considered just biting the bullet and replacing spaces with an escape sequence like \40? That gives us an obvious way to allow any character.

laszlocsomor commented 4 years ago

Thanks for the suggestions!

An alternative approach that is also compatible would be to replace embedded spaces with the new reserved character.

Interesting idea!

Current PR's approach is using the special delimiter only if the link path has space:

Your idea seems like the better approach overall, the convincing argument is the single lookup -- I adjusted the PR, let's see if tests pass.

laszlocsomor commented 4 years ago

If anyone's following: I'm working on this bug now.

gregmagolan commented 4 years ago

Can the "Runfiles: support space in file paths" https://github.com/bazelbuild/bazel/pull/9351 PR be resurrected?

I ran into this again in rule_nodejs. Some npm packages contain files with spaces and these are excluded from runfiles as it is not currently supported by Bazel.

Angular team wants to use puppeteer to manage the chrome binary version for web testing in their repo. This npm package downloads and extracts the chrome binary to node_modules/puppeteer/.local-chromium and some of the files for chrome on OSX contain spaces. For example, node_modules/puppeteer/.local-chromium/mac-706915/chrome-mac/Chromium.app/Contents/Frameworks/Chromium Framework.framework/Versions/79.0.3945.0/Chromium Framework. I'm thinking of work-arounds to this but the simplest solution would be for Bazel to support paths with spaces in runfiles.

meisterT commented 4 years ago

WindowsRunfiles doesn't exist anymore.

cristifalcas commented 4 years ago

This is not windows specific. It's happening on linux as well.

ulfjack commented 4 years ago

The flag --experimental_inprocess_symlink_creation may provide a workaround for this issue. The main caveat is that it can't recover from a file or directory with bad mod bits (compared to the tool-based strategy). I also haven't looked at its performance implications (it may be slower).

Edit: Correct the flag name.

mrmeku commented 4 years ago

The flag seems to work well from my experiments with it so far! I'll write about any incompatibilities I find along the way

marmos91 commented 4 years ago

@gregmagolan I am facing the same issue with Electron on OSX (it uses Chromium under the hood, so it is just the same problem). Did you found a workaround I can take inspiration from?

ulfjack commented 4 years ago

@marmos91 Did you see the last two updates on this bug?

marmos91 commented 4 years ago

@marmos91 Did you see the last two updates on this bug?

@ulfjack do you mean the flag --experimental_inprocess_symlink_creation? If so, I tried it without luck :(

ulfjack commented 4 years ago

Can you say what happened when you tried it?

carpenterjc commented 3 years ago

I have just run into this writing a sanity test for a windows installer using py_test and a data dependency on the install fileset. In my case the bazel root doesn't contain spaces, but the file set under test does courtesy of Program Files.

An example line:

myapp/Installer/package/program files/mycompany/my.exe C:/dev/_bazel/xxxxxx/execroot/myapp/bazel-out/x64_windows-fastbuild/bin/Installer/package/program files/mycompany/my.exe

I have tried --experimental_inprocess_symlink_creation when I look at the runfiles dictionary via

r = runfiles.Create()
print(r._strategy._runfiles)

I can see the last entry in the manifest split on the first space after program before files. So that option doesn't fix the problem for me. I would see this: {"myapp/Installer/package/program" : "files/mycompany/my.exe"}

I notice that if the workspace root doesn't contain a space, then you always get an odd number of spaces in the line. In which case I wondered about prototyping a fix which splits on the middle space in the line.

carpenterjc commented 3 years ago

Patching runfiles.py:179 with the following code works for me.

  @staticmethod
  def _LoadRunfiles(path):
    """Loads the runfiles manifest."""
    result = {}
    with open(path, "r") as f:
      for line in f:
        line = line.strip()
        if line:
          tokens = line.split(" ")
          if len(tokens) == 1:
            result[line] = line
          else:
            result[" ".join(tokens[0:int(len(tokens)/2)])] = " ".join(tokens[int(len(tokens)/2):])
    return result

Same algorithm should work across many languages.

HackAttack commented 3 years ago

The latest version of Boost (1.75) contains a header with a space in it, which is how I was bitten by this issue. boostorg/serialization#221

songyang-dev commented 3 years ago

How is the progress on this issue? This is the only thing preventing me from considering bazel on Windows.

devversion commented 2 years ago

Disclaimer: Just some hint/trick that might be useful as a workaround for some folks. This still needs to be fixed.

I just wanted to post what we have been doing in the majority of Bazel projects for rules_webtesting compatibility (where the Chromium darwin binaries contain a space). In short: The runfile space error only seems to occur when the runfile build links are being created. With the --nobuild_runfile_trees flag the runfile tree creation is deferred until actually needed (this had CI cache improvements with RBE for us as well).

The flag solves the problem nicely (for us at least) because the runfile trees are not actually created on Darwin where the sandbox strategy does not rely on runfile build links at all (more details on this here: https://github.com/bazelbuild/bazel/commit/03246077f948f2790a83520e7dccc2625650e6df)

# Do not build runfile forests by default. If an execution strategy relies on runfile
# forests, the forest is created on-demand. See: https://github.com/bazelbuild/bazel/issues/6627
# and https://github.com/bazelbuild/bazel/commit/03246077f948f2790a83520e7dccc2625650e6df
# Note: This also works around an issue where Bazel does not support spaces in runfiles. The
# Chromium app files for Darwin contain files with spaces and this would break. For darwin though,
# the sandbox strategy is used anyway and runfile forests are not needed.
# Related Bazel bug: https://github.com/bazelbuild/bazel/issues/4327.
build --nobuild_runfile_links
maxgerhardt commented 2 years ago

I am still running into this in the very latest Basel 6.0.0 pre-release :(

WARNING: Output user root "C:/Users/Max Gerhardt/_bazel_Max Gerhardt" contains a space. This will probably break the build. You should set a different --output_user_root.
ERROR: bazel does not currently work properly from paths containing spaces (C:/users/max gerhardt/downloads/examples-main/cpp-tutorial/stage1).
Sync finished
Sync failed

My Windows username is dictated by my company, I can't change it..

laszlocsomor commented 2 years ago

I am still running into this in the very latest Basel 6.0.0 pre-release :(

Hey @maxgerhardt , can you try the --output_user_root flag? Example:

bazel --output_user_root=c:/b build //foo:bar
laszlocsomor commented 2 years ago

One more thing @maxgerhardt , I think this bug is more relevant: https://github.com/bazelbuild/bazel/issues/5273

Is that correct?

tomgr commented 2 years ago

I've just run into this issue whilst trying to use rules_webtesting. Unfortunately, neither --nobuild_runfile_links nor --experimental_inprocess_symlink_creation works (this is running on MacOS) and both generate errors like the one below (i.e. Failed to create runfiles symlinks). One other thing I noticed is that there is a difference in behaviour between bazel test and bazel run. For example, on rules_webtesting where is set:

$ bazel run --nobuild_runfile_links //javatests/com/google/testing/web:WebTestTest_chromium-local
ERROR: Error creating runfiles: Failed to create runfiles symlinks: build-runfiles failed: error executing command
  (cd /private/var/tmp/_bazel_tom/f90c16530db64ba46a3872bcb8b04839/execroot/io_bazel_rules_webtesting && \
  exec env - \
  /var/tmp/_bazel_tom/install/71ed47cad951a20fff87381f54639763/build-runfiles bazel-out/darwin_arm64-fastbuild/bin/javatests/com/google/testing/web/WebTestTest_chromium-local.sh.runfiles_manifest bazel-out/darwin_arm64-fastbuild/bin/javatests/com/google/testing/web/WebTestTest_chromium-local.sh.runfiles): Process exited with status 1
/var/tmp/_bazel_tom/install/71ed47cad951a20fff87381f54639763/build-runfiles (args bazel-out/darwin_arm64-fastbuild/bin/javatests/com/google/testing/web/WebTestTest_chromium-local.sh.runfiles_manifest bazel-out/darwin_arm64-fastbuild/bin/javatests/com/google/testing/web/WebTestTest_chromium-local.sh.runfiles): link or target filename contains space on line 132: 'io_bazel_rules_webtesting/external/org_chromium_chromium_macos_arm64/chrome-mac/Chromium.app/Contents/Frameworks/Chromium Framework.framework/Chromium Framework /private/var/tmp/_bazel_tom/f90c16530db64ba46a3872bcb8b04839/external/org_chromium_chromium_macos_arm64/chrome-mac/Chromium.app/Contents/Frameworks/Chromium Framework.framework/Chromium Framework'

whereas:

bazel test --nobuild_runfile_links //javatests/com/google/testing/web:WebTestTest_chromium-local

works fine.

Are there any other workarounds for this issue?

EdSchouten commented 2 years ago

Hey! I'm also affected by this issue, so I thought it would make sense to bite the bullet and change runfiles_manifest from a single space delimited file to newline delimited JSON. Be sure to check out PR #14676!

matthewjh commented 2 years ago

Is anyone working on a long term solution to this problem? Having this issue on MacOS with rules_webtesting.

Disclaimer: Just some hint/trick that might be useful as a workaround for some folks. This still needs to be fixed.

I just wanted to post what we have been doing in the majority of Bazel projects for rules_webtesting compatibility (where the Chromium darwin binaries contain a space). In short: The runfile space error only seems to occur when the runfile build links are being created. With the --nobuild_runfile_trees flag the runfile tree creation is deferred until actually needed (this had CI cache improvements with RBE for us as well).

The flag solves the problem nicely (for us at least) because the runfile trees are not actually created on Darwin where the sandbox strategy does not rely on runfile build links at all (more details on this here: 0324607)

# Do not build runfile forests by default. If an execution strategy relies on runfile
# forests, the forest is created on-demand. See: https://github.com/bazelbuild/bazel/issues/6627
# and https://github.com/bazelbuild/bazel/commit/03246077f948f2790a83520e7dccc2625650e6df
# Note: This also works around an issue where Bazel does not support spaces in runfiles. The
# Chromium app files for Darwin contain files with spaces and this would break. For darwin though,
# the sandbox strategy is used anyway and runfile forests are not needed.
# Related Bazel bug: https://github.com/bazelbuild/bazel/issues/4327.
build --nobuild_runfile_links

That works, until you need to debug a browser test using bazel run, in which case runfiles try to build. Or is there some way to debug in a live browser using bazel test?

keith commented 1 year ago

Seems like there's no workaround for this in cases where you're building coverage so your dependencies' source files are linked as runfiles and could have spaces.

EDIT: Looks like --experimental_inprocess_symlink_creation works around this, but I'm not sure if that breaks something else

fmeum commented 1 year ago

@lberki Do you happen to have data on the drawbacks of --experimental_inprocess_symlink_creation?

lberki commented 1 year ago

No data than I can take off the shelf, sorry. At Google, our runfiles trees are large enough that it stopped being feasible like a decade ago.

meisterT commented 1 year ago

cc @tjgq

tjgq commented 1 year ago

To clarify (had a chat with Lukacs internally) - at Google we use a third implementation that isn't available in Bazel.

Unfortunately --experimental_inprocess_symlink_creation won't work if you're also setting --nobuild_runfile_links. I think this was an implementation oversight - when we added the flag we forgot to also update RunfilesTreeUpdater, which creates the symlink trees just-in-time for local actions when --nobuild_runfile_links is set.

My personal opinion is that we should fix this issue [EDIT: it has been fixed in 7b87ae1] and then attempt to make in-process the default across the board (for Bazel) and delete the out-of-process implementation, but I'm not planning to work on it right now (although I might end up making incremental progress towards it while fixing #18580).

fmeum commented 1 year ago

@tjgq Could you briefly explain the key properties in which Google's implementation differs from the ones available in Bazel?

tjgq commented 1 year ago

We have an output tree backed by a FUSE daemon and directly tell the daemon to create the symlinks in the FUSE filesystem. This is similar to what's being discussed at https://github.com/bazelbuild/bazel/pull/12823.

mikhail-g commented 9 months ago

Is there a way to use a delimiter other then a whitespace for Runfiles manifest entry? e.g. , : or a tab symbol

sluongng commented 2 months ago

It seems like the workaround/fix for this issue is to turn on --experimental_inprocess_symlink_creation.

The only drawback mentioned above was fixed in https://github.com/bazelbuild/bazel/commit/7b87ae1b81b681d4fc7e42282b6966fbcaf6301c which landed in Bazel 7.0.0.

There has been another inquiry on Slack about spaces in the runfiles. I suggest we turn the flag on by default in 7.3.0.

As for the manifest file itself, I think https://github.com/bazelbuild/bazel/pull/14676 is "almost" there but needs some additional effort to take it over the finish line. If folks are interested in making the runfiles manifest format a bit easier to parse, take a look there.

adamscybot commented 1 month ago

@sluongng I would just like to point out this breaking issue with rules_js that I found when --experimental_inprocess_symlink_creation is turned on in Bazel 7.2.

Unsure if its Bazel, or the rules that aren't vibing with this in the way that is expected, but the rule set is popular enough you may want to see it resolved first.

I suspect I'm the first-mover to come across this.

Here is the ticket: https://github.com/aspect-build/rules_js/issues/1877

abravalheri commented 1 week ago

Cross referencing this issue as the likely root cause of the problem in https://github.com/pypa/setuptools/issues/4605.

As previously noted in https://github.com/bazelbuild/rules_python/issues/617 previous workarounds are not bullet proof and there was no guarantee that similar problems would not happen again due the limitation in bazel itself (it is not really a bug in setuptools codebase).