bazelbuild / bazel

a fast, scalable, multi-language and extensible build system
https://bazel.build
Apache License 2.0
22.99k stars 4.03k forks source link

Expansion of $(location) on path with spaces quotes too many times #6531

Open ob opened 5 years ago

ob commented 5 years ago

Tested with Bazel 0.18.0 and master as of d4e3ad8951.

Small test to reproduce at: https://github.com/ob/bazel-tests/tree/master/spaces

If I run:

$ bazel build --spawn_strategy=standalone -s --sandbox_debug //spaces:no_spaces

I get this output (paths removed and output elided for clarity):

DEBUG: arule.bzl:3:5: Expanding Location Placeholders...
DEBUG: arule.bzl:4:5: Values: ["-c", "$(location adir/foo.c)"]
DEBUG: arule.bzl:5:5: Expansion: ["-c", "spaces/adir/foo.c"]

SUBCOMMAND: # //spaces:no_spaces [action 'SkylarkAction spaces/adir/foo.o']
(cd /private/var/tmp/_bazel_obonilla/124df79d9592378b17b94075fb66c278/execroot/__main__ && \
  exec env - \
  cc -c spaces/adir/foo.c -o bazel-out/darwin-fastbuild/bin/spaces/adir/foo.o)

And the build succeeds. However, if I run:

$ bazel build --spawn_strategy=standalone -s --sandbox_debug //spaces:spaces

I get this output:

DEBUG: arule.bzl:3:5: Expanding Location Placeholders...
DEBUG: arule.bzl:4:5: Values: ["-c", "$(location a dir/foo.c)"]
DEBUG: arule.bzl:5:5: Expansion: ["-c", "'spaces/a dir/foo.c'"]

SUBCOMMAND: # //spaces:spaces [action 'SkylarkAction spaces/a dir/foo.o']
(cd /private/var/tmp/_bazel_obonilla/124df79d9592378b17b94075fb66c278/execroot/__main__ && \
  exec env - \
  cc -c ''\''spaces/a dir/foo.c'\''' -o 'bazel-out/darwin-fastbuild/bin/spaces/a dir/foo.o')
ERROR: BUILD:13:1: SkylarkAction spaces/a dir/foo.o failed (Exit 1): cc failed: error executing command 
  (cd /private/var/tmp/_bazel_obonilla/124df79d9592378b17b94075fb66c278/execroot/__main__ && \
  exec env - \
  cc -c ''\''spaces/a dir/foo.c'\''' -o 'bazel-out/darwin-fastbuild/bin/spaces/a dir/foo.o')
clang: error: no such file or directory: ''spaces/a dir/foo.c''
clang: error: no input files
Target //spaces:spaces failed to build

Note that the _inputfile is double-quoted. The ctx.expand_location() function seems to have realized there are spaces, and it single quoted the path. Later on, the action quoted it again.

aiuto commented 5 years ago

Not sure if this is Starlark or Execution. We'll have to sort it out.

regisd commented 5 years ago

Related to #3475

laurentlb commented 5 years ago

The $(location) command has a very subtile (and undocumented?) effect of adding single quotes when the input contains a space.

laurentlb commented 5 years ago

The issue is not specific to Starlark, because I can reproduce it with genrule:

genrule(
    name = "gen",
    srcs = ["a dir/a.cc"],
    outs = ["out"],
    cmd = "echo \"$(location a dir/a.cc)\" && touch $@",
)
ob commented 5 years ago

The $(location) command has a very subtile (and undocumented?) effect of adding single quotes when the input contains a space.

Is that to work around toolchains that can't deal with spaces? IMHO, it seems wrong to quote at that level. I think quoting should be done as close to the exec as possible no?

Maybe the solution is just to remove that single-quoting from $(location)?

ob commented 5 years ago

Here's a bit of justification for my rationale: I found the bug through the API ctx.expand_location, which is used in StarLark so you naturally wouldn't expect it to quote.

E.g. if I was dealing with a toolchain that didn't handle spaces, I could always quote the whole $(location ...) section. For instance -iquote '$(location path with spaces)', whereas removing quotes because we quote too much would be difficult and error prone.

laurentlb commented 5 years ago

I agree the sane fix is to remove the quotes.

ob commented 5 years ago

Having said that, I'm not sure what to do about $(locations ...)... the example in #3475 would still be a problem:

filegroup(
  name = "files",
  srcs = glob(["dir with spaces/**/*"]),
)

genrule(
  name = "example",
  srcs = [":files"],
  outs = ["foo"],
  cmd = "echo FILES: $(locations //:files) | tee $@",
)
regisd commented 5 years ago

IMHO, $(location a dir/a.cc) should fail in the first place with: location takes 1 arg, but 2 were given. Build file authors should use instead $(location 'a dir/a.cc')

laurentlb commented 5 years ago

Assuming it was caused by https://github.com/bazelbuild/bazel/issues/3475, cc @katre

katre commented 5 years ago

It's been quite a while since I looked into this code, so I don't have any special insight to offer.

laurentlb commented 5 years ago

cc @laszlocsomor and @meteorcloudy, as it probably affects Windows users more

comius commented 12 months ago

I think this functionality deserves a revamp, however not sure how it looks like.

aiuto commented 12 months ago

+1 to revamp. I've thought about this many times. My strawman proposal is.

fmeum commented 12 months ago

I think this functionality deserves a revamp, however not sure how it looks like.

I don't have concrete ideas for the API, but I think that any "v2" should:

  1. be implemented in Starlark and maintained outside but close to the Bazel core so that users can pin and update to a particular version of expansion location.
  2. return a "structured" command line, e.g. a ctx.actions.args. This would help quite a bit with ongoing efforts to rewrite paths to be more cache-friendly (CC @gregestren, this is how we could get rid of the --javacopts hack).
phst commented 8 months ago

As a lower cost solution, as a rule author I never pass files paths on the command line. I spill them into a file through that capability in ctx.action.args and have my tools read list of paths from command files. Yes, this can be slightly more overhead, but I don't have to worry about shell escaping.

Hmm, AFAIK when using ctx.actions.run there should never be any shell escaping? In fact, using a param file might cause more trouble because of embedded newlines - they need to be escaped and unescaped as well.