Open ob opened 5 years ago
Not sure if this is Starlark or Execution. We'll have to sort it out.
Related to #3475
The $(location) command has a very subtle (and undocumented?) effect of adding single quotes when the input contains a space.
The issue is not specific to Starlark, because I can reproduce it with genrule:
genrule(
    name = "gen",
    srcs = ["a dir/a.cc"],
    outs = ["out"],
    cmd = "echo \"$(location a dir/a.cc)\" && touch $@",
)
The $(location) command has a very subtle (and undocumented?) effect of adding single quotes when the input contains a space.
Is that to work around toolchains that can't deal with spaces? IMHO, it seems wrong to quote at that level. I think quoting should be done as close to the exec as possible, no?
Maybe the solution is just to remove that single-quoting from $(location)?
Here's a bit of justification for my rationale: I found the bug through the ctx.expand_location API, which is used in Starlark, so you naturally wouldn't expect it to quote.
E.g. if I was dealing with a toolchain that didn't handle spaces, I could always quote the whole $(location ...) section, for instance -iquote '$(location path with spaces)', whereas removing quotes because we quote too much would be difficult and error-prone.
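The asymmetry argued above can be sketched in Python, with shlex.split standing in for POSIX shell word splitting. The helper expand_quoting_spaces is hypothetical: it only models the behavior reported in this issue (single quotes added when the path contains a space) and is not Bazel's implementation.

```python
import shlex

def expand_quoting_spaces(path):
    # Hypothetical model of the reported behavior, not Bazel's code:
    # wrap the path in single quotes only when it contains a space.
    return "'" + path + "'" if " " in path else path

def argv_seen_by_tool(cmd):
    # shlex.split approximates how a POSIX shell splits a command into words.
    return shlex.split(cmd)

path = "path with spaces/include"

# Raw expansion: the rule author adds the quotes, the tool gets the real path.
raw_argv = argv_seen_by_tool("cc -iquote '" + path + "'")

# Pre-quoted expansion placed inside the author's own double quotes:
# the single quotes become part of the argument.
prequoted_argv = argv_seen_by_tool(
    'cc -iquote "' + expand_quoting_spaces(path) + '"')

print(raw_argv)
print(prequoted_argv)
```

Under these assumptions, adding quotes around a raw expansion is a one-character change for the author, while undoing quotes that were added implicitly requires knowing exactly when and how they were added.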
I agree the sane fix is to remove the quotes.
Having said that, I'm not sure what to do about $(locations ...): the example in #3475 would still be a problem:
filegroup(
    name = "files",
    srcs = glob(["dir with spaces/**/*"]),
)

genrule(
    name = "example",
    srcs = [":files"],
    outs = ["foo"],
    cmd = "echo FILES: $(locations //:files) | tee $@",
)
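Why the multi-file case is harder can be illustrated with a small Python sketch. The file names are made up for illustration, and shlex.split is only an approximation of how a shell would split the expanded command; this is not how Bazel expands $(locations).

```python
import shlex

# Hypothetical files matched by the glob above.
paths = ["dir with spaces/a.txt", "dir with spaces/b.txt"]

# Joining raw paths with spaces loses the boundaries between files.
unquoted = " ".join(paths)
unquoted_words = shlex.split(unquoted)

# shlex.join (Python 3.8+) quotes each element, so splitting recovers the list.
quoted = shlex.join(paths)
quoted_words = shlex.split(quoted)

print(len(unquoted_words), unquoted_words)
print(len(quoted_words), quoted_words)
```

With a single path the author can add quotes around the whole expansion; with a space-separated list there is no place for the author to put them, so some per-element quoting (or a channel that avoids the shell) seems unavoidable.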
IMHO, $(location a dir/a.cc) should fail in the first place with: location takes 1 arg, but 2 were given.
Build file authors should instead use $(location 'a dir/a.cc').
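A minimal sketch of that stricter parsing, in Python. parse_location_arg is a hypothetical helper, not an existing Bazel function; it assumes the text inside $(location ...) would be split with shell-like rules.

```python
import shlex

def parse_location_arg(raw):
    # Hypothetical helper: split the text inside $(location ...) the way a
    # POSIX shell would, and insist on exactly one label or path.
    args = shlex.split(raw)
    if len(args) != 1:
        raise ValueError(
            "location takes 1 arg, but %d were given" % len(args))
    return args[0]

# Quoted form: accepted, and the quotes are consumed by the parser.
print(parse_location_arg("'a dir/a.cc'"))

# Unquoted form: rejected instead of being silently treated as one path.
try:
    parse_location_arg("a dir/a.cc")
except ValueError as err:
    print(err)
```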
Assuming it was caused by https://github.com/bazelbuild/bazel/issues/3475, cc @katre
It's been quite a while since I looked into this code, so I don't have any special insight to offer.
cc @laszlocsomor and @meteorcloudy, as it probably affects Windows users more
I think this functionality deserves a revamp, though I'm not sure what it would look like.
+1 to revamp. I've thought about this many times. My strawman proposal is:
- Deprecate genrule.cmd and migrate to genrule.args.
- ctx.actions.args API: args is not always what we execute. executable is what we execv. args[0] is what we pass to the executable in argv. Feels like busybox a bit.
- In genrule and ctx we stop constructing a command line and passing it to the command interpreter.
- We exec and pass unquoted strings to exec, since we are not going through a shell.

As a lower cost solution, as a rule author I never pass file paths on the command line. I spill them into a file through that capability in ctx.actions.args and have my tools read the list of paths from command files. Yes, this can be slightly more overhead, but I don't have to worry about shell escaping. That doesn't help the quick and easy genrule.
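The "no shell, no quoting" point in the proposal can be shown with a small Python sketch. It uses subprocess.run with an argument list as a stand-in for a direct exec; it says nothing about how Bazel spawns actions.

```python
import subprocess
import sys

path = "a dir/a.cc"

# Tiny child program that echoes its first argument back.
child = "import sys; print(sys.argv[1])"

# Passing a list (and no shell=True) hands each element to the child process
# as one argv entry, so the space in the path needs no quoting.
result = subprocess.run(
    [sys.executable, "-c", child, path],
    capture_output=True, text=True, check=True)
received = result.stdout.rstrip("\n")

print(received)
```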
I think this functionality deserves a revamp, though I'm not sure what it would look like.
I don't have concrete ideas for the API, but I think that any "v2" should:
- ctx.actions.args. This would help quite a bit with ongoing efforts to rewrite paths to be more cache-friendly (CC @gregestren, this is how we could get rid of the --javacopts hack).

As a lower cost solution, as a rule author I never pass file paths on the command line. I spill them into a file through that capability in ctx.actions.args and have my tools read the list of paths from command files. Yes, this can be slightly more overhead, but I don't have to worry about shell escaping.
Hmm, AFAIK when using ctx.actions.run there should never be any shell escaping?
In fact, using a param file might cause more trouble because of embedded newlines - they need to be escaped and unescaped as well.
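The newline concern can be made concrete with a Python sketch. The one-path-per-line format and the helper names below are assumptions for illustration only, not a description of any param file format Bazel writes.

```python
import os
import tempfile

def write_params(paths, params_path):
    # Naive format (an assumption): one path per line, with no escaping.
    with open(params_path, "w", newline="\n") as f:
        for p in paths:
            f.write(p + "\n")

def read_params(params_path):
    # The tool side: read the file back, one path per line.
    with open(params_path, newline="\n") as f:
        return [line.rstrip("\n") for line in f]

def round_trip(paths):
    with tempfile.TemporaryDirectory() as tmp:
        params_path = os.path.join(tmp, "args.params")
        write_params(paths, params_path)
        return read_params(params_path)

# Spaces survive without any quoting.
with_spaces = round_trip(["a dir/a.cc", "b dir/b.cc"])

# A path containing a newline is split into two entries.
with_newline = round_trip(["odd\nname.cc"])

print(with_spaces)
print(with_newline)
```

So a line-based file avoids shell quoting for spaces, but it moves the escaping problem to the line separator unless the format defines its own escaping.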
Tested with Bazel 0.18.0 and master as of d4e3ad8951.
Small test to reproduce at: https://github.com/ob/bazel-tests/tree/master/spaces
If I run:
I get this output (paths removed and output elided for clarity):
And the build succeeds. However, if I run:
I get this output:
Note that the _inputfile is quoted twice. The ctx.expand_location() function seems to have realized there are spaces, and it single-quoted the path. Later on, the action quoted it again.
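The effect of quoting twice can be reproduced in isolation with Python's shlex module. This is only an analogy: shlex.quote stands in for whatever quoting each layer applies, which this report does not pin down.

```python
import shlex

path = "a dir/a.cc"

# First layer quotes the path because it contains a space.
once = shlex.quote(path)

# Second layer quotes the already-quoted string again.
twice = shlex.quote(once)

# One round of shell parsing undoes exactly one round of quoting.
after_once = shlex.split(once)
after_twice = shlex.split(twice)

print(after_once)
print(after_twice)
```

After a single round of parsing, the twice-quoted value still carries literal single quotes, so the tool would be asked to open a file whose name starts and ends with a quote character.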