Open rubensf opened 3 years ago
I think that solution 2 is the only logical one. Solution 1 would only make sense if you only call into programs inside the input root. It’s perfectly valid to also call into programs installed on the system: C:\Program Files\foo. Using forward slashes there would be really weird.
Furthermore, this problem also applies to successive arguments. Those most likely need backslashes as well. It would be inconsistent if argv[0]’s conventions differ from the rest.
+1 to Ed - we could standardize arg0 in some way perhaps, but we cannot safely standardize anything about the remainder - we don't actually know if an argument is a path (to be transformed) or a string that happens to contain slashes (which servers should not touch in any way). And different tools will have different support for paths, especially on Windows. (UNC paths, etc). And at that point, it seems best to just say that the client is responsible for providing OS-specific and tool-understandable paths/arguments for all arguments, including arg0.
One wrinkle to that might be if cmd.exe adds any specific constraints that aren't "Windows standard" - my preference for documentation would be something along the lines of "args must be formatted correctly for the OS and for the tool they'll be passed to", but if there are ways to comply with that but also not be amenable to passing to cmd.exe directly, I guess that's on our server to wrap/transform as necessary anyways to bridge the gap.
On Tue, Mar 9, 2021 at 1:22 AM Ed Schouten notifications@github.com wrote:
I think that solution 2 is the only logical one. Solution 1 would only make sense if you only call into programs inside the input root. It’s perfectly valid to also call into programs installed on the system: C:\Program Files\foo. Using forward slashes there would be really weird.
Furthermore, this problem also applies to successive arguments. Those most likely need backslashes as well. It would be inconsistent if argv[0]’s conventions differ from the rest.
— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/bazelbuild/remote-apis/issues/187#issuecomment-793449017, or unsubscribe https://github.com/notifications/unsubscribe-auth/AABREW4HJZE2UWHBKXFCF23TCW5A5ANCNFSM4Y22JH4A .
+1 to option 2. The client already has some knowledge (or makes assumptions) about what the remote environment looks like--for example, that executable /foo/bar/bas will exist in the remote environment--and this feels similar. I also agree that the problem of universally processing argv[n] is not solvable by the server, and that consistency is valuable here.
On Tue, Mar 9, 2021 at 11:44 AM Eric Burnett notifications@github.com wrote:
+1 to Ed - we could standardize arg0 in some way perhaps, but we cannot safely standardize anything about the remainder - we don't actually know if an argument is a path (to be transformed) or a string that happens to contain slashes (which servers should not touch in any way). And different tools will have different support for paths, especially on Windows. (UNC paths, etc). And at that point, it seems best to just say that the client is responsible for providing OS-specific and tool-understandable paths/arguments for all arguments, including arg0.
One wrinkle to that might be if cmd.exe adds any specific constraints that aren't "Windows standard" - my preference for documentation would be something along the lines of "args must be formatted correctly for the OS and for the tool they'll be passed to", but if there are ways to comply with that but also not be amenable to passing to cmd.exe directly, I guess that's on our server to wrap/transform as necessary anyways to bridge the gap.
On Tue, Mar 9, 2021 at 1:22 AM Ed Schouten notifications@github.com wrote:
I think that solution 2 is the only logical one. Solution 1 would only make sense if you only call into programs inside the input root. It’s perfectly valid to also call into programs installed on the system: C:\Program Files\foo. Using forward slashes there would be really weird.
Furthermore, this problem also applies to successive arguments. Those most likely need backslashes as well. It would be inconsistent if argv[0]’s conventions differ from the rest.
— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub < https://github.com/bazelbuild/remote-apis/issues/187#issuecomment-793449017 , or unsubscribe < https://github.com/notifications/unsubscribe-auth/AABREW4HJZE2UWHBKXFCF23TCW5A5ANCNFSM4Y22JH4A
.
— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/bazelbuild/remote-apis/issues/187#issuecomment-794143759, or unsubscribe https://github.com/notifications/unsubscribe-auth/ADMU23ZH7I5K62FQVUHZ52TTCZF7JANCNFSM4Y22JH4A .
I intended option 1 to apply exclusively to argv[0], since it should always (?) be a non-path separated function or command in the $PATH, or be the canonical path to an executable.
Maybe it also makes sense to define some reasonable entry point on the server side? The user or the server could have weird setups (eg a user specified docker container with a cygwin bash entrypoint) that might be incompatible with the OS default (eg cygwin). It'd be unreasonable for bazel as a client tool to make those assumptions?
The client also specifies the docker container, and that should allow it to make assumptions about the contents thereof.
I'm running into the same problem described in https://github.com/bazelbuild/bazel/issues/11636, which led me here. Is the consensus here that the right fix is to change the rules_go package to change the forward slashes to backslashes on Windows?
I think that the responsibilities are necessarily split between Bazel and the RBE system:
argv[0]
is formatted correctly (i.e. uses backslashes or is quoted on Windows).I will look into getting 1. fixed in Bazel and would like to raise 2. in one of the next meetings.
I'm running into the same problem described in bazelbuild/bazel#11636, which led me here. Is the consensus here that the right fix is to change the rules_go package to change the forward slashes to backslashes on Windows?
Yes. The consensus we reached above was that value in Command.Arguments[0] should use path-native separators for the target platform--so on Windows, Command.Arguments[0] should use "\". The current version of the spec is clear about this for Command.Arguments[0]. I think we have some work to clean up the spec for the remaining values of Command.Arguments.
- The RBE system receives an array of strings as arguments and the spec demands that the first argument is (only) the path of the executable, but on Windows the entire command line that is ultimately invoked is represented as a single string. This necessarily requires some amount of conversion and possibly even escaping on the server side, equivalent to escapeArgvRest in Bazel.
I will look into getting 1. fixed in Bazel and would like to raise 2. in one of the next meetings.
I think ideally we could just specify that the server treat Command.Arguments as opaque strings and not manipulate them at all. The server doesn't really know what the platform of the executing container will be, so it can't do platform-specific escaping. (In a practical sense, the server can make an educated guess based on the platform of the worker an Action is assigned to, but it's definitely possible to run a Linux container on a Windows host, and other combinations may be possible in the future.)
Is the client able to provide pre-escaped strings in Command.Arguments, rather than relying on the server to do the escaping?
Is the client able to provide pre-escaped strings in Command.Arguments, rather than relying on the server to do the escaping?
In theory yes, but to be compliant with the spec, it would need to allow escaped paths in Command.Arguments[0]
. Currently only paths are allowed by the spec and "foo/bar baz"
is not a valid path on Windows (and interpreted differently from foo/bar baz
on Unix).
The spec would also have to specify how it turns Command.Arguments[0]
into the argument passed to CreateProcess
on Windows (even if that's just " ".join(Command.Arguments)
).
In my Windows experiments, .exe
files can be started with both forward slash and backslash but for .bat
files only works with backslash.
Is the client able to provide pre-escaped strings in Command.Arguments, rather than relying on the server to do the escaping?
This relates to https://github.com/bazelbuild/remote-apis/issues/198 where an EscapedCommandLineAction
could be used. For simple command line like we use today, I think escaping on client side would complicate the spec more than necessary about escaping method, which in the end might not match the server side system (e.g. in case of emulated runs).
Motivation: https://github.com/bazelbuild/bazel/issues/11636#issuecomment-651154740
Throughout the API, both for inputs (in
SymlinkNodes
) and in outputs, we specify that separators should be single forward slashes (/
). However, we're not explicit on the Command arguments, which currently reads:The issue here comes from Windows
cmd.exe
not being able to naively recognize relative paths using/
separators as executables, eg:While there are many ways for the server to go around that restrictions (eg converting
/
->\
, wrapping the command in quotes, etc), we should it explicit whose responsibility it is to deal with this:/
or\
are valid, though I find little benefit to this approach, as it's a more ambiguous than 1 but forces the server to apply processing anyway (and maybe there exists a bizarre scenario where both are valid and yield different results?)I'm personally prefer 1. as the server may be already doing other processing on the command and figuring out the platform specific separator shouldn't be challenging, but I don't have a strong preference here as the approach is explicitly stated.