bazelbuild / bazel

a fast, scalable, multi-language and extensible build system
https://bazel.build
Apache License 2.0

undocumented usage of perl for cc_test #4691

Closed ahippler closed 5 years ago

ahippler commented 6 years ago

Description of the problem / feature request:

cc_test uses an inline Perl script for failed tests. https://github.com/bazelbuild/bazel/blob/eb067ea88749a5635cc8ee8954cde2b767f1eb61/tools/test/test-setup.sh#L153

Feature requests: what underlying problem are you trying to solve with this feature?

The usage of perl is not documented. Windows does not have Perl installed by default.

Bugs: what's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

==================== Test output for //Test/Unit:Unit (shard 1 of 8):

[...]

external/bazel_tools/tools/test/test-setup.sh: line 152: perl: command not found
================================================================================

What operating system are you running Bazel on?

Windows 10

What's the output of bazel info release?

0.10.1

The Perl script replaces invalid XML characters and invalid sequences in CDATA. To get rid of Perl, bash or Python could be used instead.
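To illustrate the CDATA part in Python: the sequence `]]>` cannot appear inside a CDATA section because it terminates the section. A common workaround is to close the section after `]]` and reopen it before `>`. The sketch below shows that technique only; it is not a transcription of the Perl one-liner in test-setup.sh, and the function names are made up for illustration.

```python
import xml.etree.ElementTree as ET


def escape_cdata(text: str) -> str:
    """Make text safe to embed in CDATA (hypothetical helper)."""
    # "]]>" would end the section early, so end the section after "]]"
    # and start a new one before ">".
    return text.replace("]]>", "]]]]><![CDATA[>")


def wrap_in_cdata(text: str) -> str:
    """Wrap arbitrary text in one or more adjacent CDATA sections."""
    return "<![CDATA[" + escape_cdata(text) + "]]>"


if __name__ == "__main__":
    doc = "<system-out>" + wrap_in_cdata("log ]]> more log") + "</system-out>"
    # A conforming parser joins the adjacent sections back together.
    print(ET.fromstring(doc).text)
```

This handles only the terminator sequence; characters that XML forbids outright still need separate treatment.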

aehlig commented 6 years ago

Categorizing as bug, as pulling in an undocumented run-time dependency can cause breakage for users not otherwise using perl. The main question is whether we really need perl (then we should document it as a run-time dependency of bazel) or whether an appropriate setup can be achieved with more standard tools.

aehlig commented 6 years ago

It's the actual test rules that are problematic.

ulfjack commented 6 years ago

Note that this is the fallback code path, so if you make sure that your test framework generates a test.xml, that should work around the issue.
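For readers who want to try this workaround: Bazel passes the expected report location to the test in the `XML_OUTPUT_FILE` environment variable. Below is a minimal Python sketch of a test runner writing its own JUnit-style report there. The helper names and the shape of `results` are assumptions for illustration, and the element set is deliberately minimal.

```python
import os
import xml.etree.ElementTree as ET


def report_path():
    # Bazel sets XML_OUTPUT_FILE for test actions. The "test.xml" fallback
    # is only an assumption for running this sketch outside Bazel.
    return os.environ.get("XML_OUTPUT_FILE", "test.xml")


def write_test_xml(path, suite_name, results):
    """Write a minimal JUnit-style report (hypothetical helper).

    results maps a test case name to None (passed) or a failure message.
    """
    failures = sum(1 for message in results.values() if message is not None)
    suite = ET.Element(
        "testsuite",
        name=suite_name,
        tests=str(len(results)),
        failures=str(failures),
    )
    for name, message in results.items():
        case = ET.SubElement(suite, "testcase", name=name)
        if message is not None:
            ET.SubElement(case, "failure", message=message)
    root = ET.Element("testsuites")
    root.append(suite)
    ET.ElementTree(root).write(path, encoding="utf-8", xml_declaration=True)
```

A runner would call `write_test_xml(report_path(), ...)` before exiting. Because the file then exists, the fallback generation should not be needed.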

Any suggestions on what to do here? We do need to escape the characters when generating the test.xml.

benjaminp commented 6 years ago

Perhaps rewrite test-setup.sh in a language like C++ or Python where text processing is easy to do portably? That might also be helpful for solving subtle test-setup.sh issues like #4608. (I can't imagine bash is very fun to run on Windows either.)

Relatedly, I would like the ability to customize the test setup executable. One can usually use --run_under, but then the user loses the ability to usefully specify their own --run_under.

ulfjack commented 6 years ago

We discussed that, but it means that we'd unconditionally need a C++ compiler in order to run tests. We could bundle a pre-compiled binary into Bazel, but what platform would it be compiled for? For cross-compilation, you need a binary for the target platform, not the host platform. We could bundle a binary and also ship the source, but then you still need a C++ compiler in order to do cross-compilation.

We could use Python, but then we require that everyone has python installed, even if they 'only' do web or C++ development.

We also discussed moving the fallback codepath into Bazel (i.e., have the test.xml generation in Bazel itself). However, that would also cause problems with remote execution, where we would now force test.xml generation onto the local machine. Except, of course, if we also require that remote execution provides X for some value of X.

It's not impossible to solve these problems, but there's no free lunch.

For now, I'd prefer we keep doing it in shell, but maybe there's a more standard tool than perl that we can use to do the escaping? awk? sed?

benjaminp commented 6 years ago

FWIW, Bazel doesn't work at all without a C++ compiler on the system:

$ bazel build
ERROR: in target '//external:cc_toolchain': no such package '@local_config_cc//': Traceback (most recent call last):
    File "/home/benjamin/.cache/bazel/_bazel_root/887904812217cca9bc2b9adb875daf42/external/bazel_tools/tools/cpp/cc_configure.bzl", line 42
        configure_unix_toolchain(repository_ctx, cpu_value, overriden...)
    File "/home/benjamin/.cache/bazel/_bazel_root/887904812217cca9bc2b9adb875daf42/external/bazel_tools/tools/cpp/unix_cc_configure.bzl", line 416, in configure_unix_toolchain
        find_cc(repository_ctx, overriden_tools)
    File "/home/benjamin/.cache/bazel/_bazel_root/887904812217cca9bc2b9adb875daf42/external/bazel_tools/tools/cpp/unix_cc_configure.bzl", line 407, in find_cc
        fail(("Cannot find gcc or CC%s, eithe...))
Cannot find gcc or CC, either correct your path or set the CC environment variable

ulfjack commented 6 years ago

That's true, and we should fix that independently of this problem. I'd rather not add another dependency on having a C++ compiler available.

laszlocsomor commented 6 years ago

That's true, and we should fix that independently of this problem.

+1

I'd rather not add another dependency on having a C++ compiler available.

I'd give another +1 if I weren't out of them already.

ahippler commented 6 years ago

We could use Python, but then we require that everyone has python installed, even if they 'only' do web or C++ development.

python is already required as per documentation: https://docs.bazel.build/versions/master/install-ubuntu.html#1-install-required-packages https://docs.bazel.build/versions/master/windows.html#software-requirements

laszlocsomor commented 6 years ago

@ahippler : Like with C++, that's true and we should fix that independently of this problem. In fact it'd be best if Bazel would only require toolchains that it needs for the build, i.e. if you don't build Python rules then Bazel shouldn't require a Python installation.

Only exception (if you could call it an exception) should be Java: the JDK is always available because we bundle one with Bazel, because Bazel itself needs one in order to run.

laszlocsomor commented 6 years ago

Thanks to @aehlig we now know what this command is supposed to do:

It means, read the string as a sequence of Unicode characters, and replace every non-empty sequence of characters in the given ranges by a question mark symbol. (That's the first line. The ranges come from the esoteric definition of which Unicode characters are and are not allowed in XML documents.) The negation (^) makes sense. The allowed Unicode characters are: #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF] And every non-empty sequence of other characters is replaced by a question mark symbol. The reason why perl was used was that the whitelist only applies after you have interpreted the sequence of octets as a sequence of (UTF-8 encoded) Unicode characters, and perl could do that on the fly (while the more traditional Unix tools work on sequences of octets).
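The behaviour described above can be sketched in Python as follows. This is an illustration of the technique, not the code Bazel uses: the function name is hypothetical, and mapping malformed UTF-8 to U+FFFD via `errors="replace"` is an assumption, since the thread does not say how the Perl command treats undecodable octets.

```python
import re

# Negated character class over the allowed code points listed above;
# "+" collapses each run of disallowed characters into a single match.
_INVALID_XML_RUN = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]+"
)


def sanitize_for_xml(raw: bytes) -> str:
    """Replace each run of characters not allowed in XML with '?'."""
    # Decode first so the ranges apply to code points, not to octets.
    # Assumption: undecodable bytes become U+FFFD, which XML permits.
    text = raw.decode("utf-8", errors="replace")
    return _INVALID_XML_RUN.sub("?", text)
```

Decoding before matching is the step that octet-oriented tools lack: the ranges only make sense once multi-byte sequences have been turned into single characters.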

ulfjack commented 6 years ago

(Note that any solution for test-setup needs to work with remote execution - this makes it very difficult to use pre-compiled binaries, because you don't know which platform the test action will actually run on.)

laszlocsomor commented 6 years ago

@ulfjack : True. Bazel could include a precompiled embedded binary for the host platform, and the remote execution platform author would have to provide one. Bazel would select the active one with toolchain rules. WDYT?

laszlocsomor commented 6 years ago

Let's not forget that the tests currently depend on Bash, so a remote execution worker would have to have Bash installed anyway.

ulfjack commented 6 years ago

That would make it much more difficult to change test-setup since all such changes would have to be rolled out to all remote execution systems.

laszlocsomor commented 6 years ago

Let's not forget that the tests currently depend on Bash, so a remote execution worker would have to have Bash installed anyway.

...meaning that requiring a test-setup binary would not make things worse. And we could provide a reference implementation in a GitHub repo.

ulfjack commented 6 years ago

I think one option would be to fork test-setup, and have different implementations for Linux/Mac and Windows.

laszlocsomor commented 6 years ago

Forking test-setup would also introduce the synchronization difficulty you alluded to.

laszlocsomor commented 6 years ago

(Note that any solution for test-setup needs to work with remote execution - this makes it very difficult to use pre-compiled binaries, because you don't know which platform the test action will actually run on.)

The options I see:

The second option seems to be the best tradeoff. WDYT?

ulfjack commented 6 years ago

I agree.

Note that I've been working on splitting test-setup into two separate steps. Right now it runs the test and then generates a test.xml file if there isn't one. The way this is done triggers a code path on Linux and MacOS that has an inherent race condition. @agoulti proposed that we split up the two parts: run the test first and then run a separate action to generate the test.xml file if the test didn't generate one. That'll fix the race condition and potentially make the test-setup script a bit simpler.
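A rough Python sketch of the two-step split described above, to make the idea concrete. It is not the actual test-setup design: the function names, file names, and report layout are assumptions, and a real implementation would also have to replace characters that are not allowed in XML before writing the log into the report.

```python
import os
import subprocess
import sys
import xml.etree.ElementTree as ET


def run_test(command, log_path):
    """Step 1: run the test, sending stdout and stderr to log_path."""
    with open(log_path, "wb") as log:
        return subprocess.run(command, stdout=log, stderr=subprocess.STDOUT).returncode


def generate_xml_if_missing(xml_path, log_path, name, exit_code):
    """Step 2: runs only after the test has exited, so the log is complete.

    Returns True if a report was written, False if the test already made one.
    """
    if os.path.exists(xml_path):
        return False
    with open(log_path, "r", encoding="utf-8", errors="replace") as log:
        output = log.read()
    failed = exit_code != 0
    suite = ET.Element("testsuite", name=name, tests="1", failures="1" if failed else "0")
    case = ET.SubElement(suite, "testcase", name=name)
    if failed:
        ET.SubElement(case, "failure", message="exited with code %d" % exit_code)
    ET.SubElement(suite, "system-out").text = output
    ET.ElementTree(suite).write(xml_path, encoding="utf-8", xml_declaration=True)
    return True


if __name__ == "__main__" and len(sys.argv) > 1:
    code = run_test(sys.argv[1:], "test.log")
    generate_xml_if_missing("test.xml", "test.log", "example", code)
    sys.exit(code)
```

Because the second step starts only once the first has finished, nothing reads the log while the test is still writing to it.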

ulfjack commented 6 years ago

There's some background in #4608.

benjaminp commented 6 years ago

The options I see: ...

There's also the option I've raised before of rewriting test-setup.sh in C++, requiring a target C++ toolchain to run tests, and compiling test-setup on the fly for the target platform. For simple use cases, a precompiled binary test-setup could be included with Bazel in the embedded tools. This has the advantage of letting test-setup use native APIs (managing processes properly in bash is ironically difficult) and not duplicating too much work across platforms. I understand the desire to avoid requiring a C++ toolchain, but it seems like the need for that ends up creeping in for any non-trivial project anyway.

laszlocsomor commented 6 years ago

@benjaminp : I forgot to add that option, thanks for mentioning it! I agree it's attractive for the reasons you listed, but the hard requirement on a C compiler conflicts with https://github.com/bazelbuild/bazel/issues/5133, so using scripts still seems to be the most advantageous (or least disadvantageous) approach. WDYT?

ulfjack commented 6 years ago

#5133 is a different, unrelated issue. But the problem I see with requiring a C compiler is that the user experience of installing a C compiler on Windows and MacOS is very bad.

laszlocsomor commented 6 years ago

Thanks for pointing these out!

@benjaminp: what do you think in light of what @ulfjack and I wrote?

benjaminp commented 6 years ago

Is having a compiler toolchain harder than running a Bazel remote execution agent on a Windows or OSX machine? I think you should only have to deal with this installation problem if you're trying to use remote execution across incompatible platforms, which I suspect is a more advanced case.

At any rate, I don't want to get in the way—I would just like to see less bash in the world.

laszlocsomor commented 6 years ago

Is having a compiler toolchain harder than running a Bazel remote execution agent on a Windows or OSX machine?

Considering that the agent can be precompiled and embedded in Bazel (assuming you mean an agent for the host machine) and as such requires no action from the user, it is harder to install a compiler.

I agree in wanting less Bash, but in this case using scripts seems the least bad of the options.

It seems we converged on an agreement to reimplement test-setup.sh in a Windows-friendly script language such as Powershell and use that on Windows, and keep using the Bash script on Unixes.

benjaminp commented 6 years ago

Is having a compiler toolchain harder than running a Bazel remote execution agent on a Windows or OSX machine?

Considering that the agent can be precompiled and embedded in Bazel (assuming you mean an agent for the host machine) and as such requires no action from the user, it is harder to install a compiler.

I meant an agent for the machine the action is actually running on. ("host" is a bit ambiguous in this situation.) My claim is installing compilers on worker machines would only be a minor annoyance compared to actually running a remote execution cluster.

ulfjack commented 6 years ago

I think you're proposing that we ship bazel with a binary for the same platform that bazel runs on, as well as the source code. When remote execution is enabled, we'd not use the binary, but instead compile it from source using a remotely installed compiler.

I am not sure how many people are actually going to run a remote execution cluster. It's more likely that this will be offered as a service, with a bring-your-own-Docker-container story. In that case, we'd still require that all such Docker containers contain, say, a C++ compiler (right now we require that they contain bash for both Linux and Windows containers; there's no Docker for macOS, I believe).

laszlocsomor commented 6 years ago

@ulfjack , I don't follow. Are you suggesting that it's likely that a compiler will always be available with remote execution? Shipping both a binary and its source code for remote compilation sounds like an interesting direction.

ulfjack commented 6 years ago

I'm not suggesting that it will automatically be available, but we could require that all remote execution images contain a certain minimum set of tools.

laszlocsomor commented 6 years ago

Right, that's what I meant. Does that mean that moving to C++ instead of forking test-setup and reimplementing it in powershell sounds good to you?

ulfjack commented 6 years ago

My preference is still shell + powershell, because it 'only' requires that a shell is installed in the execution images, which is almost always the case, including existing images (except for the distroless images that don't even come with a shell). I'm also concerned about divergence between Bazel and Blaze, and it's not obvious how difficult compiling the test wrapper binary at runtime will be - it's more difficult than you'd think.

ulfjack commented 6 years ago

(Blaze doesn't currently use the same test-setup.sh, but I'd like to converge the two.)

laszlocsomor commented 6 years ago

My preference is still shell + powershell, because it 'only' requires that a shell is installed in the execution images, which is almost always the case, including existing images (except for the distroless images that don't even come with a shell).

I see. It's my understanding that distroless images don't have a compiler either and would require a binary. Is that correct?

I'm also concerned about divergence between Bazel and Blaze,

I think introducing a Powershell version of test-setup will add to that problem. Replacing the existing shell scripts with C++ code would not worsen the situation. WDYT?

and it's not obvious how difficult compiling the test wrapper binary at runtime will be - it's more difficult than you'd think.

Why? I reckon test-setup would be a single cc_binary in this model, compiled for the execution platform, as if it were in genrule.tools. Bazel can already do this well. Am I missing something?

ulfjack commented 6 years ago

I see. It's my understanding that distroless images don't have a compiler either and would require a binary. Is that correct?

I don't think we need to worry about the distroless images. They aren't suited for building, and I don't foresee anyone asking us to support those.

I think introducing a Powershell version of test-setup will add to that problem. Replacing the existing shell scripts with C++ code would not worsen the situation. WDYT?

I don't think adding a Powershell version will make the divergence between Blaze and Bazel worse, per se. Making Bazel use a C++ binary while Blaze uses a shell script seems to increase divergence, not decrease it.

Let me also point out that shell is easier to extend at runtime - we may need to support company- or project-specific test configuration parts in order to be able to converge.

Why? I reckon test-setup would be a single cc_binary in this model, compiled for the execution platform, as if it were in genrule.tools. Bazel can already do this well. Am I missing something?

If you use cc_binary, you also have to have a matching cc_toolchain rule, which means the user has to actively set up their workspace file correspondingly, which means that they have to be aware of it. Making it so that the cc_toolchain is only required if you do remote execution is also hard. The naive approach would go against our desire not to require a C++ toolchain if you don't work with C++.

ulfjack commented 6 years ago

I think all the possible options are problematic in some way, and I prefer to stick with the status quo absent stronger reasons for changing. Certainly, we can't use bash on Windows, but I'm more inclined to treat that as a one-off than to try to unify all platforms and push the complexity onto our users. Even if it's not a lot of complexity, it's a lot of users. A little bit of complexity times a lot of users = not ideal.

laszlocsomor commented 6 years ago

I don't think we need to worry about the distroless images. They aren't suited for building, and I don't foresee anyone asking us to support those.

Thanks. That resolves this subthread.

I don't think adding a Powershell version will make the divergence between Blaze and Bazel worse, per se. Making Bazel use a C++ binary while Blaze uses a shell script seems to increase divergence, not decrease it.

I considered rewriting to also cover the Blaze-specific version of test-setup.sh. But in light of your next argument about needing a cc_toolchain and the complexity implied, we can resolve this subthread too.

If you use cc_binary, you also have to have a matching cc_toolchain rule (...) Making it so that the cc_toolchain is only required if you do remote execution is also hard.

Good point.

I think all the possible options are problematic in some way, and I prefer to stick with the status quo absent stronger reasons for changing.

I'm out of arguments. Then let's try the Powershell approach for now. SGTY?

ulfjack commented 6 years ago

SGTM

laszlocsomor commented 6 years ago

Related bug: #5508.

jmillikin-stripe commented 6 years ago

Instead of Powershell, would it be practical to rewrite the wrapper in Java? As noted earlier that adds no additional dependencies, and it solves the problem for remote execution as well.

laszlocsomor commented 5 years ago

@jmillikin-stripe : Which problem of remote execution do you mean, and how does using Java solve it? If we used Java, Bazel would need a JDK on the remote machine.

jmillikin-stripe commented 5 years ago

The remote execution protocol doesn't copy binaries Bazel executes unless they're registered with Bazel as a tool. Since perl is being executed with a simple $PATH lookup, Bazel won't copy it to the worker and the command would fail with perl: not found.
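As a small illustration of why a bare `$PATH` lookup is fragile, the Python sketch below checks which tool names resolve on the machine where it runs. The helper name is hypothetical. The result depends entirely on the local environment, which is exactly the property that makes such a dependency invisible to Bazel.

```python
import shutil


def find_unresolved_tools(tool_names):
    """Return the tools that a plain $PATH lookup cannot find here."""
    # shutil.which performs the same kind of lookup a shell would do for
    # a bare command name, so the answer differs from machine to machine.
    return [name for name in tool_names if shutil.which(name) is None]


if __name__ == "__main__":
    print(find_unresolved_tools(["perl", "bash"]))
```

On a worker without Perl the list would contain `perl`, matching the `command not found` error from the original report.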

If the wrapper was written as a java_binary or cc_binary and built on demand, Bazel would properly copy it to the remote worker. I assumed a Java binary would be easier to integrate since you could distribute the compiled .jar in the Bazel tarball (there are some other helpers in tools/ distributed in the same way).

Considering that the Bazel buildfarm worker itself is written in Java, having a JRE installed on the remote machine doesn't seem like a big hurdle.

laszlocsomor commented 5 years ago

True, perl is an undeclared dependency. There's a bug for that (https://github.com/bazelbuild/bazel/issues/5265) but I'm currently not working on it. Since nobody else updated the bug, I guess nobody is working on it at the moment.

Indeed a *_binary rule is advantageous because Bazel recompiles it from source, but it also requires a (cross)compiler for the execution platform.

As for C++, @ulfjack (who is one of the most senior Bazel engineers and a technical leader) earlier argued against depending on the C++ compiler for test execution.

As for Java the jar file is not enough: we also need a launcher that sets up the environment and the classpath for the jar file. The launcher is a shell script on Unixes and a C++ binary on Windows.

This is why we concluded that a PowerShell script, though not ideal, seems to be the best option comparatively.

WDYT?

laszlocsomor commented 5 years ago

I earlier wrote that https://github.com/bazelbuild/bazel/issues/5508 was a related bug, but I now think it subsumes this bug. Maybe we should merge the two. @jmillikin-stripe , @ahippler , what do you think?

jmillikin-stripe commented 5 years ago

As for Java the jar file is not enough: we also need a launcher that sets up the environment and the classpath for the jar file. The launcher is a shell script on Unixes and a C++ binary on Windows.

Maybe I'm misunderstanding how modern Java works, but can't you just compile a standalone .jar file that depends on only the standard library? The one-line Perl expression doesn't seem like something that would need a full CLASSPATH wrapper script to deal with. I know several of the java_binary targets I've used for helpers don't have any sort of special wrappers around them, just java -jar path/to/whatever.jar.

Sorry if this is going off into the weeds too much, it's just that I'm looking at the Perl command and then looking at the ~180 MB of Bazel installer and thinking there's got to be a way to solve this.

I earlier wrote that #5508 was a related bug, but I now think it subsumes this bug. Maybe we should merge the two.

#5508 seems like a bigger challenge to solve, but could be solved at the same time if the entire test wrapper was rewritten. Up to you as to whether requiring Perl is a big enough blocker to justify fixing separately.

laszlocsomor commented 5 years ago

I agree, replacing that single Perl call shouldn't be hard with Java. The whole time I was thinking about replacing the entire test-setup.sh with a Java program.

laszlocsomor commented 5 years ago

I agree, replacing that single Perl call shouldn't be hard with Java. The whole time I was thinking about replacing the entire test-setup.sh with a Java program.

@nlopezgi @ola-rozenfeld Could you confirm that a JRE is always available on remote execution machines, and if so, under which path would test-setup.sh find it?

ulfjack commented 5 years ago

There's no guarantee that a JRE is available on all remote machines.

laszlocsomor commented 5 years ago

@ulfjack : for now there is, no?