tweag / rules_haskell

Haskell rules for Bazel.
https://haskell.build
Apache License 2.0
266 stars 80 forks source link

rules_haskell requires python2 on Ubuntu when not using Nix #1173

Closed aherrmann closed 2 years ago

aherrmann commented 4 years ago

Describe the bug rules_haskell in GHC bindist mode (not Nix shell) requires the python package on Ubuntu which provides Python 2.

To Reproduce Try to run rules_haskell test suite on Ubuntu image without Python 2 installed. See https://buildkite.com/tweag-1/rules-haskell/builds/667#a3798e34-7be3-433f-bfa3-0aca1b0a303f/36-2757.

<no location info>: error:
--
  | Warning: Couldn't figure out C compiler information!
  | Make sure you're using GNU gcc, or clang
  | /usr/bin/env: ‘python’: No such file or directory
  | `cc_wrapper-python' failed in phase `Assembler'. (Exit code: 127)

Expected behavior Python 3 alone (python3 on Ubuntu) should be sufficient.

Environment

Additional context The cc_wrapper script that raises the error is a py_binary target. The source file specifies #!/usr/bin/env python3, however, it seems that Bazel generates a Python script requiring /usr/bin/env python anyway.

cc @flokli

smelc commented 4 years ago

I have what looks like a similar issue using Ubuntu18.04, with python2 installed:

> python --version
Python 2.7.17
> python3 --version
Python 3.6.9
> bazel test //...
...
ERROR: /home/churlin/.cache/bazel/_bazel_churlin/a2ae29489331b4803c5f3e551b1f4565/external/stackage-zlib/BUILD.bazel:14:1: HaskellCabalLibrary @stackage-zlib//:zlib failed (Exit 127) cabal_wrapper failed: error executing command bazel-out/host/bin/haskell/cabal_wrapper lib:zlib zlib-0.6.2 true external/stackage-zlib/zlib-0.6.2/Setup.hs external/stackage-zlib/zlib-0.6.2 ... (remaining 10 argument(s) skipped)

Use --sandbox_debug to see verbose messages from the sandbox
/usr/bin/env: ‘python’: No such file or directory

Version of the rules: 8f7d75961a1f459b4edb5faeb6e75dd251a2147f, bazel version: 2.0.0

smelc commented 4 years ago

Here are more details (version of the rules: 9a52a45410e0aa0e7005a572b40bf1320fd65220).

One of the nix environment built when running tests without being in a nix-shell seems to miss python:

> bazel test --sandbox_debug //...
ERROR: /home/churlin/.cache/bazel/_bazel_churlin/093fbb2dac96d2e44160180633c6fc6f/external/rules_haskell_worker_dependencies/BUILD.bazel:207:1: HaskellCabalLibrary @rules_haskell_worker_dependencies//:microlens failed (Exit 127) linux-sandbox failed: error executing command 
  (cd /home/churlin/.cache/bazel/_bazel_churlin/093fbb2dac96d2e44160180633c6fc6f/sandbox/linux-sandbox/2/execroot/rules_haskell && \
  exec env - \
    LANG=C.UTF-8 \
    LOCALE_ARCHIVE=external/glibc_locales/lib/locale/locale-archive \
    PATH=/nix/store/22lc47f02jaf2qc2p2h20mj5pp54lr01-posix-toolchain/bin \
    TMPDIR=/tmp \
  /home/churlin/.cache/bazel/_bazel_churlin/install/d687aca87a7669724cc958527f2423da/linux-sandbox -t 15 -w /home/churlin/.cache/bazel/_bazel_churlin/093fbb2dac96d2e44160180633c6fc6f/sandbox/linux-sandbox/2/execroot/rules_haskell -w /tmp -w /dev/shm -D -- bazel-out/host/bin/haskell/cabal_wrapper lib:microlens microlens-0.4.10 true external/rules_haskell_worker_dependencies/microlens-0.4.10/Setup.hs external/rules_haskell_worker_dependencies/microlens-0.4.10 bazel-out/k8-fastbuild/bin/external/rules_haskell_worker_dependencies/microlens-0.4.10/_install/microlens-0.4.10.conf.d '--flags=' '--ghc-option=-w' '--ghc-option=-optF=-w' --)
...
... lots of lines
...
src/main/tools/linux-sandbox-pid1.cc:265: remount rw: /dev/shm
src/main/tools/process-tools.cc:118: sigaction(32, &sa, nullptr) failed
src/main/tools/process-tools.cc:118: sigaction(33, &sa, nullptr) failed
/usr/bin/env: ‘python’: No such file or directory
src/main/tools/linux-sandbox-pid1.cc:437: waitpid returned 2
src/main/tools/linux-sandbox-pid1.cc:457: child exited with code 127
src/main/tools/linux-sandbox.cc:204: child exited normally with exitcode 127
INFO: Elapsed time: 17.212s, Critical Path: 0.16s
INFO: 0 processes.
FAILED: Build did NOT complete successfully

In this context:

> ls -1 /nix/store/22lc47f02jaf2qc2p2h20mj5pp54lr01-posix-toolchain/bin | grep python

yields nothing.

aherrmann commented 4 years ago

In this context:

> ls -1 /nix/store/22lc47f02jaf2qc2p2h20mj5pp54lr01-posix-toolchain/bin | grep python

yields nothing.

Yes, that is expected. The POSIX toolchain does not include Python.

One of the nix environment built when running tests without being in a nix-shell seems to miss python:

I'm not sure I understand what that means. Can you give instructions how to reproduce this setup?


/usr/bin/env: ‘python’: No such file or directory

That's the part that's concerning. Both in bindist mode and nixpkgs mode we're using a Python toolchain (py_runtime_pair) that points to a Python interpreter under an absolute path. So, we should never get to a point where /usr/bin/env tries to find python in PATH.

The cabal_wrapper is defined using py_binary so it should use a Python toolchain. Could you rerun with --toolchain_resolution_debug to see which Python toolchain Bazel chooses here?

smelc commented 4 years ago

Yes, that is expected. The POSIX toolchain does not include Python.

Yes I didn't intend to highlight the posix-toolchain part, but instead wanted to point out that this path is the only element of PATH and that python is indeed not there.

Can you give instructions how to reproduce this setup?

Here it is (running Ubuntu 18.04):

> git clone https://github.com/tweag/rules_haskell
> cd rules_haskell
> bazel test //... # bazel version is 2.0.0

Could you rerun with --toolchain_resolution_debug to see which Python toolchain Bazel chooses here?

Here it is: https://gist.github.com/smelc/90e399481f4638f7255393cd9b59f94d

smelc commented 4 years ago

So, we should never get to a point where /usr/bin/env tries to find python in PATH.

For the record, I'm observing that bazel-out/host/bin/haskell/cabal_wrapper indeed has /usr/bin/env python in the shebang line. This is the case because this file is generated from this template: https://github.com/bazelbuild/bazel/blob/master/src/main/java/com/google/devtools/build/lib/bazel/rules/python/python_stub_template.txt. Am I right thinking the issue is that this template's shebang line is not parameterized (and hence cannot honor our python toolchain instance)?

The python3 path is correctly found by _configure_python3_toolchain_impl: it finds /usr/bin/python3.

aherrmann commented 4 years ago

Ah, indeed the python_stub_template.txt uses the shebang /usr/bin/env python. That stub itself then calls the toolchain provided Python interpreter, but it requires python in PATH to bootstrap. See https://github.com/bazelbuild/bazel/issues/8685 and https://github.com/bazelbuild/bazel/issues/8446.

This specific issue is probably caused by us overwriting $PATH. If we instead only prepend to $PATH then it would at least work if a python is installed in the standard system paths. This could be done by setting, say, $EXTRA_PATH in haskell/cabal.bzl and then setting PATH="$EXTRA_PATH:$PATH" in the Cabal wrapper.

aherrmann commented 4 years ago

Could you rerun with --toolchain_resolution_debug to see which Python toolchain Bazel chooses here?

Here it is: https://gist.github.com/smelc/90e399481f4638f7255393cd9b59f94d

Thank you, yes it's picking the right Python toolchain.

Regarding the failure in the bottom of that log. That's because Bazel ends up using the Nix provided C toolchain despite using non-Nix toolchains for everything else. That's a consequence of https://github.com/tweag/rules_nixpkgs/issues/88. I.e. for the C toolchain we're not making use of Bazel's toolchain selection mechanism. Instead, if nix-build is in PATH then we're always using a Nix provided C toolchain. A simple workaround is to drop nix-build from PATH, e.g.

export PATH=`echo ${PATH} | awk -v RS=: -v ORS=: '/.nix-profile/ {next} {print}'`

The correct solution is to resolve https://github.com/tweag/rules_nixpkgs/issues/88.

smelc commented 4 years ago

Ultimately my issue is that I was trying to have too many tests pass. Some tests in a no Nix environment (no NixOS, Ubuntu without nix) are known to fail and this comment aims to record that. To document what we mean by this environment, here is how to reproduce it:

Make sure .bazelrc.local does NOT contain (you may have it if you usually build within nix-shell):

build --host_platform=@rules_haskell//haskell/platforms:linux_x86_64_nixpkgs
run --host_platform=@rules_haskell//haskell/platforms:linux_x86_64_nixpkgs

Then remove nix-build from the environment (if you have it), so that rules_haskell does NOT try to use it (see https://github.com/tweag/rules_haskell/issues/704 for a discussion and https://github.com/tweag/rules_haskell/issues/704#issuecomment-515437347 for the trick's source)

export PATH=`echo ${PATH} | awk -v RS=: -v ORS=: '/.nix-profile/ {next} {print}'`

In this state, you should observe that bazel test //... fails. The reason is twofold:

To circumvent that:

Now bazel test //tests/... should pass.