Closed mouadino closed 3 years ago
When building default to python2
I strongly disagree. Python 2 should not even be supported imho, but it definitely should NOT be made default less than 2 years before upstream's extended support ends. Unfortunately a lot of Bazel's internal stuff is still thinking that Python 2 is system default, this would be a good time to update it instead of keeping it on life support even longer. End the suffering.
Build on previous point, if py_runtime works in same say way go rules do, which use the new toolchains idea it will be awesome.
I agree, especially keeping in mind that there are several Python interpreters out there (cPython just being one of them - there's also pypy, jython, cython...) and that it would be nice to be able to easily choose from a variety of them for releases, tests and benchmarks.
Thanks for your feedback, just to clarify when I said:
When building default to python2
I meant as a workaround (unless you don't agree with the workaround if so I would love to hear better alternatives), b/c as you said it's the current state of some Bazel extensions, I would love if we don't have to support Python2 anymore and that what we did in my current company, but since we adopted Bazel we are back to supporting it, and defaulting to Python2 in build seems to do it, for now at least, things will be much worse if there are extensions that require Python3 and other Python2, especially since setting up Python version is a global thing, so in this move toward Python3 please make sure to both are supported until transitions end.
IMHO the most important thing that I am trying to emphasize is to not make python version a global thing, which the mistake that --python_top
did, Python is not like Golang in that regard so IMHO we need a way to specify Python version at target/code level, thoughts?
I think one of the questions is if a py_binary
should be a self contained executable (= even if you don't have Python installed, it should run) or just contain Python code (= using one of the installed Python interpreters).
I guess most people would expect the latter and at that point I would rather see PEP394 (https://www.python.org/dev/peps/pep-0394/) being used instead of some environment variable decorator magic. A #!/usr/bin/env python3
line should be enough.
There is the problem that Bazel itself also depends on Python being present on the host system for some of its scripts/rules and it mostly expects Python2 there (see https://github.com/bazelbuild/bazel/issues/1580). I don't know if that's going to be phased out, now that the skylark language runtime has been rewritten in go and if they might be convinced to rewrite the remaining few helpers in go. Alternatively Bazel would need some way to have a vendored Python binary that it uses to run its internal rule scripts instead of relying on Python being present on its host machine.
The Python rules here seem to suffer a bit from this assumption that Python (unlike e.g. the go
binary) anyways has to be present on the build system since it has to run Bazel and Python has to be installed for that. Making sure that the system Python is never used when building Python rules with Bazel is an important step in my opinion and to ensure this, it might also be beneficial to make sure that Bazel itself doesn't actually depend on any Python being installed on the system.
I guess most people would expect the latter and at that point I would rather see PEP394 (https://www.python.org/dev/peps/pep-0394/) being used instead of some environment variable decorator magic. A #!/usr/bin/env python3 line should be enough.
I agree about the self-contained Python IMHO self-contained Python package is nice to have but not necessary, however about using #!/usr/bin/env python3
shebang, overall I think it's the right approach but just to play the devil advocate, the few issues that I can see are:
PATH
environment variable, which may be good and bad, to just give you an example we found out that it's necessary to limit PATH
e.g with --action_env=PATH=/bin:/usr/bin:/usr/local/bin
in .bazelrc else you may be surprised at what ppl have in their PATH
, on the other hand, this just a workaround b/c someone may be using pyenv or such, this is just to say that sandboxing is better but expensive.#!/usr/bin/env python
b/c that can be either Python2 or Python3 depending on OS.#!/usr/bin/env python3
can refer to the first python3 found in PATH
but that can end up be the wrong Python3 since some features are specific to a version e.g. python3.6.#!/usr/bin/env python3
in your script b/c you need the cooperation of Bazel, in the end, this later control which Python to use check the python_stub_template.txt
below.From the good side, I can see it fixing issues when containerizing python code b/c currently if you use host machine python path in py_runtime
and you package the code in a container, then you should have the same python path inside your container b/c python_stub_template.txt template will end up pointing to py_runtime
's interpreter_path
variable.
Alternatively Bazel would need some way to have a vendored Python binary that it uses to run its internal rule scripts instead of relying on Python being present on its host machine.
Isn't this already feasible with py_runtime
already? at least if the python scripts aren't called from WORKSPACE, as in theory, you can do:
new_http_archive(
name = "python_version",
urls = ["https://www.python.org/ftp/python/3.6.3/Python-3.6.3.tgz"],
strip_prefix = "Python-3.6.3",
build_file_content = """
genrule(
name = "build_python",
srcs = glob(["**"]),
outs = ["python"],
cmd = "./external/python_version/configure && make && cp python $@",
visibility = ["//visibility:public"],
)""",
)
The Python rules here seem to suffer a bit from this assumption that Python (unlike e.g. the go binary) anyways has to be present on the build system since it has to run Bazel and Python has to be installed for that.
I can see that you're looking at this from the lenses of Bazel core developer, from a user perspective I can tell you that Python suffers the assumption that by just pointing code to an interpreter and that's it, sadly it's not that easy, besides it seems that Google use Python a bit different, hence the awkward support of Python in Bazel that probably was inherited from Blaze, in theory you should:
I am not a Bazel core developer or even employed by Google, I'm a normal user...
@MarkusTeufelberger Ahh sorry for the confusion, but then just to reiterate, our experience has shown that the workaround above works for us, most of our issues concerning python version were due to extensions and not Bazel internal, but it's possible that our use cases are different than yours, that's why we see different issues.
Yes, it's good to remember that Python, and software developing in general, happens in a wide variety of environments, and so people's requirements can legitimately be very different. One person's "must have feature" is another's "we would never use this", and that's fine.
Here's a discussion group that may be of interest: https://groups.google.com/forum/#!forum/bazel-sig-python
Here's a document proposing a py_toolchain
target for Bazel. I also talk about the evolution of Python usage inside Google, to give some background about why Bazel is the way it is. And I talk about some ways that Bazel can evolve to serve a wider community. Let me know what you think: https://docs.google.com/document/u/1/d/1dQjbbLEJqxUIJWmH5sIZAv-_emnKksDI-VCG1v86dWA/edit#
Is there already a solution to this feature request? I can't see how all these issues relate to the question or if they avoid the initially mentioned workaround. If it is fixed by now, can you please provide a small example code.
I managed to make that work properly for me by specifying python_version="PY3"
on py_binary
and py_test
targets + adding the following to my .bazelrc
:
build --incompatible_use_python_toolchains
- this tells Bazel to use python toolchains, which should auto-detect your platform python 2 and 3 runtimes and use the appropriate one based what you specify in your target. You should be able to take than further, and specify your runtimes explicitly if you use custom paths or want to download the interpreter from a remote repository (see: https://github.com/bazelbuild/bazel/blob/master/tools/python/toolchain.bzl).
In addition, I added build --noincompatible_py3_is_default
in order to avoid python 3 from being selected when no python_version
is explicitly specified.
I use Bazel 0.26.0 (I think python toolchains have been introduced in 0.25-ish) and the following rules_python
version:
git_repository(
name = "io_bazel_rules_python",
commit = "fdbb17a4118a1728d19e638a5291b4c4266ea5b8",
remote = "https://github.com/bazelbuild/rules_python.git",
)
Checking the "Goals" section of the original issue description:
Add an attribute to py_binary and py_test rules to specify which python interpreter to use ....
python_version=PY3
py_runtime
should be specified in WORKSPACE ...
The Python toolchain is not available at WORKSPACE evaluation time, but with the pip_install
repo rule you can specify a label in the python_interpreter_target
attribute that is the same label as what's passed to py_runtime's interpreter
(Label: optional) attribute. Eg.
pip_install(
name = "foo_pip",
python_interpreter_target = "@python3//:bin/python3",
requirements = "//tools/build/dependencies/python:requirements.txt",
)
Build on previous point, if py_runtime works in same say way go rules do, which use the new toolchains idea it will be awesome.
Bazel Toolchains work with the Python rules now.
shipping the interpreter with it will be very convenient...
With --build_python_zip
and an in-build interpreter target you can ship the interpreter within an executable .zip
.
Given the above, I think this issue can be closed, but please let me know if there's something still missing and we can create a new more specific issue.
Unfortunately it's not possible to use python_interpreter_target
with py_binary
:
no such attribute 'python_interpreter_target' in 'py_binary' rule
Toolchains seem overly complicated for our use-case (we have multiple py_binary
, where some py_binaries need a different python interpreter than other py_binaries). In fact I've spent 2 days on this now and still can't figure out at all how to make this work.
I would also be interested in this feature being available.
Hey all,
Problem Statement
It's quite confusing how a user can specify that a given rule need to use Python3 or that rule need Python2, where we have multiple attributes documented like
default_python_version
,srcs_version
and undocumented like:py_interpreter
(which can't be used but out of curiosity what is the prefix:
doing), however I am still confused regarding what does what, in the end, what I think we should have is a way to tell py_binary to execute using the interpreter X, I know aboutpy_runtime
which is supposed to be the future but there are multiplepy_binary
rules out there with unspecify python version which lead to issues of python3 and 2 compatibilities, beside there is an assumption with the introduced with--python_top
argument that all code need one interepreter instead of having local to each target.Workaround
Currently we use the following workaround, maybe useful for others that suffer same issue:
The Decorator Pattern
force_python3.py
:(This assume that we expose a
PYTHON3_PATH
environment variable with--action_env=PYTHON3_PATH=<path>
).py_binary
rule by pointingmain
attribute to the file above:FYI we have a workaround of our own to specify which python to use for pip requirements.
Goals
What I would like to see are the followings:
py_binary
andpy_test
rules to specify which python interpreter to use, it should be a label and point to apy_runtime
, if not specified revert to default py_runtime (for backward compatibility too).--python_top
argument b/c the assumption that you need one global Python version is bogus as explained above.py_runtime
should be specified in WORKSPACE so that rules that download dependencies can use it, e.g. https://github.com/bazelbuild/rules_python#importing-pip-dependencies.py_runtime
works in same say way go rules do, which use the new toolchains idea it will be awesome.py_runtime
to an host binary path).