EricCousineau-TRI opened this issue 6 years ago (status: Open)
I am pretty much on record with hating reinventing `apt` and `pip` with Bazel, so it would make me happy.
(For myself) A tiny breadcrumb for patching local PIP packages: https://stackoverflow.com/questions/5570666/patching-python-packages-installed-as-dependencies-with-pip
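The linked SO recipe boils down to locating where pip installed a package and patching it in place. A hedged sketch (the package and patch-file names below are placeholders, not anything Drake has adopted):

```shell
# Find the site-packages root of the current interpreter, then (in the real
# recipe) apply a local patch to an installed package's files in place.
SITE="$(python3 -c 'import sysconfig; print(sysconfig.get_paths()["purelib"])')"
echo "site-packages root: $SITE"
# patch -p1 -d "$SITE/some_package" < local-fix.patch   # illustration only
```

The obvious downside, discussed below, is that such patches are invisible to pip and get clobbered on upgrade.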
UPDATE: I believe I understand Robin's comments more now about `virtualenv`; it would make supporting both Python 2 and Python 3 much (read: a crap ton) easier for managing the switches. Rather than throwing a crap ton of Py3 duplicates in our Bazel code, we just change the environment (if needed).
On the subject of needing `numpy` fixes (not necessarily virtualenv or python2/3 stuff), I am not sure that I'm following, but my main concern is that in the same way `libdrake.so` needs to play well with the rest of the system libraries at runtime, because Drake is a library and not a program, `pydrake` needs to play well with the rest of the Python environment at runtime. If we need to use non-default versions of things like `numpy`, how do we know that other code's use of `numpy` still works? I think we should go out of our way to be compatible with stable versions of common libraries, and not to require bleeding-edge versions.
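The "compatible with stable versions" stance above amounts to accepting any library version inside a documented range rather than pinning a bleeding-edge release. A hypothetical sketch (the bounds are made-up examples, not Drake's actual policy):

```python
def is_supported(version: str, lo=(1, 7), hi=(3, 0)) -> bool:
    """Return True if `version` (e.g. "1.24.4") falls in the range [lo, hi)."""
    # Compare only the (major, minor) components of the version string.
    major_minor = tuple(int(part) for part in version.split(".")[:2])
    return lo <= major_minor < hi

print(is_supported("1.24.4"))  # True: a common stable release
print(is_supported("0.9.0"))   # False: too old
```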
I am certainly not in favor of patching packages.
Posted responses about package patching in #8116.
I had seen some discussions on Slack about `virtualenv`, and ran into a blocking issue for #8452 that does require a two-line patch to `numpy` (still weighing cost/benefit, and I'm definitely not excited about the need for patching).
I'm not sure if we've considered it before, but there do seem to be some mechanisms to enable having a checked-in (or, I would assume, a generated) Python runtime, via `py_runtime` and `bazel build --python_top={target}`:
https://github.com/erain/bazel-python-example
My assumption is that if we have a self-contained `virtualenv` (or use `--system-site-packages`), then we can create a repository to generate this environment, and point to the generated Python binary to use for the project. (Or permit an external `virtualenv`, and just use whatever binary and dependencies, if people want to muck around with that.)
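The generation step being described could be as simple as the following sketch (the directory name is illustrative only, not an agreed-on layout):

```shell
# Create a self-contained environment that can also see system packages,
# mirroring the --system-site-packages option mentioned above.
python3 -m venv --system-site-packages build/venv
# This is the interpreter a py_runtime / --python_top setup would point at:
build/venv/bin/python -c 'import sys; print(sys.prefix)'
```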
This could also pave the way towards contained support for Python 2+3 per #8352, most likely via a configuration switch.
I am not generally as worried about making our development and test environment work. I am most worried about how the installed pydrake interacts with the rest of the python universe. I think the clearest way to look at this question (and the wider python2/3 and forked-dependencies questions) is to first explain what the installed experience looks like, and then from there work out what the development environment looks like.
Given that I am strongly against patching packages, I am not sure a `virtualenv` solves anything in the grand scheme of third-party Drake usage. (My personal opinion is that whether to use a `virtualenv` should be the choice of the end user and not something required.)
Ran into another case of desiring this: better `ipywidgets` support in #14082.
FYI @RussTedrake
A motivating use case is on the rise: Homebrew seems incapable of making a Python minor-version transition without breaking all non-first-party Python libraries as it goes (e.g., recently `scipy` only works with 3.10, but the default `python3` still points to 3.9).
It's probable that Drake on macOS should only use the Python interpreter from Homebrew, and for other Python-ecosystem dependencies we should use `pip` (with a lockfile).
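A pip lockfile flow might look like the following sketch (pip-tools is shown as one common option, not a settled Drake decision; the package names are placeholders). Loose constraints go in `requirements.in`; `pip-compile` pins every transitive dependency into `requirements.txt`:

```shell
# Author loose, human-edited constraints:
printf 'numpy\nscipy\n' > requirements.in
# pip-compile requirements.in        # writes a fully pinned requirements.txt
# pip install -r requirements.txt    # reproducible install from the lock
cat requirements.in
```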
Update: I have somewhat of a prototype of this now in TRI Anzu. After we gain some experience with that over the next week or two, I'll dump a snapshot of the code into this ticket, along with some more specific goals, and we can take it from there.
For kitware:

- `venv` on macOS, for Drake source builds.
- `--break-system-packages` thing during install_prereqs.
- `venv` will (eventually) need to be part of the provisioned images.

Here's the sample Anzu code to get started: anzu-snippets.zip.
Notes:

- `requirements.{in,txt}` should live in the `setup/...` tree somewhere, not at the project root.
- Also needs changes to `tools/workspace/default.bzl` to call the new `venv_repository` rule.
- Use the venv as `deps = ["@venv"]` on a `py_library` or `py_binary` target.

This should at least be able to supplant the `source_distribution` requirements management.
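A consumer target wired up as described might look like this sketch (the target and file names are hypothetical; only the `@venv` repository label comes from the notes above):

```starlark
# BUILD.bazel (sketch): a Python target that gets its pip dependencies
# from the generated venv repository.
py_library(
    name = "my_lib",          # hypothetical target name
    srcs = ["my_lib.py"],
    deps = ["@venv"],         # output of the new venv_repository rule
)
```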
For `binary_distribution/requirements.txt`, bazel stuff doesn't help at all. We'll need to do something different there.
I have no idea what an "anzu" is. I'm guessing all instances of that should be replaced with "drake"? If I do that, I can run the scripts, but it isn't clear how this is supposed to be actually consumed by other Drake bits that need the venv.
Also, if this is intended to replace what's currently `pip install`ed, does that mean we'd drop the `--[without-]test-only` split?
> I have no idea what an "anzu" is
It's the codename for TRI's internal git repository.
> I'm guessing all instances of that should be replaced with "drake"?
Yes.
> If I do that, I can run the scripts, but it isn't clear how this is supposed to be actually consumed by other Drake bits that need the venv.
That's this part:
> Also needs changes to `tools/workspace/default.bzl` to call the new `venv_repository` rule. Use the venv as `deps = ["@venv"]` on a `py_library` or `py_binary` target.
The way to check is to find a test that needs something from pip in order to succeed, and then iterate to get that working.
> Does that mean we'd drop the `--[without-]test-only` split?
Probably not. Users who are installing Drake from source (e.g., via CMake) should by default only need to pay the download cost of the non-testonly dependencies. The `--with-test-only` flag should (now) be opt-in, for use only by the Drake developers (and Drake CI).
One update from f2f: the unique challenge of "default" vs "testonly" (versus what Anzu does) is to have `sync` know which choice should be used. The way we can do that is to have `install_prereqs.sh` (really, probably `venv/setup`?) symlink the desired `requirements.txt` into or nearby the venv. Then `sync` can refer to the symlinked file, instead of hard-coding a specific `requirements.txt`.
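The symlink handoff described above could look like this sketch (all file names here are illustrative, not the actual Drake layout):

```shell
# install_prereqs picks one requirements file and records the choice as a
# symlink next to the venv...
mkdir -p venv
printf 'numpy\n' > requirements-default.txt
ln -sf ../requirements-default.txt venv/requirements.txt
# ...so later tooling (`sync`) reads the symlink instead of hard-coding
# which requirements file was selected:
readlink venv/requirements.txt
```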
Okay, this one is puzzling:
```
FAIL: //bindings/pydrake/common:py/_test/serialize_test_bar.cpython-312-darwin.so_private_headers_cc_impl_cpplint (see .../execroot/drake/bazel-out/darwin_arm64-opt/testlogs/bindings/pydrake/common/py/_test/serialize_test_bar.cpython-312-darwin.so_private_headers_cc_impl_cpplint/test.log)
INFO: From Testing //bindings/pydrake/common:py/_test/serialize_test_bar.cpython-312-darwin.so_private_headers_cc_impl_cpplint:
==================== Test output for //bindings/pydrake/common:py/_test/serialize_test_bar.cpython-312-darwin.so_private_headers_cc_impl_cpplint:
Traceback (most recent call last):
  File ".../execroot/drake/bazel-out/darwin_arm64-opt/bin/bindings/pydrake/common/py/_test/serialize_test_bar.cpython-312-darwin.so_private_headers_cc_impl_cpplint.runfiles/drake/../styleguide/cpplint/cpplint.py", line 56, in <module>
    import six
ModuleNotFoundError: No module named 'six'
```
It seems like the correct patch (ignore the likely style error) is:
```diff
diff --git a/tools/workspace/styleguide/package.BUILD.bazel b/tools/workspace/styleguide/package.BUILD.bazel
index 03b5770120..8001a7291d 100644
--- a/tools/workspace/styleguide/package.BUILD.bazel
+++ b/tools/workspace/styleguide/package.BUILD.bazel
@@ -22,6 +22,7 @@ py_binary(
     python_version = "PY3",
     srcs_version = "PY3",
     visibility = [],
+    deps = ["@venv"],
 )

 alias(
```
...but I'm still getting the above error.
Another issue... since we're only doing macOS for now, just adding `deps = ["@venv"]` breaks Linux. I suppose there's a way to say "this is only a dep on macOS", but that's a bunch of extra work that we expect to eventually remove. Is there instead a way to make `@venv` a stub (for now) on not-macOS?
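For reference, the "only a dep on macOS" variant would be a `select()` like the sketch below; the stub-repository alternative avoids sprinkling this on every consuming target, which is presumably why it would be less work:

```starlark
# Sketch: make the pip dependency macOS-only, so Linux builds get an
# empty deps list instead of a reference to a missing repository.
deps = select({
    "@platforms//os:macos": ["@venv"],
    "//conditions:default": [],
})
```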
It would be simpler for me if you could post all questions using Reviewable, on the pull request, so we can discuss in situ.
Per @rdeits's comment in #8352:

> My 2 cents:
>
> - `virtualenv` does sound like a good option.
> - I'm not sure if I'm a fan of our present re-inventing of `pip` in Bazel; understandably, it's for reproducibility and keeping most of the dependencies in Bazel, but it seems like it may be brittle when the wheels are more complex or have more comprehensive dependencies (e.g. `numpy`). We can always teach Bazel to scan and incorporate the upstream Python dependencies for ~~hermetic-ness~~ determinism with remote caching.
> - Additionally, if the custom-dtype solution for #8116 ends up working, it would greatly smooth out having an optionally patched version of `numpy` for memory management.
> - That being said, `virtualenv` is additional tweaking on the environment that does have a bit of a funny (and semi-invasive) smell to it. However, if it reduces headaches (e.g. if we can figure out how to teach `nlopt` from Homebrew to not screw around with / conflict with `pip` packages), I'm all for it.

@jwnimmer-tri @jamiesnape Can I ask what your opinions are?

EDIT: Working with `virtualenv` will be a workflow similar to what we may want with ROS workspaces.

EDIT 2: For now marking priority as `medium`. Can update if needed.

\cc @calderpg-tri