RobotLocomotion / drake

Model-based design and verification for robotics.
https://drake.mit.edu
Other
3.36k stars 1.27k forks source link

Consider using virtualenv for managing upstream Python dependencies #8392

Open EricCousineau-TRI opened 6 years ago

EricCousineau-TRI commented 6 years ago

Per @rdeits's comment in #8352:

Also, problems with homebrew and system python conflicts could be completely eliminated by using python's virtual environments (which are even built-in on python3.5 and above). I can't recommend that highly enough.

My 2 cents:

virtualenv does sound like a good option.

I'm not sure if I'm a fan of our present re-inventing of pip in Bazel; understandably, it's for reproducibility and keeping most of the dependencies in Bazel, but seems like it may be brittle when the wheels are more complex or have more comprehensive dependencies (e.g. numpy). We can always teach Bazel to scan and incorporate the upstream Python dependencies for ~hermetic-ness~ determinism with remote caching.

Additionally, if the custom-dtype solution for #8116 ends up working, it would greatly smooth out having an optionally patched version of numpy for memory management.

That being said, virtualenv is additional tweaking on the environment that does have a bit of a funny (and semi-invasive) smell to it. However, if it reduces headaches (e.g. if we can figure out how to teach nlopt from Homebrew to not screw around with / conflict with pip packgaes), I'm all for it.

@jwnimmer-tri @jamiesnape Can I ask what your opinions are?

EDIT: Working with virtualenv will be a workflow similar to what we may want with ROS workspaces.

EDIT 2: For now marking priority as medium. Can update if needed.

\cc @calderpg-tri

jamiesnape commented 6 years ago

I am pretty much on record with hating reinventing apt and pip with Bazel, so it would make me happy.

EricCousineau-TRI commented 6 years ago

(For myself) A tiny breadcrumb for patching local PIP packages: https://stackoverflow.com/questions/5570666/patching-python-packages-installed-as-dependencies-with-pip

UPDATE: I believe I understand Robin's comments more now about virtualenv; it would make supporting both Python2 and Python3 much (read: a crap ton) easier for managing the switches. Rather than throwing a crap ton of Py3 duplicates in our Bazel code, we just change the environment (if needed).

jwnimmer-tri commented 6 years ago

On the subject of needing numpy fixes (not necessarily virtualenv or python2/3 stuff), I am not sure that I'm following, but my main concern is that in the same way libdrake.so needs to play well with the rest of the system libraries at runtime, because Drake is a library not a program -- pydrake needs to play well with the rest of the python environment at runtime. If we need to use non-default versions of things like numpy, how to we know that other code's use of numpy still works? I think we should go out of our way to be compatible with stable versions of common libraries, and not to require bleeding-edge versions.

jamiesnape commented 6 years ago

(For myself) A tiny breadcrumb for patching local PIP packages: https://stackoverflow.com/questions/5570666/patching-python-packages-installed-as-dependencies-with-pip

I am certainly not in favor of patching packages.

EricCousineau-TRI commented 6 years ago

Posted responses about package patching in #8116.

EricCousineau-TRI commented 6 years ago

I had seen some discussions on Slack about virtualenv, and ran into a blocking issue for #8452 that does require a two-line patch to numpy (still weighing cost/benefit, and I'm definitely not excited about the need for patching).

I'm not sure if we've considered it before, but there do seem to be some mechanisms to enable having a checked-in (or, I would assume, a generated) Python runtime, py_runtime and bazel build --python_top={target}: https://github.com/erain/bazel-python-example My assumption is that if we have a self-contained virtualenv (or use --system-site-packages), then we can create a repository to generate this environment, and point to the generated Python binary to use for the project. (Or permit an external virtualenv, and just use whatever binary and dependencies if people want to muck around with that.)

This could also pave the way towards containing support Python2+3 per #8352, most likely via a configuration switch.

jwnimmer-tri commented 6 years ago

I am not generally as worried about making our development and test environment work. I am most worried about how the installed pydrake interacts with the rest of the python universe. I think the clearest way to look at this question (and the wider python2/3 and forked-dependencies questions) is to first explain what the installed experience looks like, and then from there work out what the development environment looks like.

jamiesnape commented 6 years ago

Given that I am strongly against patching packages, I am not sure a virtualenv solves anything in the grand scheme of third-party Drake usage.

(My personal opinion is that whether to use a virtualenv should be the choice of the end-user and not something required.)

EricCousineau-TRI commented 4 years ago

Ran into another case of desiring this: Better ipywidgets support in #14082

FYI @RussTedrake

jwnimmer-tri commented 1 year ago

A motivating use case is on the rise -- Homebrew seems incapable of making a Python minor version transition without breaking all of non-first-party Python libraries as they go (e.g., recently scipy only works with 3.10, but the default python3 still points to 3.9).

It's probable that Drake on macOS should only use the Python interpreter from Homebrew, and for other Python ecosystem dependencies we use pip (with a lockfile).

jwnimmer-tri commented 4 months ago

Update: I have somewhat of a prototype of this now in TRI Anzu. After we gain some experience with that over the next week or two, I'll dump a snapshot of the code into this ticket, along with some more specific goals, and we can take it from there.

jwnimmer-tri commented 3 months ago

For kitware:

Here's the sample Anzu code to get started: anzu-snippets.zip.

Notes:

This should at least be able to supplant the source_distribution requirements management.

For binary_distribution/requirements.txt, bazel stuff doesn't help at all. We'll need to do something different there.

mwoehlke-kitware commented 3 months ago

I have no idea what an "anzu" is. I'm guessing all instances of that should be replaced with "drake"? If I do that, I can run the scripts, but it isn't clear how this is supposed to be actually consumed by other Drake bits that need the venv.

Also, if this is intended to replace what's currently pip installed, does that mean we'd drop the --[without-]test-only split?

jwnimmer-tri commented 3 months ago

I have no idea what an "anzu" is

It's the codename for TRI's internal git repository.

I'm guessing all instances of that should be replaced with "drake"?

Yes.

If I do that, I can run the scripts, but it isn't clear how this is supposed to be actually consumed by other Drake bits that need the venv.

That's this part:

Also needs changes to tools/workspace/default.bzl to call the new venv_repository rule. Use the venv as deps = ["@venv"] on a py_library or py_binary target.

The way to check is to find a test that needs something from pip in order to succeed, and then iterate to get that working.

Does that mean we'd drop the --[without-]test-only split?

Probably not. Users who are installing Drake from source (e.g. via CMake) should only by default need to pay the download cost of the non-testonly dependencies. The --with-test-only should (now) be opt-in, for use only by the Drake Developers (and Drake CI).

jwnimmer-tri commented 3 months ago

One update from f2f: the unique challenge of "default" vs "testonly" (versus what Anzu does) is to have sync know which choice should be used. The way we can do that is have install_prereqs.sh (really, probably venv/setup?) symlink the desired requirements.txt into or nearby the venv. Then sync can refer to the symlinked file, instead of hard-coding a specific requirements.txt.

mwoehlke-kitware commented 3 months ago

Okay, this one is puzzling:

FAIL: //bindings/pydrake/common:py/_test/serialize_test_bar.cpython-312-darwin.so_private_headers_cc_impl_cpplint (see .../execroot/drake/bazel-out/darwin_arm64-opt/testlogs/bindings/pydrake/common/py/_test/serialize_test_bar.cpython-312-darwin.so_private_headers_cc_impl_cpplint/test.log)
INFO: From Testing //bindings/pydrake/common:py/_test/serialize_test_bar.cpython-312-darwin.so_private_headers_cc_impl_cpplint:
==================== Test output for //bindings/pydrake/common:py/_test/serialize_test_bar.cpython-312-darwin.so_private_headers_cc_impl_cpplint:
Traceback (most recent call last):
  File ".../execroot/drake/bazel-out/darwin_arm64-opt/bin/bindings/pydrake/common/py/_test/serialize_test_bar.cpython-312-darwin.so_private_headers_cc_impl_cpplint.runfiles/drake/../styleguide/cpplint/cpplint.py", line 56, in <module>
    import six
ModuleNotFoundError: No module named 'six'

It seems like the correct patch (ignore the likely style error) is:

diff --git a/tools/workspace/styleguide/package.BUILD.bazel b/tools/workspace/styleguide/package.BUILD.bazel
index 03b5770120..8001a7291d 100644
--- a/tools/workspace/styleguide/package.BUILD.bazel
+++ b/tools/workspace/styleguide/package.BUILD.bazel
@@ -22,6 +22,7 @@ py_binary(
     python_version = "PY3",
     srcs_version = "PY3",
     visibility = [],
+    deps = ["@venv"],
 )

 alias(

...but I'm still getting the above error.

mwoehlke-kitware commented 3 months ago

Another issue... since we're only doing macOS for now, just adding deps=["@venv"] breaks Linux. I suppose there's a way to say "this is only a dep on macOS", but that's a bunch of extra work that we expect to eventually remove. Is there instead a way to make @venv a stub (for now) on not-macOS?

jwnimmer-tri commented 3 months ago

It would be simpler for me if you could post all questions using Reviewable, on the pull request, so we can discuss in situ.