RobotLocomotion / drake

Model-based design and verification for robotics.
https://drake.mit.edu
Other
3.35k stars 1.27k forks source link

Replace tkinter usage with a modern, lightweight alternative #14702

Closed jamiesnape closed 2 years ago

jamiesnape commented 3 years ago

Tcl/Tk is ancient, has a huge set of transitive dependencies (especially on Linux), is full of bugs (depending on version), and for various reasons is not available on Google Colab now (blame both Google and Ubuntu). There are plenty of other ways to render a couple of sliders and, honestly, me having to dig into Tcl/Tk when things crash that far down the stack on Python upgrades is not good value for money or a good use of anyone's time.

FYI @RussTedrake @jwnimmer-tri @EricCousineau-TRI.

EricCousineau-TRI commented 3 years ago

~I think maybe we'd be~ I am OK with just ripping it out for now, and not replacing it just yet?

(or does pygame depend on it?)

RussTedrake commented 3 years ago

I'm definitely fine removing it, would like to replace the existing uses either with pygame, or with jupyter notebook ipywidgets. If it's an immediate pain point, then I'm also open to simple removing them right now and having an issue to restore them via the other mechanisms.

jamiesnape commented 3 years ago

It is only a pain point if you import all, I think, at the moment. I forget exactly, I need to look at my fork. The real pain is probably six months out or whenever the next Python upgrade may be.

jamiesnape commented 3 years ago

(or does pygame depend on it?)

No, it uses SDL and you mostly write the GUI yourself with the event and rendering functionality. pygame is kind of overkill for this purpose, but we already eat the dependency. https://pygame-gui.readthedocs.io might save some boilerplate code, though I have no idea if it is any good (or if it is really that necessary).

EricCousineau-TRI commented 3 years ago

Gotcha! My question is less about "can pygame replace it?" and more about "will pygame break without it?".

Sounds like nah, won't break, so we're good to remove on that count?

jamiesnape commented 3 years ago

More the opposite, tk is more likely to break pygame if you tried to mix them for whatever reason.

jwnimmer-tri commented 3 years ago

If I understand correctly, this pain point led to the #14832 emergency hotfix, and will be no fun when it comes to #1183. I'll add this to our near-term worklist.

jamiesnape commented 3 years ago

Yes to both.

jwnimmer-tri commented 3 years ago

@RussTedrake we might start work on this soon. If you have any newer thoughts, please share.

RussTedrake commented 3 years ago

My comments above still capture my feelings -- i'm definitely fine to remove it and absolutely don't want it to hold up the pip work.

It happens that we're only one PR away from having in-browser sliders working in meshcat c++, which could be another replacement if we were ok dropping drake-visualizer support on those few tests.

jwnimmer-tri commented 3 years ago

Perhaps the smallest possible change to unwedge us here would be to exclude system_sliders and simple_ui from pydrake.all? That would leave then intact in the install, and so any users that did have a working Tk could use them, but users who do not have a working Tk (e.g., pip / manylinux) could still import pydrake.all and be up and running, just without those native widgets.

jwnimmer-tri commented 2 years ago

Here's a plan.

Short-term wheel cleanup / work-arounds:

Tasks:


The Tk-using class pydrake.manipulation.JointSliders is probably fully subsumed by pydrake.multibody.meshcat.JointSliders now.

Tasks:


The Tk-using class pydrake.manipulation.simple_ui.SchunkWsgButtons is probably best re-implemented using meshcat now.

Tasks:


The Tk-using class pydrake.systems.system_sliders.SystemSliders is not used anywhere in Drake.

Tasks:



Final pass.

jwnimmer-tri commented 2 years ago

Above, I say "immediately" which is a bit nebulous. I just mean "when we begin work on this ticket, whenever that might be".

The idea is that we can add the deprecation annotation right away, without doing all of the porting work first as a prerequisite of adding the annotation -- we can suppress the warning at the (few) current call sites, until they get ported.

BetsyMcPhail commented 2 years ago

@jwnimmer-tri can you point us towards the deprecation process documentation?

jwnimmer-tri commented 2 years ago

See #10605 -- there isn't an overall playbook yet.

For pydrake the general principle is to log a Drake-specific deprecation warning using our pydrake.common.deprecation module, see https://drake.mit.edu/pydrake/pydrake.common.deprecation.html for the module doc and https://drake.mit.edu/doxygen_cxx/group__python__bindings.html (under "Deprecation") for some tips. Something like @deprecated_callable on either the class or the constructor should do it.

When you've added it correctly, the unit test will start failing with an error (our tests opt-in this warning to be an exception), so then you add with catch_drake_warnings to nerf it. The test fixup has many examples already in-tree that you can cargo-cult.

jwnimmer-tri commented 2 years ago

To highlight what I communicated (I hope) last Thursday -- this is the most important issue on the board. To the extent we have people working on Drake hours this week, I need to see progress on this issue. If there's also progress on other topics, that's fine. But everyone who can usefully contribute here should be on deck, instead of working on other CI or packaging things. If the issue goes beyond your experience, then kick it back to me for TRI to implement, instead.

\CC @BetsyMcPhail @svenevs @mwoehlke-kitware

BetsyMcPhail commented 2 years ago

@svenevs is going to take the lead on this issue.

jwnimmer-tri commented 2 years ago

I think I missed a checkbox in the original grepping for tasks. We still need to port this code as well:

examples/manipulation_station/end_effector_teleop_dualshock4.py:25:from pydrake.systems.meshcat_visualizer import (
examples/manipulation_station/end_effector_teleop_sliders.py:31:from pydrake.systems.meshcat_visualizer import (
examples/manipulation_station/end_effector_teleop_mouse.py:17:from pydrake.systems.meshcat_visualizer 

Hopefully they are quick. It's probable that just deleting all of those options is best, and instead telling people to launch meldis to watch the simulation.

jwnimmer-tri commented 2 years ago

See also #17304 for a next step here. Apparently pyplot uses a Tk backend by default? In any case, the docs build requires Tk to succeed. That's probably a canary that none of our plotting works without Tk installed:

Python library code:

bindings/pydrake/systems/drawing.py:7:import matplotlib.pyplot as plt
bindings/pydrake/systems/planar_scenegraph_visualizer.py:7:import matplotlib.pyplot as plt
bindings/pydrake/systems/pyplot_visualizer.py:3:import matplotlib.pyplot as plt

Python example code:

bindings/pydrake/math_example.py:3:import matplotlib.pyplot as plt
bindings/pydrake/systems/drawing_graphviz_example.py:1:import matplotlib.pyplot as plt
bindings/pydrake/systems/test/pyplot_visualizer_test.py:4:import matplotlib.pyplot as plt
bindings/pydrake/visualization/test/plotting_test.py:3:import matplotlib.pyplot as plt
common/proto/call_python_client.py:187:    import matplotlib.pyplot as plt
examples/van_der_pol/plot_limit_cycle.py:1:import matplotlib.pyplot as plt

Notebook:

bindings/pydrake/multibody/examples/door_hinge_inspector.ipynb:28:    "import matplotlib.pyplot as plt\n",
tutorials/debug_mathematical_program.ipynb:36:    "import matplotlib.pyplot as plt\n",
tutorials/dynamical_systems.ipynb:193:    "import matplotlib.pyplot as plt\n",
tutorials/dynamical_systems.ipynb:280:    "import matplotlib.pyplot as plt\n",
tutorials/mathematical_program.ipynb:104:    "import matplotlib.pyplot as plt\n",
tutorials/nonlinear_program.ipynb:305:    "import matplotlib.pyplot as plt\n",
tutorials/rendering_multibody_plant.ipynb:42:    "import matplotlib.pyplot as plt\n",
tutorials/sum_of_squares_optimization.ipynb:89:    "import matplotlib.pyplot as plt\n",

We're not going to deprecate plotting, so we'll need to explore whether / how we can keep plotting working without Tk.

Hopefully the notebooks already use a non-Tk backend by default, and we can scratch them off immediately?

svenevs commented 2 years ago

Status update:

Linux the story is a bit more complicated.

Hopefully the notebooks already use a non-Tk backend by default, and we can scratch them off immediately?

I have not run all of them, but it does appear that notebooks are OK and my understanding is the agg backend is selected in the absence of any %matplotlib command. What would be a good idea is to make sure that all of them %matplotlib notebook.

Where import matplotlib.pyplot as plt for non-notebook code is concerned, we have a couple of options but I need to look more closely at our current direct vs transitive dependencies. For example, you can

import matplotlib
matplotlib.use("qt5agg")
import matplotlib.pyplot as plt

but AFAICT the Qt5 dependencies are really only for drake-visualizer and therefore slated for removal off in the future -- depending on python3-tk would be better than Qt long term in a footprint perspective. Our available focal backend choices are: ['GTK3Agg', 'GTK3Cairo', 'MacOSX', 'nbAgg', 'Qt4Agg', 'Qt4Cairo', 'Qt5Agg', 'Qt5Cairo', 'TkAgg', 'TkCairo', 'WebAgg', 'WX', 'WXAgg', 'WXCairo', 'agg', 'cairo', 'pdf', 'pgf', 'ps', 'svg', 'template']

Where either an *Agg or *Cairo backend is needed for code that opens a plotting window via plt.show(). This excludes agg and cairo. Next step is to diagnose in a docker image without using the setup scripts (tarball install), python3-matplotlib recommends python3-tk (the defacto default is tk).

Pending diagnosis, we would want to promote one of the possible transitive dependencies to be an actual dependency. That said, between gtk, Qt, Tk, and WX, Tk is probably the least intrusive overall choice so it may just be that we should keep python3-tk.

jwnimmer-tri commented 2 years ago

I'd be game for removing the Tk dependency on the macOS setup quickly, while we continue to look at Ubuntu. Disk space is generally more constraints on macs, so that will deliver value to our users even if the Ubuntu discussions take longer.

jwnimmer-tri commented 2 years ago

For Ubuntu ...

For common/proto/call_python_client.py and examples/van_der_pol/plot_limit_cycle.py, I think I'd be satisfied to advise the user to "BYO plotting back-end" and not have any install_prereqs for it. If we like, we can work to give then a nice suggestion for recovery in a nice error message ("apt install ...").

The remaining uses all seem to be in bindings/pydrake. For those, I'd be curious to learn which uses have trouble on Ubuntu with Tk missing. Some of them look like they might succeed without a backend. There might be some narrow path where we purge a little bit of code, move Tk to be testonly on Ubuntu, and have a couple of fail-fast guards in order to drop Tk has a require, runtime dependency.

svenevs commented 2 years ago

The reason we cannot detect this in tests is because of this

https://github.com/RobotLocomotion/drake/blob/a3978ac6fc7971030854132382e576d91fd08699/tools/bazel.rc#L48

Which means that even on a system without python3-tk installed all tests will succeed. I think it would be good to make a singular "plotting capable test" in drake such that import matplotlib.pyplot as plt fails if no "proper" plotting backend is available, but I'm struggling to figure out how to create such a test such that (on the current system) I would get

>>> import matplotlib.pyplot as plt
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3/dist-packages/matplotlib/pyplot.py", line 2349, in <module>
    switch_backend(rcParams["backend"])
  File "/usr/lib/python3/dist-packages/matplotlib/pyplot.py", line 221, in switch_backend
    backend_mod = importlib.import_module(backend_name)
  File "/usr/lib/python3.8/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "/usr/lib/python3/dist-packages/matplotlib/backends/backend_tkagg.py", line 1, in <module>
    from . import _backend_tk
  File "/usr/lib/python3/dist-packages/matplotlib/backends/_backend_tk.py", line 6, in <module>
    import tkinter as tk
ModuleNotFoundError: No module named 'tkinter'

I need a way to create a test that does not have the MPLBACKEND set. Is that possible?

mwoehlke-kitware commented 2 years ago

I need a way to create a test that does not have the MPLBACKEND set. Is that possible?

At worst, it seems like you should be able to munge os.environ before importing matplotlib?

svenevs commented 2 years ago

=> #17389

Good call @mwoehlke-kitware, should have posted a quick update saying that's possible / where things are going. I've now learned that ubuntu ships a matplotlib rc file in /etc/matplotlibrc that explicitly requests backend: TkAgg. As a result this is why apt-get install python3-matplotlib recommends python3-tk.

I think I'd be satisfied to advise the user to "BYO plotting back-end" and not have any install_prereqs for it. If we like, we can work to give then a nice suggestion for recovery in a nice error message ("apt install ...").

See the related test added in #17389, while we can test for it ... if we go this route then that test cannot remain. Mostly just using it to confirm macOS is good to go in one final run (works locally). If we remove python3-tk from the list of ubuntu installation requirements (but keep python3-matplotlib), users will not have a usable plotting backend when running drake's setup scripts.

I can go through drake's matplotlib.pyplot imports and wrap them in try-except indicating they should install python3-tk and remove the dependency if desired, but in my opinion it's best to just keep python3-tk around.

jwnimmer-tri commented 2 years ago

Please go ahead and land the macOS fix, then re-assign the ticket back to me. It will be more effective for me to just solve the Ubuntu side on my own than continue to try to communicate the subtleties of the requirements.

jwnimmer-tri commented 2 years ago

See #17438 for my proposal.

Essentially, there is only one place in Drake that uses pyplot interactively in any meaningful way (class PlanarSceneGraphVisualizer). Every other place only uses static figures. So, for PlanarSceneGraphVisualizer only we can defer the error to construction-time, in case the user has a broken Ubuntu installation (hard-coded to TkAgg but have not installed Tk). Everywhere else we just use Agg.