Closed BetsyMcPhail closed 1 year ago
To make sure that we all understand this ticket -- the victory condition here is that a user of pip install drake is able to easily load the models & meshes for all of Drake's included robots and manipulands, e.g., the iiwa, atlas, ycb objects, etc., with the expectation that under the hood we're fetching them from urls. (At the moment, we've had to exclude certain model files from the whl because they are too large.)
Is that accurate?
The exact implementation details still need to be worked out, but that is my understanding.
Relates to #15024 somewhat.
Relates to #13942 and #15024 somewhat, and #9498 and #11913 more directly.
Another comment (channeling Russ) -- something like https://pytorch.org/tutorials/beginner/basics/data_tutorial.html may be the best solution here. Either that library directly, or some Drake-compatible simpler implementation.
https://packages.ubuntu.com/source/bionic/ros-resource-retriever might be useful.
For better issue search, here's the list of mesh file paths that are excluded from the wheel:
rm -rf \
${WHEEL_DATA_DIR}/manipulation/models/franka_description/meshes \
${WHEEL_DATA_DIR}/manipulation/models/tri-homecart/*.obj \
${WHEEL_DATA_DIR}/manipulation/models/tri-homecart/*.png \
${WHEEL_DATA_DIR}/manipulation/models/ur3e/*.obj \
${WHEEL_DATA_DIR}/manipulation/models/ur3e/*.png \
${WHEEL_DATA_DIR}/manipulation/models/ycb/meshes \
${WHEEL_DATA_DIR}/examples/atlas \
${WHEEL_DATA_DIR}/examples/hydroelastic/spatula_slip_control
I'm running into this issue when trying to load models (@jwnimmer-tri redirected me here from my Stack Overflow question). I'd be happy to help with making a PR for this issue if it's not actively being worked on.
@SwappyG thanks for your interest!
This feature will require a relatively large chain of multiple pull requests (some of which will be quite intricate), as well as corresponding changes to Drake's release process and binary distribution architecture.
I don't say that to discourage you, rather to say that it will be a somewhat involved process, and therefore a somewhat difficult starting place for a first-time contributor, especially with the overall software design for this feature not yet finalized.
My thought here is that I'm going to work on writing up a software design for how this is all supposed to work. I'll probably also need to hack together a prototype to show that the design is practical. I'll post those ideas into the ticket here, at which point anyone interested is welcome to help push it forward.
Here's my thinking towards a design...
Background:
Currently, we have https://github.com/RobotLocomotion/models for large model files. Per #11913 and #13942 we might relocate more files from drake into models, but in any case having the too-big-to-install data in models instead of drake is a precondition assumed by this design.
Currently, at build-time we're using a custom bazel rule forward_files() to basically download and copy the models files into the drake build directory, so that we can seamlessly pretend as-if they are part of the drake source tree.
Proposal:
(1) Step away from the forward_files idea, by adding a package://drake_models. Add package.xml to https://github.com/RobotLocomotion/models. Update our SDFormat/URDF references to cite the new package, e.g., drake/manipulation/models/ycb/sdf/003_cracker_box.sdf would change from citing <uri>package://drake/manipulation/models/ycb/meshes/003_cracker_box_textured.obj</uri> to <uri>package://drake_models/ycb/meshes/003_cracker_box_textured.obj</uri>.
(2) Add the drake_models package to our PackageMap by default (i.e., in addition to the drake package), so that all of the models continue to load out-of-the-box.
(3) Stop incorporating any of the models repository into Drake's install rules, thereby effectively removing those files from our pre-compiled binaries as well.
(4) In source builds, the PackageMap entry for drake_models will refer to the download fetched by bazel (i.e., basically no change from today).
(5) In pre-compiled builds, the PackageMap entry for drake_models will map to a URI, instead of a filesystem path. The first time any model file is requested, the URI will be downloaded into a temporary folder and re-used from that point on. Users could add entries for other URLs also -- in case they want to load models from web servers as well.
(6) Maybe docker images could have the drake_models data already pre-fetched on disk? Or maybe easiest to just keep them the same as everywhere else. In any case the pip and tgz will drop the model data; probably apt as well.
(7) We could have a "prefetch" post-install script that any user could run from any install mechanism, to download the models and place them somewhere the package map would find them by default, with no ongoing downloading.
(8) Some users might balk at the idea of Drake hitting the internet by default. The download will at least need to have an opt-out config setting; possibly it needs to be opt-in. Possibly we could obey the default environment variable for proxying (http_proxy IIRC).
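To make items (2), (5), and (8) concrete, here is a minimal Python sketch of a package map whose entries can be either a filesystem path or a base URL, with download-on-first-use and an opt-out switch. It is not Drake's PackageMap API: the class name RemotePackageMap, the environment variable MODELS_ALLOW_DOWNLOAD, and the per-file base-URL layout are all assumptions made up for illustration. The fetch function is injectable so the logic can be exercised without touching the network.

```python
import os
import urllib.request
from pathlib import Path, PurePosixPath
from typing import Callable, Dict

# Hypothetical opt-out switch; set to "0" to forbid downloads.
OPT_OUT_ENV = "MODELS_ALLOW_DOWNLOAD"


def _http_fetch(url: str) -> bytes:
    # urllib picks up http_proxy / https_proxy from the environment by default.
    with urllib.request.urlopen(url) as response:
        return response.read()


class RemotePackageMap:
    """Toy stand-in for a package map whose entries may be paths or URLs."""

    def __init__(self, cache_dir: Path,
                 fetch: Callable[[str], bytes] = _http_fetch):
        self._entries: Dict[str, str] = {}
        self._cache_dir = Path(cache_dir)
        self._fetch = fetch

    def add(self, package: str, location: str) -> None:
        """Registers a package at a local directory or an http(s) base URL."""
        self._entries[package] = location

    def resolve(self, uri: str) -> Path:
        """Maps package://NAME/relative/path to a file on local disk."""
        prefix = "package://"
        if not uri.startswith(prefix):
            raise ValueError(f"not a package URI: {uri}")
        package, _, relative = uri[len(prefix):].partition("/")
        if ".." in PurePosixPath(relative).parts:
            raise ValueError(f"refusing path traversal in: {uri}")
        location = self._entries[package]  # KeyError for unknown packages.
        if not location.startswith(("http://", "https://")):
            return Path(location) / relative
        cached = self._cache_dir / package / relative
        if cached.exists():
            return cached  # Already downloaded; no network access needed.
        if os.environ.get(OPT_OUT_ENV, "1") == "0":
            raise RuntimeError(f"downloads are disabled via {OPT_OUT_ENV}")
        data = self._fetch(location.rstrip("/") + "/" + relative)
        cached.parent.mkdir(parents=True, exist_ok=True)
        cached.write_bytes(data)
        return cached
```

With an entry such as add("drake_models", "https://example.invalid/models") (a placeholder URL), resolving package://drake_models/ycb/meshes/003_cracker_box_textured.obj would download that one file on first use and return the cached copy afterwards. Fetching a whole archive instead of individual files would change only the URL branch of resolve().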
Miscellany:
(a) Drake already build-depends on libcurl; we would use that for the downloading. We might need to activate its https support. (IIRC, we are http only so far.)
(b) I'm not sure whether the package map URL should refer to a base url (where we could grab files one by one) or an archive (that we'd download all at once and then decompress).
(c) Should the downloads rely on https to certify the file, or a sha256 (or 512) checksum, or both?
(d) We need a careful mechanism to keep the model repository pinned and mirrored, possibly with updates to the release playbook.
(e) Anything in models will no longer be accessible to FindResourceOrThrow. That means if we move e.g. the IIWA URDFs to models as proposed in #13942, the only way to load them will be as URIs from the package map, not as bazel resources. This is probably better in any case. Users are confused by FindResource stuff.
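On question (c), a checksum check is cheap to layer on top of whatever transport is used. The sketch below is only an illustration using Python's standard hashlib; the function names are hypothetical, and where the pinned digest would come from (for example, a value baked into the release) is left open, as in the discussion above.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Returns the hex SHA-256 of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(data: bytes, expected_sha256: str) -> bytes:
    """Returns data unchanged if its SHA-256 matches, else raises."""
    actual = hashlib.sha256(data).hexdigest()
    if actual != expected_sha256.lower():
        raise ValueError(
            f"checksum mismatch: expected {expected_sha256}, got {actual}")
    return data
```

A pinned checksum detects a changed or corrupted archive even when it is served from a mirror, whereas https alone only authenticates the server that delivered it; using both covers each case.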
@jwnimmer-tri I think I understand most of the proposal. Making the models repo into a package xml + changing URIs to point to that seems like the lowest risk task for a first-time contribution. Does that seem accurate?
(1), and maybe (2), (5) and (7) seem like something I could help with, though I don't know too much about certifying files as mentioned in (c).
Related to the discussion above, I had at least one very reasonable question about this from a student: "I'd be happy to just download the model repo separately into a subdirectory of Drake (e.g. after a pip install). Why do you make it so hard to just grab the files? Your bazel script puts them everywhere."
I'd be happy to just download the model repo separately into a subdirectory of Drake (e.g. after a pip install).
The https://drake.mit.edu/from_binary.html downloads have all of the model files, in their expected relative locations.
For my reference:
See $XDG_CACHE_HOME per https://wiki.archlinux.org/title/XDG_Base_Directory, for our temporary downloads.
Per https://github.com/pypa/packaging-problems/issues/64, it seems like there is not any way to populate the cache at install-time, only upon first use.
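As a rough sketch of the cache location lookup, assuming the usual XDG convention that an unset (or non-absolute) $XDG_CACHE_HOME falls back to $HOME/.cache. The function name and the drake_models subdirectory name are hypothetical, chosen only for illustration.

```python
import os
from pathlib import Path
from typing import Mapping


def models_cache_dir(environ: Mapping[str, str] = os.environ) -> Path:
    """Picks a per-user cache directory following the XDG convention."""
    xdg = environ.get("XDG_CACHE_HOME", "")
    # The XDG spec treats relative paths as invalid, so ignore them.
    if xdg and os.path.isabs(xdg):
        base = Path(xdg)
    else:
        home = environ.get("HOME") or os.path.expanduser("~")
        base = Path(home) / ".cache"
    # Hypothetical subdirectory name for the downloaded models.
    return base / "drake_models"
```

Since the cache cannot be populated at install time, a function like this would be consulted on first use, and the directory created at that point.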
A few thoughts from f2f chat today:
Working towards #1183
Follow up to #15628