Open SeanCurtis-TRI opened 3 months ago
Per slack, we might attack this in two stages:
(1) Fix the _GeometryFileHasher
to know how to chase gltf uris.
(2) Refactor Meldis to lean on Meshcat::SetObject
to report back exactly which files were relevant, so we can delete the bespoke Python logic.
Stage one is in flight with #21297.
I think stage two requires a bit more discussion.
It's certainly reasonable to modify the return value of void Meshcat::SetObject(string_view, const Shape&, const Rgba&)
to return a vector of paths. But what's unclear to me is how that benefits meldis
.
Meldis
would only find out what files were necessary after dispatching objects to meldis
. Sure, it can save those paths and create a hash on what was already set into Meshcat
. However, what good does that do when it gets the next load message? Knowing what's currently in meshcat doesn't help in interpreting the incoming message. If the only way to do that is set things into meshcat and ask what got set, hasn't it already done the work?
I can imagine the hasher creating its own Meshcat
instance with a functionally non-communicative port so it can blindly register things and harvest the result. After all, it would really only ever need to register meshes, not everything in the message.
Alternatively, we can try to refactor the code in Meshcat
so that the logic for assembling the family of files associated with a mesh is independent of what is done with that information? And then that core functionality can be used by a meshcat
instance to "peek" but not "act".
Can you elaborate on what you had in mind?
If the new load message is the same as the old load message, and all of the files used in the prior load message still exist and are unchanged, then we know for certain that no reload is necessary. So all we need to do is keep track of the prior load message's overall list of files and their hashes (or maybe one overall hash), and that's all we need.
File existence should not be the arbiter of whether the load message matches. I can remove a reference to an image without deleting the image. Furthermore, I can add an image to a model and it won't be captured by what I previously loaded.
I'm leaning toward simply creating an API that takes a Mesh
and returns a list of file info (at its simplest, just paths). That would also simplify what Meshcat
does to handle objs. Ultimately, it would localize functionality that is currently distributed across Meshcat
and FileStorage
and make it accessible for anyone who is curious for whatever reason.
If you remove the reference to an image (or add one), the checksum of the gltf (or whatever) that mentions that image will change.
I am pretty convinced that the right Meldis approach is "Meshcat tells us what files it touched" since that will very clearly be correct and never diverge as the code chnages, instead of having two different control paths where we hope that Mesh.tell_me_files()
matches what Meshcat did. For all we know, Meshcat might eventually do things that go beyond what Mesh.tell_me_files()
knows about, or ignore e.g. images that are for gltf extensionsUsed that are always ignored on the JS side that Mesh.tell_me_files()
would need to return in general.
Touche. I keep forgetting hashing the original file.
So, to summarize:
None
-valued self._hash
value).self._hash is not None
)
self._hash
, we're doneself._hash
On the subject of "subtracting bytes from a hash value":
One possible solution is to replace a single aggregate hash value with a set
of hash_values (on per file). Instead of comparing aggregate hash values, we compare sets of values. This would handle both cases where the set of files have changed (new files now being used and old files no longer used but not deleted).
The flavor I imagined was that Meshcat.SetObject would return[1] the list of all relevant files, including the Mesh/Convex filename. I imagine that simplifies everything -- Meldis would not need to do anything with the load message's filename other than pass it to SetObject and then remember the list of files that comes back. We could promise the Mesh.filename() was first in the returned list, in case that helps anything. I'm not wedded to that though -- if for some reason treating the Mesh.filename() differently made the code cleaner then I'd be fine with that.
I'm open to the idea of multiple hash values, but my guess is that a single overall hash still ends up simpler. The "did any of these set of hashes change" versus "did the combined hash over this set of hashes change" seem equivalent. Again, though, I'm willing to trust however the simplest code comes out, whatever that looks like.
[1] ... or output argument, or return struct with named fields for future expansion, or ... etc
I've simplified from what I previously wrote and will be submitting a PR soon. Ultimately, I'm trying to minimize the amount of hashing I'm doing. The goal is two fold:
One resolution, obviously, is to remove the second requirement. But that just seems inelegant to me. But we can move this conversation over to a concrete PR in a bit.
What happened?
Meldis's
_GeometryFileHasher
providesmeldis
the ability to recognize if a model has changed on disk since it was last loaded and intelligently reload (by considering the set of files that define the model). However, that logic is heavilyobj
-centric. Changes to glTF data on disk do not trigger a model reload.Now that
Meshcat
(the Drake class, not the javascript library) has the intelligence to run down all linked assets, that should be exploited in_meldis.py
so that the right set of assets are included in the hashing.Version
No response
What operating system are you using?
No response
What installation option are you using?
No response
Relevant log output
No response