RobotLocomotion / drake

Model-based design and verification for robotics.
https://drake.mit.edu
Other
3.28k stars 1.26k forks source link

SpatialVelocity() default ctor produces garbage, thus is a giant footgun #14086

Open EricCousineau-TRI opened 4 years ago

EricCousineau-TRI commented 4 years ago

Ran into an issue on Anzu where I messed up our --config=assertions CI job (see Anzu Issue 5462).

Issue was that I was using SpatialVelocity() instead of SpatialVelocity.Zero().

FTR I was expecting that the default ctor would be all zeros, b/c the surround code looked like this:

self._X_TPdes_command = fix_port(X_TPdes_input, RigidTransform())
self._V_TPdes_command = fix_port(V_TPdes_input, SpatialVelocity())

The solution was to change SpatialVelocity() to SpatialVelocity.Zero().

As a user, I would've expected that SpatialVelocity have a similar contract as RigidTransform / RotationMatrix, etc., that it's default-initialized to all zeros.

At a minimum, Python will have all zeros. But most ideal (esp. in terms of passing it back), C++ should also do so.

Steps:

FYI @sherm1

jwnimmer-tri commented 4 years ago

\CC @amcastro-tri as original author.

sherm1 commented 4 years ago

RigidTransform and RotationMatrix are not initialized to zeros, they are initialized to their respective identities. SpatialVelocity is just a SpatialVector and (like Eigen's vector types) should not be initialized in Release builds. Would be great to have them initialized to NaNs in Debug builds though.

I am currently struggling to remove the many hidden computational costs we've built into MBPlant that we're tracking in #13902 and don't want to see any new ones introduced. Please don't start initializing our std::vector<SpatialVelocity> objects; I want to be able to allocate them and then fill them in just once with meaningful values, not twice.

jwnimmer-tri commented 4 years ago

I wonder if we could have the garbage be opt-in, so in inner loops we can do my_velocities.resize(nv, SpatialVelocity::Uninitialized());, which is both very clear and allows us to make the default be on the side of safety and reproducibility. Expecting users to know which of Drake's types are ticking bombs is unfriendly.

EricCousineau-TRI commented 4 years ago

I agree with @jwnimmer-tri: If you're an expert and want to tune performance, take the explicit opt-in performance-tuning route. If you're a dev wanting to catch dumb uninitialized data, also opt-in. Otherwise, expect stuff to not explode?

And yeah, sorry, by "zero" I meant "identity" for SO(3) / SE(3) - oops!

sherm1 commented 4 years ago

Drake's SpatialVector is a raw numerical type just like Eigen's Vector6 and as numerical programmers we are using it that way. Initializing memory to zero is considered bad practice among numerical programmers, and I agree with that sentiment. Initializing to NaN in Debug is excellent practice and extremely helpful in revealing uninitialized junk. We're doing that intentionally here and you are proposing to undo that. I'm strongly opposed to that change.

There is no natural identity for a SpatialVector. Initializing to zero seems just arbitrary and potentially expensive to me. Drake has a serious performance problem specifically in the code that most heavily uses SpatialVector. I am working hard to fix that and this would just push my work backwards.

We've had lots of arguments on this point over the past few years and you are looking at the resolution -- there's nothing arbitrary about the initialization behavior of SpatialVector, RotationMatrix, or RigidTransform. If you want to make a different class that is more user friendly please do. But please don't change this carefully designed class which is behaving exactly as we designed it to do.

jwnimmer-tri commented 4 years ago

The challenge is, as a user, it is not easy to remind ourselves which classes' constructors provide reproducible defaults and which provide garbage. We don't all have memorized which types in Drake are intended for use by numerical programmers, and which are safe for mere mortals. It's easy to remember "Eigen doesn't initialize" because its a blanket statement. But most types in Drake do initialize to something -- the ones that don't are surprising. Re-reading the constructor documentation each time to remind ourselves doesn't scale well. I don't mind having fast-path matrix types for the inner loops, but perhaps they are not the best types to use for the novice APIs like SetFreeBodySpatialVelocity, input ports, etc.

sherm1 commented 4 years ago

I don't disagree with you that there could be a better way. However we have had extensive discussions to identify real Drake user problems and MBPlant performance is right at the top of the list. Redesigning SpatialVector did not make the cut. I want to move forward with the plan we have. Please don't undercut my work. After we have MBPlant running at a reasonable speed, we can revisit changes which could have a negative performance impact. We are a long way from reasonable at the moment.

EricCousineau-TRI commented 4 years ago

I've broadened this scope to general C++, but I'm going to address the acute Python portion because performance savings like this are nigh negligible in Python.

EricCousineau-TRI commented 4 years ago

We've had lots of arguments on this point over the past few years and you are looking at the resolution -- there's nothing arbitrary about the initialization behavior of SpatialVector, RotationMatrix, or RigidTransform [...]

For my own edification, is there a discussion, design doc, or something that motivates the difference between construction of RigidTransform, RotationMatrix, etc., vs. SpatialVector?

There is no natural identity for a SpatialVector. [...]

Er, but for the vectors that are present, acceleration, velocity, momentum, and force, zero seems physically relevant? I don't care who's responsible for the constructor, whether it's at the base with the assumption that zero is meaningful (to pave the way for other spatial vectors), or it's in each derived class.

jwnimmer-tri commented 4 years ago

The challenge is, as a user, it is not easy to remind ourselves which classes' constructors provide reproducible defaults and which provide garbage.

I've been thinking about this more.

For my own intuitions at least, the important mnemonic for guiding my understanding is having something like "array" or "vector" or "matrix" in the class name. When I see ndarray or Vector3d or the like, I know that the data type is akin to raw storage, so I know that by default the data is uninitialized.

The SpatialVector provides some sugar to have a Vector6 organized into rotational and translational halves while also providing static type checking. Under the rubric of the prior paragraph, users should not be surprised that a default SpatialVector is uninitialized.

For me, the challenge comes when we specialize SpatialVector into, e.g., SpatialVelocity. For those who have memorized the template inheritance that SpatialVelocity is-a SpatialVector, it's clear that the default is uninitialized. For someone happening onto a user API that talks about SpatialVelocity on its own without that inheritance in mind, it's a surprise that the class's default differs from Drake's other mathematical abstractions like RigidTransform. The classname suggests that we've upgraded from raw storage into a robust mathematical type, but really we're still in the land of raw storage.

I don't immediately have any ideas for improving the situation, but I hope at least that's a useful perspective on the impedance mismatch.

sherm1 commented 4 years ago

RigidTransform & RotationMatrix are mathematical objects with certain properties invariant. SpatialVector is used to hold physical quantities (velocity, acceleration, force, momentum) that are supposed to be calculated in accordance with some set of equations. The proper initialization for SpatialVector objects is NaN -- indicating that they do not currently hold such a physical property. Zero has the trappings of a legitimate quantity but is in fact arbitrary and therefore incorrect in general. Drake C++ already initializes to NaN in Debug and would do so in Release as well if we didn't have a performance issue. I would never support initializing these quantities to zero since it violates the prime directive of a physical code -- don't produce bad answers. A core dump is better than a bad answer! But a NaN expressing that a quantity has not yet been determined is the ideal initial value.

amcastro-tri commented 4 years ago

I didn't add to this discussion since I believe my point of view has already been expressed. But, if it matters, I'd prefer we do not change the behavior of the default ctor so that @sherm1 can continue his work without surprises. If we really really want a solution today, I'd advocate for a factory SpatialVelocity::Uninitialized() for performant critical code, and deprecate the default constructor. Then C++ and Python users would be forced to either use a factory like ::Zero() or some other constructor.

jwnimmer-tri commented 6 months ago

New proposal

(1) If a class's default constructor leaves memory uninitialized, then it cannot be part of Drake's public API.

(2) Having namespace-internal (i.e., internal-use-only) classes used inside MbT default to uninitialized is fine.

(3) Having a static factory function MyType::Uninitialized() on public API is fine.

EricCousineau-TRI commented 6 months ago

sgtm.

only other question is, is NaN right answer for default ctor, or should quantities like this initialize to zero / identity (similar to what we do for RigidTransform etc)? (feel free to punt / ignore if this is out of scope)

jwnimmer-tri commented 6 months ago

For the public API (where the user needs to supply an input to a function like SetFreeBodySpatialVelocity), possibly there should not be any default constructor. I think we would discover the right answer once we refactor the classes to split the MbT-internal footgun class with the MbP-public-API sugar class, and then we'll have a corpus of public call sites to understand what is or isn't necessary.