Open david-german-tri opened 7 years ago
I like this a lot. I also have a Value<MyBaseClass>
update to the Value
erasure design in #5469. We should probably coordinate the design of two fixes, at least insofar as to deconflict them.
From my tour while fixing #5431, T
-dependent abstract ports or state are more common that you might think; we might have to fix this soon, depending on how much use transmogrification ends up getting.
Reassigning to @sherm1.
(1) To the contrary of System's API documentation and the statement above that double
is always the starting type for transmogrification, the current transmogrification APIs allow users to go from any System scalar over to AutoDiffXd
or symbolic::Expression
, i.e., the base system doesn't have to be double
. This is actively used, e.g., for the feed-through sparsity of an System<AutoDiffXd>
where we (try to) transmogrify it to symbolic::Expression
. We should agree on the supported conversion operations before going too much further here. Either we support the full U -> T matrix of conversion, or we support all double -> T and T -> double (and then transmogrify twice for AutoDiffXd
sparsity).
(2) A plausible design for an AbstractValue
transmogrifier might look like the functor approach in https://github.com/jwnimmer-tri/drake/tree/framework-transmogrify-dry. Instead of virtual methods, we could use std::function
to provide late-bound dispatch for how to manipulate the values.
It may be time to resurrect this. I've been trying to transmogrify GeometrySystem
from double
to AutoDiffXd
and ran face first into this issue. @jwnimmer-tri your linked branch is no more. But I assume it's something akin to what you did with the system scalar converter, yes?
@jwnimmer-tri your linked branch is no more. But I assume it's something akin to what you did with the system scalar converter, yes?
The framework-transmogrify-dry
branch is fully merged to master now, and yes it was the SystemScalarConverter-based design.
What is the exact problem you have, though?
Values in the Context
do not transmogrify, they use SetTimeAndStateAndParametersFrom
. Do we need AbstractValue::SetFrom
to handle disparate scalar types (possibly via a differently-named method), so that AbstractValues::CopyFrom
works?
Or if your Value is a member field instead of in the Context, then I would not expect to need a generic solution within the framework for handling it; the GeometrySystem could directly implement the relevant logic.
You've captured the exact problem.
I have abstract value in my context. It is scalar-type dependent. Which means AbstractValue::SetFrom
fails (source being one scalar type, destination being another...but the effort is to cast the source into the destination type). I'd like to not get a bad_cast
error. Furthermore, I'd like to have values meaningfully transferred.
Understood.
If you need hack to work around locally, while you explore other things, it would look like:
class TransmogrifyingGeometryThingValue<T> : public Value<GeometryThing<T>> {
// override Clone in order to preserve the type of this
// override SetFrom and SetFromOrThrow to check for other being a
// GeometryValue<T>(double), and handle that specially; otherwise
// delegate to the base class
};
As far as I can perceive right now, the non-hack answer would be to add virtual void AbstractValue::SetFromDoubleVariantOrThrow(const AbstractValue& other) = 0
and then allow the T
types in Value<T>
to optionally implement that, and teach Context to use it.
However, that there is T-dependent State in a Context to transmogrify is slightly questionable (and why we haven't hit this question yet in practice). We should be sure that we know how we expect users to correctly establish this particular State's AutoDiff partials before moving too far along a desigh path. If the partials ever need to be non-zero, and aren't recomputed from the inputs, parameters, or upstream outputs, then it will be impossible for generic optimization tools to work with GeometrySystem.
I may run with the TransmogrifyGeometryThingValue<T>
in the short term.
In the longer run, I'd definitely like to have the conversation about whether abstract state can contain scalar-dependent values. At first glance, the obvious alternative is to pull it out of the abstract state and put it in the discrete state. If we assume that the non-numerical data and numerical data currently in the abstract state are in some sense meaningfully coupled, then I see the following downsides:
Generally,
One point worth noting is that some of this will be alleviated with caching. Currently, I've got some ad hoc cache-substitutes to get things working. When these values end up in cache, the problem might be completely alleviated (I haven't fully thought it through yet). Although, I suspect it can't be gotten rid of 100%.
FWIW I think we should allow custom transmogrification of abstract states. I agree with Sean that trying to shoehorn everything numerical in to discrete state is too awkward.
I am OK with not requiring folks to shoehorn their stuff into discrete state. The question remains to be answered how AutoDiff is useful in this case, when the framework is unable to initialize the partials generically. That answer will govern exactly how the T-dependent but non-Vector Values in a Context get transmogrified.
@jwnimmer-tri, can you clarify what you mean by "initialize the partials generically"? Are you referring to not being able to size the derivative vector for AutoDiffScalars?
The dimension I am not worried about. For fixed-size it's already OK; for AutoDiffXd empty and resized-to-some-exact-width-of-zeros are equivalent.
The question is how we place the magic 1.0
in the derivatives vector of each thing you are differentiating against. Does everything in abstract state start with zero for all of its partials, always? If so, why do we believe that to be true? If not, how do we establish the 1.0
indices when we initialize the AutoDiff-dependent Context?
Surely the problem is no different for any other state variable. If I want to take the partials with respect to any state (as opposed to input parameters), I have the obligation of digging into the mutable state and setting my 1.0
at the appropriate site. So, is there a current expectation that no one would do that?
The difference is that numeric (i.e., scalar-valued) state is exposed in a way that we can access from System
and Context
directly, without knowing anything specific about the System subclass being studied. For abstract state, we don't have any APIs for that yet (I think). I suspect those APIs are a critical element of how this gets solved.
That feels a bit misleading. It's one thing to have an API to expose a state vector. It's another to know remotely what it means. Surely, the interpretation of the state vector is no more arduous than knowing what the abstract state is? (And one might argue that the abstract state is easier because it isn't vector-encoded.)
Consider, for example, Linearize. Given a System, it linearizes it about an operating point. It does that without knowing the details of the System. Yes, the caller of Linearize has to know what the state indices mean, but Linearize's implementation doesn't -- it just needs to set each numeric state variable's partials vector to have 1.0 in the right place, for all of them. In other words, it needs to enumerate and index all T-dependent state variables; it doesn't need to understand them beyond counting them.
That makes sense. It's very blind and methodical. The operation is a syntactic one and not a semantic one. And it's not clear that there is a similarly blind operation for abstract state. (And all things being equal, it would be nice to avoid that kind of API; the one I can immediately imagine feel unwieldy.) Thanks for the clarification.
So, I really need to be concrete about what role the numerical value in my abstract state. Most of it, I know, is cache-place holder which renders it moot for this conversation. But I have the sense there's one thing that is meaningfully justified...I'll look into it more deeply.
Consider, for example, Linearize ...
Our Linearize method is currently defined to produce ∂y/∂[u,xₙ] where xₙ includes the continuous and discrete (numerical) state variables. It doesn't promise to provide ∂y/∂p (parameters) or partials w.r.t. abstract states. I think that's a nice interface -- anyone who wants to linearize with respect to other things can do so by writing their own linearization method.
The problem I see, however, is that the calculation of ∂y/∂[u,xₙ] would be wrong if there are hidden numerical values among the abstract states that don't get transmogrified. Currently I believe that is prevented by Linearize refusing to operate on systems that have abstract state. If we want to relax that assumption, we need the numerical members hiding in abstract state to be transmogrified correctly. But I don't see why we would need to extend the Linearize API to compute a wider range of partials. Just getting ∂y/∂[u,xₙ] correct in the presence of internal xₐ variables would be a nice extension.
A developer who actually wants ∂y/∂xₐᵢ for some particular numerical member of an abstract state can obtain that, they just can't expect to do it with our Linearize method.
If we have good cause to believe that partials within abstract state should always start at zero in all cases of our generic toolbox algorithms such as Linearize, then I agree we can move forward solving only the problem of how to shovel the double
from a Context<double>
's abstract state into the Context<AutoDiffXd>
's abstract state's .value()
term. You should probably be sure @RussTedrake is convinced on that front, before going further.
Once we're convinced, then it sounds like we'd all agree that ...
Do we need AbstractValue::SetFrom to handle disparate scalar types (possibly via a differently-named method), so that AbstractValues::CopyFrom works?
... is the relevant question, and so I'd say that adjusting or twinning AbstractValue::SetFrom
to meet the need is the right approach.
It's sometimes useful to store
MyClass<T>
in anAbstractValue
input, output, or state variable, whenMyClass
has scalartype-dependent elements but no vector structure. One example isPoseBundle
, or the futureGeometryQuery
.On master right now in the System framework, there are various features that transmogrify vector-valued Context quantities from
<double>
to<T>
, whereT
is e.g.symbolic::Expression
orAutoDiffXd
. The concept is that numerical simulations may periodically want to stop and do analysis of some form with a richer scalar type, and there is always a valid upconvert from double to any otherT
.These features currently assume that an AbstractValue is an impenetrable black box, and Clone it. For AbstractValues that don't depend on the scalartype, that's fine. For AbstractValues that do, it's likely to blow up at runtime, for instance when a transmogrified system attempts to down-cast the AbstractValue to
MyClass<symbolic::Expression>
.We could imagine making
AbstractValue
a little bit penetrable, so that scalartype-dependent values could describe how to transmogrify themselves.