RussTedrake opened 2 years ago
This is great! Are you imagining that `EvalBodyPoseInWorld(BatchContext<T>, body)` would have a multi-threaded implementation under the hood, or run single-threaded?
Besides using Drake as a back end for batch evaluations, we have a related but slightly different use case: we would like Drake-native thread safety for asynchronous calls from multiple threads during motion planning.
Currently we create multiple contexts and use a thread pool to hand requests off to them. We've hidden all of this behind our own thread-safe constraint evaluation method.
I'm happy to detail our approach further if it would be useful. What have other folks done to solve this problem?
If there's community interest, we'd like to contribute to Drake-native parallelization. If this is too tangential, I'm happy to open a separate issue.
If we use `Eigen::Tensor`, then the same MultibodyPlant code could dispatch to evaluate serially, in parallel threads, or on the GPU, based on how the tensors passed in are allocated and the hardware available.
Using multiple `Context`s and a thread pool is exactly the right thing to do. A good example of this in Drake right now is the monte-carlo code: https://github.com/RobotLocomotion/drake/blob/5316536420413b51871ceb4b9c1f77aedd559f71/systems/analysis/monte_carlo.cc#L42
I took Eigen 3.4's tensor module for a quick spin to see what it might look like to support batch evaluation in the systems framework. Short story: I'm highly encouraged by it, but I think Eigen alone will not be enough. It has enough support for batch matrix multiplication, but anything fancier (e.g. batch matrix inverse) requires more; grabbing TensorFlow's C++ math library will probably be necessary.
Getting batch evaluations (including GPU support) into the systems framework seems totally plausible to me now, and not anywhere near as jarring as I had feared. (I'm not saying that getting efficient GPU SAP solvers or collision checkers is easy... that's a very different question!)
Basic proposal:

- `systems::BatchContext<T>`, which has `Eigen::TensorFixedSize<N,rows,cols>` instead of vectors for the state, etc.
- `LeafSystem` offers a default implementation of e.g. `DoCalcTimeDerivatives(BatchContext,...)` that does a CPU-based for loop calling `DoCalcTimeDerivatives(Context,...)`.
- `LeafSystem` classes can opt in by implementing the tensorized version of each method.
- `MultibodyPlant::EvalBodyPoseInWorld(BatchContext<T>, body)` and friends would be immediately useful for motion planning (and some of them don't require batch matrix inverse, etc., so could work with Eigen alone).

I believe that the back-and-forth between `tensorflow::Tensor` and `Eigen::Tensor` is easy (and inexpensive).

It would also be interesting/important to understand whether `Eigen::Map<rows,cols>` could be efficient if the batch size is `N=1`. In that case, perhaps we could change the existing `Context` datatypes to `Eigen::TensorFixedSize<N,rows,cols>` and have e.g. `get_continuous_state_vector()` return the `Eigen::Map`?