what's the best way to template our dynamical system class hierarchy methods?

RussTedrake commented 9 years ago

I'm working on an initial C++ version of our systems class hierarchy. We have a lot of templates flying around, and this is a complicated use case. I'd like to discuss here to make the right decision.

Consider this example, which works fine:

class DrakeSystem {
public:
  virtual VectorXd dynamics(double t, const VectorXd& x, const VectorXd& u)
};

class Pendulum : public DrakeSystem {
public:
  virtual VectorXd dynamics(double t, const VectorXd& x, const VectorXd& u)
};

Now I want to template it, so that it can

accept a TrigPoly and/or AutoDiffScalar type (eg, something like Matrix<Scalar,Dynamic,1>)
bonus: take in an Map<VectorXd> instead of the VectorXd

I'm not 100% sure I understand the best solution even in the base case, but the fact that these are (at least conceptually) virtual methods complicates everything.

I've played a little with templating the class instead of the methods -- that might work for us and could even be relatively clean.

Is there an obviously correct solution for this?

RussTedrake commented 9 years ago

Another very important metric to add to the list is

code should be readable/maintainable

i think our existing code is failing in that regard.

Here's one potentially contentious idea: wrap all of the templated stuff in our own class, and pay the small overhead for runtime dispatch. E.g.

Eigen<double,-1,-1> is a DrakeMatrix
Matrix3d is a DrakeMatrix
Eigen<TrigPoly,-1,-1> is a DrakeMatrix
Eigen::Map<MatrixXd> is a DrakeMatrix
...

then we have one .cpp file with a crapload of inlined dispatch methods (sin, cos, times, ...) and explicit instantiations. but the rest of the code stays clean.

psiorx commented 9 years ago

We'll probably have to do something similar to the CRTP if we want to have templated virtual methods. It's not the most readable solution but it works.

I remember brainstorming about the boxed type idea (DrakeMatrix) back in DRC and I still think it could be a good idea. If we can't get around using templates, we can at least abstract the messy part away.

If you template the class instead of the methods, you won't be able to mix and match calls to methods on the same object with different scalar types without reinstantiating. That could be fine, though.

tkoolen commented 9 years ago

I'm beginning to think that templating at the class level is the way to go. If you want to emulate templated virtual methods, you have to do the CRTP + interface thing we did for DrakeJoint, which will be a hassle at a large scale. Templating at the class level would make us lose the ability to call methods with different scalar types on the same object, but I don't think that is a common use case. I don't think it's terrible to have both a RigidBodyManipulator<TrigPoly> and RigidBodyManipulator<double>.

As for handling Map<...> etc, that can be done with Ref. I would propose this:

template <typename Scalar>
class DrakeSystem<Scalar> {
public:
  typedef Matrix<Scalar, Dynamic, 1> VectorX;
  virtual VectorX dynamics(double t, const Ref<VectorX>& x, const Ref<VectorX>& u)
};

template <typename Scalar>
class Pendulum : public DrakeSystem<Scalar> {
public:
  virtual VectorX dynamics(double t, const Ref<VectorX>& x, const Ref<VectorX>& u)
};

I'm not a fan of DrakeMatrix. First, you'll end up reimplementing all Eigen methods. Suppose you want to take the 3x3 upper left corner of a DrakeMatrix. Now DrakeMatrix needs to define that method. Same thing for every single method of an Eigen matrix. More importantly, how do you envision this 'is a' relation? It can't be inheritance because e.g. Matrix3d is not in our codebase; we can't just add a superclass on top. So it would have to be composition. So now you have to make DrakeMatrix3d, which has a Matrix3d as a member, along with a DrakeMatrixXTrigPoly, DrakeMapXd, and so on. Maybe 30 types. And each one needs to define e.g. operator*=(const DrakeMatrix& other). How are you going to implement that? The data member will not be visible at the DrakeMatrix level... You could think of having DrakeMatrix define a toMatrixXd() method, going to a common type, and then implement the methods in terms of that, but actually that's not even possible since Matrix<TrigPoly, ...> wouldn't work then. And you would absolutely murder all efficiency by copying everything to a new MatrixXd before any operation.

tkoolen commented 9 years ago

Another possible way around the lack of templated virtual methods could be to use CRTP, like Eigen.

So you could have a DrakeSystem bass class that looks (very much simplified) like this:

template <typename Derived>
class DrakeSystem {
private:
  Derived& derived;

public:
  DrakeSystem(Derived& derived) : derived(derived) {};

  template <typename DerivedX, typename DerivedU>
  PlainObjectBase<DerivedX> dynamics(double t, const MatrixBase<DerivedX>& x, const MatrixBase<DerivedU>& u) const {
    // do compile time size check on x and u, then
    return derived.dynamics(t, x, u);
  }
}

and then Pendulum could look like

class Pendulum : public DrakeSystem<Pendulum> {
public:
  Pendulum() : DrakeSystem<Pendulum>(*this) {};

  template <typename DerivedX, typename DerivedU>
  PlainObjectBase<DerivedX> dynamics(double t, const MatrixBase<DerivedX>& x, const MatrixBase<DerivedU>& u) const {
    // do the actual computation of xdot, then return it
  }
}

and Cascade would become templated as follows

template <typename Sys1, typename Sys2>
class Cascade : public DrakeSystem<Cascade<DrakeSystem<Sys1>, DrakeSystem<Sys2>>> {
private:
  shared_ptr<DrakeSystem<Sys1>> sys1;
  shared_ptr<DrakeSystem<Sys2>> sys2;

public:
  Cascade(shared_ptr<DrakeSystem<Sys1>> sys1, shared_ptr<DrakeSystem<Sys2>> sys2) :
    DrakeSystem<Cascade<DrakeSystem<Sys1>, DrakeSystem<Sys2>>>(*this),
    sys1(sys1), sys2(sys2) {};

  template <typename DerivedX, typename DerivedU>
  PlainObjectBase<DerivedX> dynamics(double t, const MatrixBase<DerivedX>& x, const MatrixBase<DerivedU>& u) const {
    // pretty much the same as the current implementation in your branch
  }
}

RussTedrake commented 9 years ago

If I had to pick immediately, I would say that’s probably the right way to go for performance. I think having our own vector/matrix type with derived classes for eigen, eigen, etc, and hitting an inlined vtable lookup for every basic operation (add, multiply, etc) would be slower but the code would be a lot cleaner.

On Aug 23, 2015, at 12:46 AM, Twan Koolen notifications@github.com wrote:

Another possible way around the lack of templated virtual methods could be to use CRTP, like Eigen.

So you could have a DrakeSystem bass class that looks (very much simplified) like this:

template class DrakeSystem { private: Derived& derived;

public: DrakeSystem(Derived& derived) : derived(derived) {};

template <typename DerivedX, typename DerivedU> PlainObjectBase dynamics(double t, const MatrixBase& x, const MatrixBase& u) const { // do compile time size check on x and u, then return derived.dynamics(t, x, u); } } and then Pendulum could look like

class Pendulum : public DrakeSystem { public: Pendulum() : DrakeSystem(*this) {};

template <typename DerivedX, typename DerivedU> PlainObjectBase dynamics(double t, const MatrixBase& x, const MatrixBase& u) const { // do the actual computation of xdot, then return it } } and Cascade would become templated as follows

template <typename Sys1, typename Sys2> class Cascade : public DrakeSystem<Cascade<DrakeSystem, DrakeSystem>> { private: shared_ptr<DrakeSystem> sys1; shared_ptr<DrakeSystem> sys2;

public: Cascade(shared_ptr<DrakeSystem> sys1, shared_ptr<DrakeSystem> sys2) : DrakeSystem<Cascade<DrakeSystem, DrakeSystem>>(*this), sys1(sys1), sys2(sys2) {};

template <typename DerivedX, typename DerivedU> PlainObjectBase dynamics(double t, const MatrixBase& x, const MatrixBase& u) const { // pretty much the same as the current implementation in your branch } } — Reply to this email directly or view it on GitHub https://github.com/RobotLocomotion/drake/issues/1273#issuecomment-133782163.

tkoolen commented 9 years ago

It's not just the vtable lookups that are going to kill performance. It's the fact that you have to convert everything to a common type before doing any operation. The only way to avoid a copy to a dynamic size version of your matrix before every operation is to only allow dynamic size matrices in the first place. This kills a lot of efficiency, especially for the small matrices we use a lot. Compile time size checks are gone. Blocks of matrices will be copies, and you'll lose the ability to write to blocks. Maps will be impossible. If we care that little about performance, why are we using C++ and Eigen in the first place?

And it's still unclear to me how you would handle the different scalar types. Between double, AutoDiffScalar<...>, and TrigPoly, there's no common scalar type that you can convert to. What is the signature of operator+ going to look like in the base DrakeMatrix class? At that level, all you know is that it's going to take a const DrakeMatrix& as an argument. The subclass DrakeTrigPolyMatrix needs to implement that method, but how? The return type is going to depend on the type of the input argument, but all you know is that it's a const DrakeMatrix&. And how about operator+=? You would need to change the type of this if this is currently a DrakeDoubleMatrix and you're adding a DrakeTrigPolyMatrix.

So I don't even think it's only about performance, it's just not feasible.

RussTedrake commented 9 years ago

I think the common base class case could probably be made to work (i think that's what people are calling "type erasure" -- they argue that worrying about the vtable reference overhead is a premature optimization. And they hold up boost::any as a shiny example of it in action). But I’m still not sure that it’s what we want.

The main fundamental reason that you can’t template a virtual method is that the compiler would not know how to write the vtable for arbitrary types. Unlike the standard template case, though, I would say that we have the luxury of knowing our types at compile time. So we can help the compiler out. Here’s my new proposal

#include <iostream>

using namespace std;

class Multiplier {
public:
  virtual int     output(int a)     { return output_implementation<int>(a); };
  virtual double  output(double a)  { return output_implementation<double>(a); };

private:
  template <typename Derived> Derived output_implementation(Derived a) {
    cout << "Multiplier " << typeid(Derived).name() << endl;
    return 2*a;
  }
};

class TimesThree : public Multiplier {
public:
  virtual int     output(int a)     { return output_implementation<int>(a); };
  virtual double  output(double a)  { return output_implementation<double>(a); };

private: // so that it does not "hide" the base class implementation
  template <typename Derived> Derived output_implementation(Derived a) {
    cout << "TimesThree " << typeid(Derived).name() << endl;
    return 3*a;
  }
};

int main() {
  Multiplier* m2 = new Multiplier();
  Multiplier* m3 = new TimesThree();

  int a = 2;
  double b = 2.2;

  m2->output(a);
  m2->output(b);
  m3->output(a);
  m3->output(b);

  delete m2;
  delete m3;
  return 0;
}

which gives the output

/Users/russt/Library/Caches/clion11/cmake/generated/261f7a3/261f7a3/Debug/template_virtual_methods
Multiplier i
Multiplier d
TimesThree i
TimesThree d

I like this for a few reasons:

it accomplishes the goal of writing each method once and having it work for multiple types
it is precisely doing the explicit instantiations, imho in a cleaner/better way. we were going to have those sitting at the bottom of the class file if we don’t effectively have them here.
we can make the syntax clean enough with a #define — think EIGEN_ALIGNED_ALLOCATOR_NEW . Ours would, e.g., #define IMPLEMENTS_OUTPUT and would write the virtual function calls.
we could maintain the flexibility of supporting some of the types but not all of the types (by simply listing less of the virtual methods, aka our explicit instantiations). Some methods won’t work for all types out of the box, some methods will never work for all the types (non-differentiable, etc).

The interface to a consumer of the DynamicalSystem class would get a clean interface. I don’t like that the author of a new DynamicalSystem will have to groc this bizarre procedure, but I think it’s pretty readable (probably an argument against the #define). But I think all of our ideas are fairly obtuse in this way. This might be a minimal offender? It feels a little less obtuse than the CRTP approach.

The argument for the #define is more easily adding additional types in the future. (still with a recompile of the library, of course).

BTW - @tkoolen — have you tried the code above yet? I’m don't know if you’ll be allowed to have non-virtual derived classes with methods named exactly the same as the base class.

RussTedrake commented 9 years ago

and if the CRTP example doesn't work with the same name... then I'm not sure if CRTP (at least the simple examples I thought about here https://en.wikipedia.org/wiki/Curiously_recurring_template_pattern ) would work for inheritance deeper than one class?

psiorx commented 9 years ago

I implemented CRTP for inheritance at least 2 levels deep. Check out the code pertaining to FixedAxisOneDofJoint in the drakeJoint pull request. The basic idea should be repeatable for n levels.

https://github.com/RobotLocomotion/drake/pull/1217

RussTedrake commented 9 years ago

ah. i see, so everybody that will be derived from has to implement their own derived property again? seems like a pretty big burden?

On Aug 24, 2015, at 6:56 PM, John Carter notifications@github.com wrote:

I implemented CRTP for inheritance at least 2 levels deep. Check out the code pertaining to FixedAxisOneDofJoint in the drakeJoint pull request. The basic idea should be repeatable for n levels.

1217 https://github.com/RobotLocomotion/drake/pull/1217

— Reply to this email directly or view it on GitHub https://github.com/RobotLocomotion/drake/issues/1273#issuecomment-134405087.

RussTedrake commented 9 years ago

A few more thoughts: i hate dealing with explicit instantiations of eigen types. I think many of us have lost many hours on them. The way that we're doing them now (in the cpp file), we effectively have a hidden requirement -- the consumer doesn't know until they try to link whether we happen to have listed their case in the main library or not. I'm really liking the idea that classes list their instantiations directly in the interface (header files).

For header only template libraries this would be bad. But i think we all agree that drake is not that? (RBM is certainly not that, other classes might not be so bad...???).

psiorx commented 9 years ago

I also hate the explicit instantiations. Twan and I were really happy to be able to purge the mess in drakeGradientUtil after moving things into the header. It's not only cleaner but also solves the problem of the library not having a particular use case baked into it.

Maybe there's an intermediate solution that lets us keep the big implementations (doKinematics) separated out while still retaining the benefits of header-only templates.

RussTedrake commented 9 years ago

The problem is that it's very dangerous to mix and match. If you have a class that can be compiled in two slightly different ways by two different obj files that share pointers, that can be super hard to track down. Twan knows that i fear this right down to my core.

On Aug 24, 2015, at 7:43 PM, John Carter notifications@github.com wrote:

I also hate the explicit instantiations. Twan and I were really happy to be able to purge the mess in drakeGradientUtil after moving things into the header. It's not only cleaner but also solves the problem of the library not having a particular use case baked into it.

Maybe there's an intermediate solution that lets us keep the big implementations (doKinematics) separated out while still retaining the benefits of header-only templates.

— Reply to this email directly or view it on GitHub.

RussTedrake commented 9 years ago

another difference : whether or not classes must implement all templated versions of the method

the CRTP has the advantage that missing implementations would be caught at link time
my suggestion is using the vtable, so would be runtime (unless we make the base methods pure virtual.. which I guess accomplishes the link time version).

tkoolen commented 9 years ago

@RussTedrake, I think your new proposal is workable. But I see the CRTP version's lack of a need to determine in advance which types you're going to accept as a big advantage. It has (# of methods * # of different scalar types * # of different eigen types) fewer lines of boilerplate code per DrakeSystem (unless you use the macro thing, which I wouldn't consider clean at all; I'm not proud of that part of the new DrakeJoint). If we decide to support N more scalar types / Eigen types, we have to do zero work with the CRTP version, instead of doing O(N * # of classes in the hierarchy) work. Also, you can put things like input vector size checks in the base class, instead of requiring every class in the hierarchy to reimplement those.

it is precisely doing the explicit instantiations, imho in a cleaner/better way. we were going to have those sitting at the bottom of the class file if we don’t effectively have them here.

That is not what I was proposing with the CRTP version. We need to stop putting the implementation of template classes outside the header file. It is unworkable. It is not what the rest of the world does. Without it there is no need for explicit instantiations.

ah. i see, so everybody that will be derived from has to implement their own derived property again? seems like a pretty big burden?

no. Only the top level class has a Derived& member. All you have to do is call the base class constructor to pass *this up the chain.

i hate dealing with explicit instantiations of eigen types. I think many of us have lost many hours on them

I couldn't agree more.

The problem is that it's very dangerous to mix and match. If you have a class that can be compiled in two slightly different ways by two different obj files that share pointers, that can be super hard to track down. Twan knows that i fear this right down to my core.

But I still don't understand it. The number of instantiations of class methods has absolutely no effect on the size of the object, or how it's laid out in memory. I sent you an Eclipse test project several months ago to demonstrate that. It would be very strange if the most common way of using templated functions/class methods were unsafe in any way. Could you provide a concrete example that exhibits what you're afraid of?

the CRTP has the advantage that missing implementations would be caught at link time

compile time, actually, as long as you stop putting the damn implementations of templated methods outside the header. The virtual methods implementation also catches missing stuff at compile time (at least if you keep the templated output_implementation method in the header, where it should be, otherwise you'd have both the implicit explicit instantiations (i.e. the virtual methods in your version), as well as actual explicit instantiations in the cpp file).

RussTedrake commented 9 years ago

On Aug 24, 2015, at 9:46 PM, Twan Koolen notifications@github.com wrote:

@RussTedrake https://github.com/RussTedrake, I think your new proposal is workable. But I see the CRTP version's lack of a need to determine in advance which types you're going to accept as a big advantage. It has (# of methods * # of different scalar types * # of different eigen types) fewer lines of boilerplate code per DrakeSystem (unless you use the macro thing, which I wouldn't consider clean at all; I'm not proud of that part of the new DrakeJoint). If we decide to support N more scalar types / Eigen types, we have to do zero work with the CRTP version, instead of doing O(N * # of classes in the hierarchy) work. Also, you can put things like input vector size checks in the base class, instead of requiring every class in the hierarchy to reimplement those.

If we decide to put our templated implementations into header files, then I agree with you completely. If we decide they have to live in libraries, then I do not. I think you would need the same number of explicit instantiations as the non-CRTP version needs. it is precisely doing the explicit instantiations, imho in a cleaner/better way. we were going to have those sitting at the bottom of the class file if we don’t effectively have them here.

That is not what I was proposing with the CRTP version. We need to stop putting the implementation of template classes outside the header file. It is unworkable. It is not what the rest of the world does. Without it there is no need for explicit instantiations.

Totally agree that it is unworkable. But on windows it already requires 20 minutes on some machines to compile RigidBodyManipulator.cpp . Would you have that 20 minutes x number of apps (of which we are going to have a LOT)? not to mention the footprint of the resulting files…

I think this is the first big decision we really need to make — can we get all of our templated code into header files? Agreed that’s really the only place where the templates are magical.

ah. i see, so everybody that will be derived from has to implement their own derived property again? seems like a pretty big burden?

no. Only the top level class has a Derived& member. All you have to do is call the base class constructor to pass *this up the chain.

I guess I didn’t mean multiple copies of the property. I was thinking that they had to use the derived.() syntax in their implementations. But it makes sense now that they do not.

i hate dealing with explicit instantiations of eigen types. I think many of us have lost many hours on them

I couldn't agree more.

:) The problem is that it's very dangerous to mix and match. If you have a class that can be compiled in two slightly different ways by two different obj files that share pointers, that can be super hard to track down. Twan knows that i fear this right down to my core.

But I still don't understand it. The number of instantiations of class methods has absolutely no effect on the size of the object, or how it's laid out in memory. I sent you an Eclipse test project several months ago to demonstrate that. It would be very strange if the most common way of using templated functions/class methods were unsafe in any way. Could you provide a concrete example that exhibits what you're afraid of?

I know you did. My (potentially irrational) fear comes from the haunting memory issues we had for about 3 months leading up to the VRC. I tried everything I could to debug it. Turned out it was a #define in a header file — not a template — that was making an object have two different sizes. I thought it was doing that by adding new methods to the file, but perhaps it was adding properties as well.

We didn’t finish the conversation from the thread before. Can you remind me why is it not the case that more methods => bigger vtable => different memory footprint?

the CRTP has the advantage that missing implementations would be caught at link time

compile time, actually, as long as you stop putting the damn implementations of templated methods outside the header. The virtual methods implementation also catches missing stuff at compile time (at least if you keep the templated output_implementation method in the header, where it should be, otherwise you'd have both the implicit explicit instantiations (i.e. the virtual methods in your version), as well as actual explicit instantiations in the cpp file).

Yes, of course. If we move to headers only then I think template magic is definitely the way to go. But I’m not sure we can do that…

Thanks for this. I really want to talk this out. :)

rdeits commented 9 years ago

We didn’t finish the conversation from the thread before. Can you remind me why is it not the case that more methods => bigger vtable => different memory footprint?

I believe there's just one copy of the vtable per class, not one per object, and each object in a given class just stores a pointer to that common vtable.

tkoolen commented 9 years ago

See https://github.com/tkoolen/crtpDemo for a proof of concept using CRTP. Note that there are three levels in the hierarchy and the second level defines one of the methods with the same name as in the first level, without shadowing issues.

tkoolen commented 9 years ago

Thanks for this. I really want to talk this out. :)

Yeah, this is good. Maybe we should have another triage meeting too sometime.

Caveat: there is no requirement for how to implement virtual functions in the standard. But actually, if your class doesn't define any virtual methods, then it doesn't even need a vtable. A vtable is a lookup table for virtual functions only. Virtual functions cannot be templated, otherwise we wouldn't be having this discussion.

And Robin is right. Sane compilers will add a vtable pointer to a class that needs it, not a copy of the vtable.

See http://www.cs.technion.ac.il/users/yechiel/c++-faq/dyn-binding.html

tkoolen commented 9 years ago

About naming the implementation methods of the derived class the same as the interface methods: OK that's not a good idea. If you forget to implement a method in a derived class, the program compiles but you get an infinite loop if you try to call the method :-) https://stackoverflow.com/questions/14625801/why-are-crtp-impementation-and-interface-methods-named-differently

RussTedrake commented 9 years ago

Re: One vtable per class.

So let's play out the multi-library scenario. Library A compiles class Foo with 2 template instantiations. Library B compiles class Foo with 3 template instantiations. Library A instantiates Foo and passes the pointer to Library B. Library B's copy of the object will still point to Library A version's vtable... correct? It's a little scary thinking through all of the cases like copy constructors and assignment operators across the libraries, but maybe it all works out ok.

Re: crtp demo

Great to see that, thanks. That's what I was saying -- you have to make the implementations private to avoid the shadowing. :)

CRTP vs explicit type listing (in the header)

I think if we have a header-only library, then the CRTP would allow us to add types (and handle the massive permutations of Eigen types) much more seemlessly. I still think that there is a serious cost to the template trickery in code readability and ease of use. I agree that it's simple enough once you understand it, but there is definitely a start-up cost.
If we have a cpp-based library, then both approaches need to explicitly list all types. This incurs the number of classes X number of types X number of methods cost in both models. In my (current) view, listing them in the header is better because it avoids the need for the template machinery and makes the code much more readable. I think it would also make the library interface that we provide to the world more usable.
I don't see how we can be a header-only library. We have at least a few pieces of code with huge footprints (e.g. RigidBodyManipulator), and we are going to have many many entry points (if swig comes through, possibly even one per method + public property x number of classes, + all of our stand-alone "apps"). I think using header only templates for non-class functions like in drakeGradientUtil is definitely the way to go.

RussTedrake commented 9 years ago

for reference, on this machine, libdrakeRBM is ~ 1MB (libdrakeIK is ~2MB) and libdrakeMexUtil is ~55k.

avalenzu commented 9 years ago

My (potentially irrational) fear comes from the haunting memory issues we had for about 3 months leading up to the VRC. I tried everything I could to debug it. Turned out it was a #define in a header file — not a template — that was making an object have two different sizes. I thought it was doing that by adding new methods to the file, but perhaps it was adding properties as well.

Just dug back into the history of those dark days before the VRC. The offender was our original attempt to add Bullet support to drake (7c1bc30bef0241c861b0d73ba2749506b99f764f). It definitely adds member variables to the class based on a #define. Here's the hunk from RigidBodyManipulator.h:

#ifdef BULLET_COLLISION
  btDefaultCollisionConfiguration bt_collision_configuration;
  btCollisionDispatcher bt_collision_dispatcher;
  btDbvtBroadphase bt_collision_broadphase;

public:
  btCollisionWorld bt_collision_world;
#endif

tkoolen commented 9 years ago

well there's your problem. The #define adds member variables to the class, thus changing the size of every instance of the class. This is not possible using templates.

tkoolen commented 9 years ago

Re: multi-library scenario. But the number of instantiations of templated methods has no effect on any vtable since the vtable contains virtual method pointers and virtual methods cannot be templated.

I don't see how we can be a header-only library. We have at least a few pieces of code with huge footprints (e.g. RigidBodyManipulator) [...]

Moving RigidBodyManipulator methods to the header and getting rid of the explicit instantiations will make it so that you only compile the instantiations that you actually need. Does anybody ever go through the list of explicit instantiations to prune out those we're not using anymore (and are hence compiling for no reason)?

RussTedrake commented 9 years ago

It's true -- not only do we not prune but we have to add every case we ever touch because it's so annoying to hit a template argument that's not in the library.

But i think that the parts we use are still very big. No? And again, we're talking about LOTS of entry points.

tkoolen commented 9 years ago

Say you need to use N instantiations of a method. In my view, either you have (at least) N explicit instantiations, or you have zero if you put the code in the header. In either case, you'll need to compile at least N instantiations (explicit or otherwise), because you need them. The only difference is that in the header only case you're telling the compiler to go figure it out, whereas in the explicit instantiation case you're burdening the programmer with it. The size of any parts we use is irrelevant, since (at least) the same amount of work needs to be done by the compiler in the case of explicit instantiations.

RussTedrake commented 9 years ago

Let's think out the idea of having a (primarily) header-based library -- I'm definitely willing to consider it:

i think compile time would go up. Not sure if it would go WAY up.
compile time during development will also go up -- if you're working on an algorithm it will have to get recompiled in many places for every change (not just relinked)
i think we would probably want to compile libraries for the mex interfaces (to support having many small entry points). Swig/etc might help with that -- eg by autogenerating the explicit instantiations.
linking issues would get easier :). ( minimal need for dllexports, rpaths, etc )

What else is going to change?

avalenzu commented 9 years ago

Just to make sure I'm understanding you correctly, @RussTedrake, you think compile time would go up because we'd be compiling the same instantiations for multiple executables?

RussTedrake commented 9 years ago

@tkoolen - your argument about N isn't acknowledging that the compiler has to redo the work for every entry point. In the library case the work is done only once per library

psiorx commented 9 years ago

Compile time would go up, at least in part, due to the fact that the compiler has to do more work to infer the usages of a particular library instead of being told up front. It's unclear how much it would increase but my guess is it would be marginal.

Removing the headache of explicit instantiations and adding the ability to generate only the necessary template expansions dynamically and automatically outweighs any compile time and code size counterarguments imho.

patmarion commented 9 years ago

My own preference is toward APIs like the one Russ posted in the Multipler / TimesThree example code. I've got no problem with templates for internal implementations, but I'd prefer to not design core public API around template patterns like the CRTP. I think that's asking a lot for users of your library to grok. The API like Russ showed helps steer users toward the preferred types and makes their code simple and readable. Then again, you guys are trying to do some pretty fancy stuff, so maybe these template patterns are just want to want. Also consider language wrappers, and how you want Python and Matlab to map to the C++ api.

tkoolen commented 9 years ago

@tkoolen - your argument about N isn't acknowledging that the compiler has to redo the work for every entry point. In the library case the work is done only once per library

Point taken.

Compile time would go up, at least in part, due to the fact that the compiler has to do more work to infer the usages of a particular library instead of being told up front. It's unclear how much it would increase but my guess is it would be marginal.

I would guess that figuring out which instantiation is required implicitly from the usage is no more difficult for the compiler than parsing an explicit instantiation since it has to keep track of all the arguments with which the templated function is called anyway. But Russ does have a valid point that template instantiations will have to be compiled more than once when requested from multiple compilation units.

Removing the headache of explicit instantiations and adding the ability to generate only the necessary template expansions dynamically and automatically outweighs any compile time and code size counterarguments imho.

I completely agree. If we weigh not having to compile unused, old explicit instantiations against compiling the same template more than once from different compilation units, I think that compile time will still likely go up a bit. But the convenience for the programmer simply outweighs it for me.

Now explicit instantiations may make sense for some particular use case (e.g. to reduce compilation time if something really is static over a long period of time). But we should do that when compilation time becomes a problem. What I'm trying to do is change our default behavior and take away any fear of having template definitions in the header files.

tkoolen commented 9 years ago

@patmarion, those are valid arguments. My personal preference is to not have a bunch of slight variations for every method in your main base class, but I understand your point of view as well. As for the language wrappers, I think that should be OK. @rdeits, correct me if I'm wrong, but I believe you can tell Swig what instantiations to wrap. Also my little mexify thingy could handle that.

psiorx commented 9 years ago

Taken from the Google C++ Style Guide (https://google-styleguide.googlecode.com/svn/trunk/cppguide.html)

Avoid complicated template programming.

If that fails:

If a template or inline function is declared in a .h file, define it in that same file.

RussTedrake commented 9 years ago

fwiw - i asked a few people today at the meeting i'm attending at the NSF. caveat: they only got minimal context from a hallway conversation. but most people who seem to know say they try to stay away from templates except for very simple small interface classes.

In my view, the main argument for templates is the fact that Eigen forces a huge number of types on us.

I'm still torn, but am enjoying the debate.

@avalenzu -- yes - my statement about compile time is completely about compiling the same code many times for different entry points. I don't think the work that the compiler has to do for any one entry point would change in any noticeable way.

tkoolen commented 9 years ago

My current stance: I still have a preference for the CRTP version of DrakeSystem because it requires less boilerplate code per DrakeSystem subclass and is more flexible (and it results in higher performance, but that's a minor issue). In the virtual methods design you still need a templated method to avoid having to write your dynamics function five times, so a drake user will still need some understanding of templates. But now that new drake user is going to wonder why he also needs five additional virtual dynamics methods. It's just not as clean as it can be.

However, I don't think it's the end of the world if we go with the virtual methods version if we do consider it easier to understand, as long as we put the implementations in template methods as in your Multiplier/TimesThree example.

Separate from the DrakeSystem design debate, I do really want to change our default behavior when it comes to explicit instantiations of templated methods. If we decide to use templated methods / functions somewhere, then the default behavior should be to put the implementation in the header. If you're not convinced of that one, please let me know, because I really want to work it out.

RussTedrake commented 9 years ago

Thanks @tkoolen . I actually think that the virtual methods version is more flexible in some ways, too -- a derived class can implement a subset of the argument types (e.g. just Eigen::VectorXd inputs). It could even overload just one virtual function directly, it's only if people want the additional flexibility of handling many types when they would need to implement their own local template. I still think that we may have some systems that cannot support, e.g. autodiff or polynomials -- in CRTP they would have to explicitly error in their method implementation for all of the types they do NOT support, or suffer the compile errors. Am I correct about that? Also, most derived systems implement either dynamics or update, but according to your infinite loop report, we would require them to author both for every class.

Re: no more explicit instantiations. I agree with the philosophy, but don't think it's quite that simple. The doKinematics and the dynamics methods, for instance, are templated. If we agree that code size/compile time is an issue, then we should move them into the .cpp files. If they get called from a typed virtual function in a header file, then that would be sufficient to cause the compiler to instantiate all of the relevant type versions into the library at compile time (so at least there are no bizarre explicit instantiation lines at the bottom of cpp files). I do agree that we should move templated method implementations into header files in lots of cases, just probably not every case.

My current stance: Though I'm finding myself arguing for the virtual method version, I still see the virtues of CRTP. The biggest argument FOR it, in my view, is the ridiculous number of Eigen argument templates that we might need (for ::Block, ::Map, ::Ref, ...).

A proposal: I will flesh out the simplest systems in my dev branch with the virtual method approach (since I had already started that before this conversation took off), and try to get a sense for just how bad it might be. We'll review together and go from there.

More thoughts/comments always welcome.

tkoolen commented 9 years ago

I still think that we may have some systems that cannot support, e.g. autodiff or polynomials -- in CRTP they would have to explicitly error in their method implementation for all of the types they do NOT support, or suffer the compile errors. Am I correct about that?

But you can make the compile error very informative. Consider putting this at the beginning of Pendulum::dynamics_impl:

static_assert(std::is_floating_point<typename DerivedX::Scalar>::value, "Method only accepts floating point value types");

or if you want to be even more strict:

static_assert(std::is_same<typename DerivedX::Scalar, double>::value, "Method only accepts double");

So if you pass in something other than a vector of doubles, you'll get a nice message at compile time. I'm not saying that all compiler errors related to templates are going to be super easy to understand for beginners (although it's much better than it used to be, especially on clang), but constraints like this can be made very clear.

If you prefer a runtime error over a compile time error, you can implement that too, but why would you want to?

On the other hand, in the virtual methods version, you would still need to implement every variant of dynamics (otherwise you'll get a compiler error saying that you can't instantiate an abstract class). So you're going to have to throw a runtime error in every one of the variants of the method that you don't support.

If for some reason you want dynamics_impl to handle e.g. VectorXd<TrigPoly> inputs differently, then you can just add a template specialization for that.

I do agree that we should move templated method implementations into header files in lots of cases, just probably not every case.

Yup, there may be cases where it is better to have explicit instantiations, I just want to change the default behavior, and so do you it seems.

If we agree that code size/compile time is an issue, then we should move them into the .cpp files

Exactly, if we see that it's an issue, we can deviate from the hopefully new default behavior.

A proposal: I will flesh out the simplest systems in my dev branch with the virtual method approach (since I had already started that before this conversation took off), and try to get a sense for just how bad it might be. We'll review together and go from there.

sounds good.

BTW, as I mentioned before, in the virtual methods implementation, handling Map<...>, Block<...> etc can all be done with Ref. I would propose this:

class DrakeSystem {
public:
  typedef Matrix<TrigPoly, Dynamic, 1> VectorXTrig;
  virtual VectorXd dynamics(double t, const Ref<VectorXd>& x, const Ref<VectorXd>& u);
  virtual VectorXTrig dynamics(double t, const Ref<VectorXTrig>& x, const Ref<VectorXTrig>& u);
};

It's not as efficient as templating, but it's not terrible and limits you to a smaller number of different variants of each method (double, AutoDiffScalar<VectorXd>, TrigPoly, Polynomial).

RussTedrake commented 9 years ago

I'm not sure I really understand the implications of Ref. If I understand the documentation correctly, t looks like the overhead should be minimal (zero? if it's templated away) if the Matrix passed in has its memory allocated contiguously?

avalenzu commented 9 years ago

I actually think that the virtual methods version is more flexible in some ways, too -- a derived class can implement a subset of the argument types (e.g. just Eigen::VectorXd inputs). It could even overload just one virtual function directly

Won't this cause silent errors if one tries to call a method with an input type that's supported by the parent class and not by the derived class? For example, if I change the TimesThree class to be

class TimesThree : public Multiplier {
public:
  virtual double  output(double a)  { return output_implementation(a); };

private: // so that it does not "hide" the base class implementation
  double output_implementation(double a) {
    cout << "TimesThree " << "double" << endl;
    return 3*a;
  }
};

in your example from above, I get the following output:

Multiplier i                                                                                                                                                     
Multiplier d                                                                                                                                                     
Multiplier i                                                                                                                                                     
TimesThree double

Note that in the third line the output method from Multiplier is called.

RussTedrake commented 9 years ago

@avalenzu -- just like in DrakeSystem.m, my base class will not be pure virtual, but will throw a runtime error saying "you must implement this method". @tkoolen also said we'd have to implement every case, but that's not true.

RussTedrake commented 9 years ago

@avalenzu -- thinking about it more, i see your point. I don't think it's a problem near the base class, but it could be a problem deeper into the tree.

What behavior do we want in an ideal world? A - implement only the types you want (possibly via templates) and fall back to the parent class for the missing types B - implement only the types you want and error immediately (compile time is great, runtime if we must) C - implement all types by default, manually exclude any types you don't support (eg with template specilaization)

tkoolen commented 9 years ago

ah yes, if you have the virtual method variants call the templated implementation, you can do the same static_assert and template specialization stuff in the templated method, resulting in compile time errors. I think that's highly preferable over throwing runtime_errors in the derived class methods, and especially preferable over throwing errors by default in DrakeSystem, because that'll make you forget to implement methods in the derived classes. So we're back to the virtual methods version just having more boilerplate.

tkoolen commented 9 years ago

I'm not sure I really understand the implications of Ref. If I understand the documentation correctly, t looks like the overhead should be minimal (zero? if it's templated away) if the Matrix passed in has its memory allocated contiguously?

No, http://eigen.tuxfamily.org/dox-devel/classEigen_1_1Ref.html says the opposite about an example function foo3 using Ref directly:

The downside here is that the function foo3 might be significantly slower than foo1 because it won't be able to exploit vectorization, and will involve more expensive address computations even if the input is contiguously stored in memory.

However, they do also say that

To overcome this issue, one might propose to overload internally calling a template function

which is what we're planning to do, so I guess it's not a problem.

tkoolen commented 9 years ago

Hey, how's this idea (actually already implemented in DrakeJoint).

We have a non-templated top level DrakeSystem interface with all the virtual method variants, then just below that in the hierarchy DrakeSystemImpl (or somthing like that) which is a CRTP base class, calling templated implementations in the derived class member. That way:

we only have to implement the virtual methods once, making the boilerplate go away
the top level interface still 'helps steer users toward the preferred types' as Pat put it.

patmarion commented 9 years ago

@tkoolen the downside comment you quoted only applies to Ref objects with certain stride options, if I'm not mistaken. In the default case, the Ref requires contiguous memory, and does exploit contiguous memory vectorization, but then requires a memory copy for non-contiguous memory data, and therefore doesn't work for non-readony arguments (non const)

tkoolen commented 9 years ago

Yeah, but regardless of stride options it 'will involve more expensive address computations even if the input is contiguously stored in memory.' But really, no biggie. Remember I'm the person recommending the use of Ref in this case.

patmarion commented 9 years ago

I think you are misinterpreting what they are saying. They are saying that by specifying the InnerStride option, it will be slower, even for contiguous memory. That's not saying it will always be slower.

The downside here is that the function foo3 might be significantly slower than foo1

tkoolen commented 9 years ago

This is getting the real discussion off track, but I'll reply one more time to the ref stuff.

Whether or not the type has a contiguous memory layout is directly determined by the stride options. ~~Ref inherits from Map~~ The implementation RefBase inherits from MapBase. From the Map documentation (http://eigen.tuxfamily.org/dox/classEigen_1_1Map.html):

StrideType: optionally specifies strides. By default, Map assumes the memory layout of an ordinary, contiguous array. This can be overridden by specifying strides. [...]

But the Ref documentation states that the function "will involve more expensive address computations even if the input is contiguously stored in memory", i.e. has the default stride options.

RobotLocomotion / drake