dartsim / dart

DART: Dynamic Animation and Robotics Toolkit
http://dartsim.github.io/
BSD 2-Clause "Simplified" License
889 stars 286 forks source link

Shapes are not properly cloned #896

Open costashatz opened 7 years ago

costashatz commented 7 years ago

Hello,

I have a minor (but critical to my applications) issue regarding the clone function of Skeletons. So, when I do:

auto skel_clone = my_skeleton->clone();

I was expecting that everything (including the visualization and collision shapes) are cloned (i.e. copied), but it seems that this is not the case. This is very easily replicated with something like the following:

// includes and code

namespace {
    dart::dynamics::SkeletonPtr global_robot;
}

void load_global_robot() {
// code to load/create the global skeleton
}

void add_damage(const dart::dynamics::SkeletonPtr& skel){
    std::string leg_bd_name = "my_damaged_body_name";
    auto bd = skel->getBodyNode(leg_bd_name);
    // set half the mass
    bd->setMass(bd->getMass() / 2.0);
    auto nodes = bd->getShapeNodes();

    for (auto node : nodes) {
        Eigen::Vector3d tr = node->getRelativeTranslation();
        tr(1) = tr(1) / 2.0;
        node->setRelativeTranslation(tr);
        // here's the problem -- the pointers 's' are always the same
        auto s = node->getShape();
        if (s->getType() == "BoxShape") {
            auto b = (dart::dynamics::BoxShape*)s.get();
            Eigen::Vector3d size = b->getSize();
            size(2) = size(2) / 2.0;
            b->setSize(size);
        }
        else if (s->getType() == "CylinderShape") {
            auto b = (dart::dynamics::CylinderShape*)s.get();
            b->setHeight(b->getHeight() / 2.0);
        }
    }
}

int main(int argc, char** argv)
{
    auto my_new_skel = global::global_robot->clone();
    add_damage(my_new_skel);
    // if now we print the change shapes (in our case, lengths or size) for the global::global_robot, they will have the changed values and not the original values as they should
    return 0;
}

I know my use-case is not the general use-case (i.e., usually we do not intentionally break our robots) and I have found ways to by-pass it, I think it should be easy for you to do in DART.

Many thanks in advance, Konstantinos

mxgrey commented 7 years ago

I believe Shapes are shallow copied (rather than deep copied) when a cloning is performed. This was mostly to avoid the considerable cost of deep copying large geometric structures like meshes. (But also, this decision deferred the need to implement clone functions for each Shape type, which I reluctantly admit may have played a role in the decision.)

There is an ongoing discussion about how to handle the choice between a deep copy and a shallow copy. I think the best case scenario would be a copy-on-write for all properties where creating a clone will only do a shallow copy of everything, but any call to a setXXX(~) function will automatically create a deep copy for whichever property field the setter function is affecting.

The holdup with that is @jslee02 has proposed the ability to define master-slave relationships between clone properties where changing a property in a master clone will also change the properties in any slave clones that are linked to it.

While I definitely like that idea, it's not immediately obvious to me what the most sensible way to implement the interface for that would be. Should it be a setting at the Skeleton level? Should it be a setting for each entity (e.g. Joint, BodyNode, ShapeNode, etc) within the Skeleton? Should each setter function offer a flag that allows the user to override whatever the current setting happens to be?

I could imagine an enumeration like:

enum CloneFlag
{
  OBEY_SKELETON, // Tell the object to use the flag of whatever Skeleton it is in
  OBEY_BODYNODE, // Tell the Node (e.g. ShapeNode) to use the flag of whatever BodyNode it's attached to
  COPY_ON_WRITE,
  DO_NOT_COPY
};

Please let me know if you have any thoughts on this.

jslee02 commented 7 years ago

I implemented Shapes to do shallow copy for the reason @mxgrey said, but I also didn't consider the case that the Shape properties can be changed. Now I believe the properties of an object should be always (logically) copied when it's cloned rather than sharing the property values across clones after cloning for that reason. (Honestly, I cannot recall myself in what context I proposed the master-slave relationships. 😓)

In this sense, I like the idea of copy-on-write mechanism to avoid unnecessary copies when the properties are not changed in the clones. So the properties are (physically) copied only when they are changed. Of course, there would be an overhead to check when the deep copy is necessary, but I believe it would be reasonably negligible compared to the advantages.

In addition to copy-on-write, we could add the master-slave relationship between clones, but let's not to consider it until we find a compelling reason why we need that.

mxgrey commented 7 years ago

One use case for the master-slave would be if you have a bunch of simulations running in parallel and you want to change some model parameter across all instances for a new batch of simulations. If you did a simple copy-on-write to change the parameter, then every single simulation instance (minus one) would end up creating its own copy of the properties as you modify each one of them, even though you ultimately want them all to share their properties at the end.

jslee02 commented 7 years ago

Thanks for taking an use case for me. That makes sense. If we would like to introduce master-slave, then I see several possible states of a property.

By explicitly distinguishing the state, we might be able to get a better idea how to efficiently implement concurrency for getting and setting those properties.

costashatz commented 7 years ago

@mxgrey and @jslee02 thanks for the ultra fast responses.

I like the master-slave idea and the copy-on-write, copy-now functionalities and of course it would be a great addition to the library. Nevertheless, in all of my use-cases a functionality like the following is more than enough:

In short, I do not really need the copy-on-write and the master-slave functionalities and although they do exist use-cases where they might come in handy, I would say that they are not that often. And I am saying that even though my regular use case is to have multiple simulated robots (in different worlds with changing properties) running in parallel.

mxgrey commented 7 years ago

I've talked to at least one person who chose not to use DART because of the fact that we mostly do deep copying of everything (except for shapes), and they felt this is wasteful for cases where very many simulations are being run in parallel on the same model. Copy-on-write is very desirable because it would provide the best of both worlds where there are very few wasted resources when using the same model in many different threads at once, and it takes no effort to modify one instance of the model without affecting the other models or wasting any additional resources.

I believe we can implement copy-on-write without changing any of the API for DART, since property storage is taken care of in Aspects, hidden away by setters and getters. I also believe it will be easy to implement the master-slave relationships; the only real challenge is designing how those relationships should be defined. And even that is really just a matter of figuring out what kind of API would make the most sense to end users.

mxgrey commented 6 years ago

I had a thought related to this discussion that I figured I'd share to see if anyone likes it.

We could offer two functions: SkeletonPtr Skeleton::clone() and SkeletonPtr Skeleton::parallelize().

The clone() function would create a new Skeleton with an independent state, and whose properties are shallow-copied, but perform copy-on-write.

The parallelize() function would create a new Skeleton with an independent state, but the properties are shallow copied. Any changes to the properties of a Skeleton will automatically affect all Skeletons that were "parallelized" from it (recursively, so it will also affect Skeletons that were parallelized from any parallelized Skeletons). The states of all the parallelized Skeletons will remain independent.

The only thing that would be tricky is dealing with the addition and subtraction of kinematic structure, like adding and removing BodyNodes. Having a reliable way to update that between various parallelized Skeletons will probably require a considerable redesign of the Skeleton and Node classes, but I think it would be worthwhile.

psigen commented 6 years ago

The copy-on-write semantic sounds like a great idea to make copying more efficient without changing the API or requiring anything from the user.


On the other hand, I'd just like to point out that no one has actually requested "parallelize()" functionality yet...

I'd prefer if we first thought a bit about the API that might manage such a complicated system. Since this seems like basically a form of instancing (a performance technique common throughout 3D rendering communities -- look up "mesh instancing", "geometry instancing", or "material instancing" for examples), we may want to look there for intuitive naming and semantics conventions.