opensim-org / opensim-core

SimTK OpenSim C++ libraries and command-line applications, and Java/Python wrapping.
https://opensim.stanford.edu
Apache License 2.0

finalizeConnection() failing when external force is added to model #3204

Open itbellix opened 2 years ago

itbellix commented 2 years ago

Hi, I am working with the OpenSim API in Matlab R2021b (on a Windows 10 machine) and I am trying to integrate an external force directly into my model, arm26.osim. At first, I tried to use the ExternalLoads class as in this comment, but it did not really work: I had the feeling that the external load, which I was using just as a sort of wrapper for the external force to be applied, was somehow corrupting the data source for the force values. So I switched to applying an ExternalForce directly to my model, as in the following code snippet:

% external_force variable is created above, together with external_force.mot 
% storing the time history of the force and the application point

model.addForce(external_force);
model.finalizeConnections();

When running this, I get one or the other of the following errors:

Java exception occurred:
java.lang.RuntimeException: bad allocation

    at org.opensim.modeling.opensimSimulationJNI.Model_finalizeConnections__SWIG_0(Native Method)

    at org.opensim.modeling.Model.finalizeConnections(Model.java:830)

or

Java exception occurred:
java.lang.RuntimeException: ExternalForce: Data source external_force specified by name, but was set.

    at org.opensim.modeling.opensimSimulationJNI.Model_finalizeConnections__SWIG_0(Native Method)

    at org.opensim.modeling.Model.finalizeConnections(Model.java:830)

To solve this, I needed to modify my code: after adding the force to the model, I set its data source again, and the errors disappear:

% external_force variable is created above, together with external_force.mot 
% storing the time history of the force and the application point
model.addForce(external_force);

external_force_storage = Storage('external_force.mot', false);       % loading the external force data in a storage object
force_in_model = model.getForceSet().get(6);                               % getting the force from the model (6 is the right index)
ExternalForce.safeDownCast(force_in_model).setDataSource(external_force_storage);   % resetting the data source to be the one I expect

model.finalizeConnections();

I would think that somehow the information (pointers?) regarding which data source to use gets lost when the force is loaded into the model...

Does this appear plausible?

aymanhab commented 2 years ago

This seems plausible, though it is possibly a bug. Generally there are two mechanisms for populating the model tree of components:

  1. XML: in which case all connections are based on names/paths that are serialized.
  2. Creating the objects programmatically and carefully constructing the tree to make sure connections are not lost along the way.

You're trying to mix and match (call it case 3) by loading the model and then adding a Force to it. In theory it should work, but it appears there's an unintended interaction between 1 and 2 that would be good to sort out. If you can attach the full set of files and the script needed to reproduce it, that would be great. Thank you.

itbellix commented 2 years ago

Dear @aymanhab, thank you very much for your answer. Indeed, I am using case 3, but today I tried to reproduce the issue and it did not pop up. I am not sure whether that is simply because I restarted Matlab (and my laptop as well), and the issue yesterday was caused by a not-really-clean workspace. If the issue arises again, I will attach here what is needed to reproduce it. Thank you!

itbellix commented 2 years ago

Dear @aymanhab, I think I have some interesting updates: it seems to me that the errors I reported are thrown when calling model.initSystem() (after an ExternalForce is applied to the model), mostly when a breakpoint is inserted and the code flow is stopped for at least a few seconds. I cannot really make sense of it, but I attach here a zipped folder whose contents let me reproduce the error fairly consistently. The script to run is test_withArm26.m - I guess the relevant part of the code is from line 40 onwards - and I also include some other functions that are used to read the .trc file and generate the force that I want to apply to the model.

I did some experiments, and I report my results in the table below. In particular, I analyze the case in which I inserted a breakpoint at line 61, and the case in which no breakpoint was inserted. For each case, I split the results into two subcases: when lines 45 to 49 are commented out, and when they are not. I report the number of times that I ran the code, along with the number of errors that I saw, and it looks like the presence of the breakpoint is causing the problem:

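| | Lines 45-49 commented out: trials | Lines 45-49 commented out: errors | Lines 45-49 in use: trials | Lines 45-49 in use: errors |
|---|---|---|---|---|
| No breakpoint | 25 | 0 | 25 | 0 |
| Breakpoint at line 61 | 23 | 13 | 25 | 0 |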

I would like to stress that, even if it seems a bit unreasonable, when the breakpoint is inserted and I remove it very quickly and hit "continue", the error almost never arises...

Please let me know if you are able to reproduce this, and if so, how I can help further. Anyway, it seems to me that a temporary solution is to "reset" the data source for the ExternalForce once it is applied to the model (which is what I am doing in lines 45-49). However, I am pretty sure that I got the error at least once or twice even when those lines were included, when running more complex code that I did not include here but that follows the same steps as test_withArm26.m

cc @aseth1

UPDATE: I tried to use a pause() command in my code instead of inserting a breakpoint manually, so now my code (from line 61 of test_withArm26.m onwards) looks like:

for i = 1:muscles.getSize()
   % Downcast base muscle to Thelen2003Muscle
   pause(x);
   muscle = Thelen2003Muscle.safeDownCast(muscles.get(i-1)); 
   Fmax(i) = muscle.get_max_isometric_force();
end

If x is big enough (1 second or more), I still see the errors I am reporting most of the time. When I set x=0.5 or lower, I think the situation is better, but the errors are still thrown every once in a while.

fcanderson commented 2 years ago

I have not been working in OpenSim for a long time now, but, for what it is worth, the intermittent, time-dependent errors you are getting look to me like it might be related to how memory is being managed. I really don't have experience with Matlab, but when writing Python bindings, memory management issues can be tricky. There are basically two options: 1) create copies of objects that are moved across the language boundary (C++ to Python, etc.) or 2) move a reference across the boundary (no copy is made). The first option is relatively safe because deleting the copy does not affect the C++ side (although having a copy may not work if changes need to be seen on the C++ side). The second option is more efficient, but if an object is deleted (or garbage collected) prematurely, a crash usually occurs.

The behavior you are seeing @itbellix makes me think some garbage collection is occurring when it shouldn't be. If you access the objects quickly (before garbage collection has had a chance to occur), a crash is avoided. However, if you wait longer and give the garbage collection system time to act, then an error occurs if an object has been prematurely deleted. The behavior is not entirely predictable, because sometimes the garbage collection can happen quickly and sometimes it can happen more slowly.
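The timing dependence described above can be reproduced in plain Python without OpenSim. The sketch below is only an analogy: `Storage` and `Force` are hypothetical stand-ins (not the OpenSim classes), a weak reference plays the role of a non-owning pointer, and a deliberate reference cycle makes the storage collectable only when the garbage collector actually runs. In real bindings a dangling pointer would produce a crash or a corrupted read rather than a clean exception.

```python
import gc
import weakref


class Storage:
    """Hypothetical stand-in for a table of force data (not the OpenSim class)."""

    def __init__(self, name):
        self.name = name
        # Deliberate reference cycle: reference counting alone cannot free this
        # object, so it survives until the cyclic garbage collector runs.
        self._cycle = self


class Force:
    """Hypothetical stand-in for a force that borrows its data source."""

    def __init__(self, storage):
        # A weak reference does not keep the storage alive, similar to a raw
        # pointer held on the C++ side of a language binding.
        self._storage_ref = weakref.ref(storage)

    def read(self):
        storage = self._storage_ref()
        if storage is None:
            raise RuntimeError("data source was garbage collected")
        return storage.name


def build_force():
    # Nothing outside this function keeps a strong reference to the storage.
    return Force(Storage("external_force"))


gc.disable()                 # make the collection moment explicit for the demo
force = build_force()
early_read = force.read()    # succeeds: the collector has not run yet

gc.collect()                 # stands in for "waiting long enough"
try:
    force.read()
    late_read_failed = False
except RuntimeError:
    late_read_failed = True
gc.enable()
```

The early read succeeds and the late one fails even though the script did nothing to the force in between, which mirrors the breakpoint and pause() observations.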

I could be totally off-base here. @aymanhab will likely be able to confirm whether or not premature garbage collection could be a possibility.

For what it's worth.

aymanhab commented 2 years ago

Thanks for chiming in @fcanderson. Indeed this is the most common cause of sporadic crashes. In this specific case it's more complicated because the Storage from which the data is retrieved is not maintained by the ExternalForce object, so it can easily get deleted/garbage-collected from underneath the force, causing the sporadic crashes. I checked our codebase and the test case (link below) maintains the Storage separately. https://github.com/opensim-org/opensim-core/blob/d2b0c42642a4619ac36f30f7f36267ebaa9c17c2/OpenSim/Simulation/Test/testForces.cpp#L1697 You may try to create/keep the Storage alive in the main function where initSystem is called to check if that fixes the problem. In the meantime, I will leave this issue open until a fix is found/verified.

itbellix commented 2 years ago

Dear @fcanderson and @aymanhab, thank you for your answers. I think it is indeed a memory management problem, since I am creating the ExternalForce in a separate function (hence, a different workspace), and then trying to use it in my main workspace. I think that the way the Storage object is handled (in ExternalForce.h, line 250) could be the cause of it, since once the ExternalForce object is returned by a function it loses the connection with its storage (and that's why it works if I reset that connection).

To conclude, I was using the API incorrectly. However, I do think that updating the documentation (perhaps by mentioning this problem) could help other users avoid running into the same error.
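One way to avoid losing the storage when the force is built inside a helper function is to hand both objects back to the caller together. The sketch below illustrates that ownership pattern in plain Python; `Storage`, `Force`, `ForceWithData` and `create_external_force` are hypothetical names used for illustration only (not OpenSim API), and the weak reference again stands in for a non-owning pointer.

```python
import gc
import weakref
from dataclasses import dataclass


class Storage:
    """Hypothetical stand-in for the table holding the force time history."""

    def __init__(self, name):
        self.name = name


class Force:
    """Hypothetical stand-in for a force that borrows, but does not own, its data."""

    def __init__(self, storage):
        self._storage_ref = weakref.ref(storage)  # non-owning link

    def has_data(self):
        return self._storage_ref() is not None


@dataclass
class ForceWithData:
    """Bundle that keeps the data source alive for as long as the force is used."""

    force: Force
    storage: Storage  # strong reference held by the caller through the bundle


def create_external_force(name):
    # Return the storage together with the force instead of dropping it
    # when this function's local scope ends.
    storage = Storage(name)
    return ForceWithData(force=Force(storage), storage=storage)


bundle = create_external_force("external_force")
gc.collect()  # the storage survives because the bundle still references it
data_still_available = bundle.force.has_data()
```

As long as `bundle` stays in scope in the main script, the data source cannot be collected from underneath the force, which is the same idea as keeping the Storage alive in the function that calls initSystem.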

mrrezaie commented 8 months ago

Hi, I'm using Python and have a similar issue. I'm trying to add external loads (GRF) to the model (model.getForceSet().cloneAndAppend), followed by further API usage in which model.initSystem() has to be called again. This is what happens:

  1. If I write all the code consecutively, it works well.
  2. If I put the externalForce sections into a function, in either the same file or a separate file, and run the entire code, this error is raised:

File F:\Python311\Lib\site-packages\opensim\simulation.py:27206 in initSystem
    return _simulation.Model_initSystem(self)
RuntimeError: std::exception in 'SimTK::State & OpenSim::Model::initSystem()': ExternalForce: Data source subject11_trial03_grf specified by name, but  was set.

Please notice the empty name (double space) after "but". It seems that the model no longer has access to the data source Storage.

OpenSim v.4.5

Thank you.