robotology-legacy / mex-wholebodymodel

Matlab MEX interface to the iWholeBodyModel interface.
11 stars 9 forks source link

Wbm class #62

Closed naveenoid closed 7 years ago

naveenoid commented 8 years ago

To track the merging of the WBM class.


This change is Reviewable

Ganimed commented 8 years ago

Hello naveenoid. Is everything alright? Gave the team or the administrator an OK for merging? Or is it already merged?

gabrielenava commented 8 years ago

Today @Ganimed comes into the lab and we tested the WBM-class after merging it in a temporary branch, WBM-test. Apparently, there are no conflicts with the balancing stuff, I also run the ctest and it passed. I think we can merge the WBM-class into master, what do you think @traversaro and @naveenoid ?

gabrielenava commented 8 years ago

Although it has to be specified that I cannot test the WBM-class visualizer since some matlab functions used in it don't exist in matlab 2012b. However, this visualizer is used only in the WBM-class while the balancing code uses the @DanielePucci 's visualizer

traversaro commented 8 years ago

@Ganimed @gabrielenava if the ctest are passing, I am ok to merge the code... perhaps just add a note on the README (or where you think it is useful) to explain what the WBM-class is and its scope/compatibility (mention for example on which version of MATLAB works).

gabrielenava commented 8 years ago

Ok, maybe it is better when we are waiting a little bit until @Ganimed has written a small readme document. He also told to me that it is maybe better when we are merging when he has written the full documentation for it.

Ganimed commented 8 years ago

Hi, over the weekend I will recheck everything. Just to be sure that I have nothing forgotten or overseen an error. When I have written the small readme-file in the next days, we can merge the code. But I want to keep the WBM-Class branch as a working repository. So when I make some updates, changes or additions in the future, it would be easier me and would not interfere with the master branch.

Ganimed commented 8 years ago

P.S. should the readme file written in Markdown-format?

traversaro commented 8 years ago

Yes, Markdown is the easier choice for Readme

Ganimed commented 8 years ago

I've found some hidden design errors in the structure which I have now corrected. Additionally I added now also a dark scene mode for the visualization with a black background. This is better for the eyes. Now only the readme-file is missing. When I have uploaded it (I will announce it when), we can do a merge.

@gabrielenava later when I have some time left, I want to integrate the visualization of the thrust forces. But for that I would then need some explanations to understand what is going on in these two additional if-conditions which are added to the visualizeForwardDynamics-function. Maybe let's do that in June or in summer.

Another question: Where can I find the ctest script-file and how can I call it to verify the newest changes? Thanks.

traversaro commented 8 years ago

@Ganimed to run the test on the repository, feel the instructions available at https://github.com/robotology/mex-wholebodymodel#tests . To understand the logic behind it, check this line of the CMakeLists.txt file in the tests directory: https://github.com/robotology/mex-wholebodymodel/blob/master/tests/CMakeLists.txt#L25

add_test(NAME matlab_mex_wholebodymodel_tests COMMAND ${Matlab_MAIN_PROGRAM} -nodesktop -nojvm -nodisplay -r "runWBMTests")

The test is then just launching the runWBMTests scripts, that has a logic to catch errors in Matlab, and exit with 0 if everything works fine or with 1 if there is an error.

Ganimed commented 8 years ago

Thank you! I will do it as soon as possible. For the documentation of the definition of the joint names in the sourcecode, I have found the wiki-page for the iCub Model Naming Convention (http://wiki.icub.org/wiki/ICub_Model_naming_conventions) as source. There is everything is described what is needed. But following joint names, namely

are not in the naming convention listed. So where I can find their naming definition or their declaration?

Ganimed commented 8 years ago

Correction: I recognised, that the name list in the code contains link names and not joint names. But anyway, the question is still the same.

traversaro commented 8 years ago

"COM" is a magic name for identifing the center of mass, not an actual frame. l_gripper and r_gripper are aliases for l_hand_dh_frame and 'r_hand_dh_frame. I need to check whatneck_1` , but it should be something related to the neck.

Ganimed commented 8 years ago

Oh, that's good to know. At the moment I'm making a short description of the link names which are used in the visualizeForwardDynamics-function. And there I want also to give the source of the convention. But these few names confused me, since they are not listed in the naming convention. Thank's.

Ganimed commented 8 years ago

Maybe "neck_1" means the parent joint of "neck_pitch"?

traversaro commented 8 years ago

neck_1 is not a joint, but it is a link. Anyhow don't be too worried of this details, I think it should not be a big problem (even because I don't think you use that frame at all!).

Ganimed commented 8 years ago

Hi all!

Before everybody is going in vacation, I want to inform you that the WBM-Class has made a real big leap. I informed @gabrielenava already in June in a mail about this. But I got never an answer. I also wanted to announce it here earlier, but I did not had time for this, since I was in Vienna.

For the experiments I need a real stable and fast system, since the iteration-process of the learning algorithm can take hours in the extreme case. A sporadic crash of the mex-subroutine would cause time loss and the experiment must be completely repeated.

Furthermore, when this package is complete and fully documented, the Technical University of Vienna is thinking about it to use this package for the education.

So what is new now:

As mentioned in some discussions here in the forum, the mex-subroutine has still troubles with the compilations on other operating systems and very often the CTest fails at the end with segmentation fault errors.

Since I needed for my project additionally functions in the mex-subroutine, I've started to add all missing functions of the Whole Body Interface for YARP-based robots. In this time I analyzed also the complete code and found some issues and changed them in my branch.

What I have done there:

During the cleanup-process I have found a hidden bug in the destructor of the ModelState class. This class describes a singleton pattern with some static attributes. The CTest will always fail at some point in the end or Matlab will crash sporadically, since the destructor of this class is not well made.

I was able to "repair" the destructor, but not 100 percently. I have still 2 persistent problems there. Namely, 2 allocated arrays (qjS and qjDotS) which I'm not able to free them at the moment. But I will solve this problem after the vacations. Then the problems with the CTest and the sporadic segmentation faults should be history.

In the last months the WBM-Class was growing really much. Now it has so many functions that we can speak about a real Toolbox for general purposes.

So what does this Toolkit/Package offer:

What is still missing:

If somebody has time, it would be nice if somebody can try to compile the C++ code under Windows or MacOS X and give me some feedback about it. Thank's!

P.S. @traversaro: I have found a hidden function in the mex-subroutine called "visualize-trajectory". I tried to call it with the right parameters, but then it will hang since, I guess, it is calling methods which I didn't have installed on my computer. Do you know what this hidden function is exactly doing and what it needs? Or maybe @naveenoid has time to explain it to me, then I have enough information also for the documentation.

Ganimed commented 8 years ago

Hi guys,

yesterday I had a meeting via Skype with @VModugno in France. I want to tell you that this work is now going to find an end! Finally! The toolbox is now nearly complete, except some small parts as I have mentioned in the previous post.

To finish this package I have only to finish the C++ part and some small functions where I'm not sure if they are calculating the values correctly. Or in other words, I'm not sure if my thoughts about the calculation is correctly. I asked @VModugno about it and he told to me that he is only an "end-user" and these questions are for him to much low-level and so I should ask you.

So in following functions I have still some doubts:

@traversaro and @gabrielenava you are the only ones who have seen the code of me. Maybe somebody of you has one hour time in this week or short after the vacations, to review these methods with me together. Then I can make the needed changes and then the toolbox is finished! I will write an email to you then we can make an appointment at the IIT. Thank you very much in advance!

traversaro commented 8 years ago

@Ganimed I can handle this. This week I am quite busy for a deadline, are you available next week?

Ganimed commented 8 years ago

@traversaro: Great!! The next week I'm still here. Thank you very much! So should I write an e-mail to you for making an appointment?

traversaro commented 8 years ago

Yes, please send me an email, thanks.

Ganimed commented 8 years ago

@traversaro did you received my e-mail? I've sent it with my account from my University. Thank's.

traversaro commented 7 years ago

@Ganimed @gabrielenava I noticed you changed the name and signatures (i.e. the order of returned elements) of several matlab functions.

Differently from variables name changes, these are quite an invasive set of changes that would cause existing codebase using mex-wholebodymodel functions to stop working. Note that as part of the collaboration with TUE, we have at least an instance of code using mex-wholebodymodel.

I am ok with the changes, but you should document all this changes somewhere, for example in a file named Migration.md (example of a possible structure: https://bitbucket.org/osrf/gazebo/src/8d35d7d913a368dc6d39105fa42c4ff05a4e858e/Migration.md?at=default&fileviewer=file-view-default ). Otherwise, migrating the TUE code to the new version of mex-wholebodymodel would be an really tedious and error-prone process...

gabrielenava commented 7 years ago

I think we can document all the major changes in the already existing file CHANGELOG.md

traversaro commented 7 years ago

Definitly, I totally forgot about that.

traversaro commented 7 years ago

Given that this is quite a big change, I would recommend tagging a version of the repo right before this big merge.

Ganimed commented 7 years ago

Hi guys,

this is a new status update. Since the last report there has been changed really a lot! I made a complete refactoring of the C++ mex-subroutinte, i.e. it was a complete "facelifting".

I have removed all redundant variables, unnecessary headers and libraries. There are no dependencies anymore on the boost and the Eigen library. I made also several adaptions and code-style unifications in the subroutine and also in the wrapper-functions. Now the variable and function names are unified from the top until to the bottom in the C++ code of the mex-subroutine!

Changes in the new wbm-functions:

To be backward compatible, I have put all old wbm-functions into the new folder old.

Furthermore, I was able to solve "more or less" the destructor-problem in ModelState. In the end I must say "more or less", because there is still a strange behaviour in Matlab which I don't really understand.

At the moment I'm working with Matlab R2015b and R2016b under Ubuntu 14.04. For the code compilation I use the GCC of version 4.8.5. I have compiled it in the DEBUG mode, to see all outputs and steps. The problem is still in the destructor of the ModelState class.

Mostly the mex-subroutine passes the CTest in Matlab R2015b and R2016b like below,

starting integration
Relative error for energy for model iCubV2.5.urdf: 0.000003
Deleting older version of robotmexWholeBodyModel started with robot loaded from urdf file : twoLinks.urdf, Num of Joints : 1 
Robot name set as robotLoadedDirectlyFromURDFfile
fwdDynKinEnergyTest: Random Initial configuration
    1.2589

fwdDynKinEnergyTest: Random Initial velocity
   -4.0579

starting integration
Relative error for energy for model twoLinks.urdf: 0.000001
kinEnergyConservation tests passed
mex-wholebodymodel tests: all test completed successfully!

but in the end, after passing all the tests, the subroutine will be unloaded from the system and a segmentation fault will be caused.

If you call the mex-subroutine in a normal procedure like "timeTest.m", and afterwards unload it from the Matlab's memory manager with the clear all command, then the deallocation of the arrays sqj and _sqjdot in the ModelState class works flawlessly and the subroutine terminates in a clean way.

But the strange thing is, the same doesn't work under the CTest. After freeing the robotWBIModel, the system is not able to free the following arrays sqj and sqj_dot. It doesn't matter if they are static or not. That's really a strange behaviour!

Furthermore, outside of the mexWholeBodyModel subroutine, the test methods in Matlab causing also some sporadic segmentation faults. Especially during the integration processes. I suspect in this case that some random generated numbers are out of scope and causing these effects.

To minimize the number of errors and their side effects it is recommendable to make a

However the situation is now better than before and the mex-subroutine is more stable. For me at the moment it is stable enough for my experiments.

Benchmarks:

I have checked out older versions of the C++ code and I made a simple benchmark with the timeTest.m and compared it afterwards with the actual version. My computer has a i7 processor and the time difference with the new version compared with the old version is following:

Number of iterations: 10000

The time results may be different on other machines. If all variables are static we gain around 0.5 close to 1 sec. of speed for 10000 iterations. That's not really much, but it will be count if the operations are used in a complex long-term simulation where the calculation time takes 10, 20 hours or days.

Ganimed commented 7 years ago

When I was writing the report, I have recognized that your comments. I totally agree with you that these are big changes, but necessary that everything is uniform with the other functions and also with the WBM classes.

As you have mentioned above, we should make in the change-log an announcement that there is a new version of the mexWholeBody subroutine with several big changes.

We should also inform the TUE via e-Mail that there is coming soon a new version with a lot of changes. But before we are merging, we should make several tests and improving (cleanup) the CTest files.

gabrielenava commented 7 years ago

I pushed a recovery branch that is just a copy of the current master branch. I would also move all controllers in another branch called WBM-controllers. This to separate the toolbox (mex-wholebodymodel) from what use the toolbox (the controllers).

Ganimed commented 7 years ago

Sounds good! A recovery branch is a good idea. Also the idea for the controllers. Maybe it is better for the future to put the controller collection in an own repository called "WBM-Controllers". Then it is also physically split from the WBM-Toolbox/Libary.

gabrielenava commented 7 years ago

I agree, but I think this might create confusion with the other matlab Toolbox (WBToolbox) and Simulink controllers (WBIToolbox-controllers). I actually had the idea of renaming mex-wholebodymodel into WBMToolbox, but as I told you this will create confusion. Even if maybe renaming it matlab-WBMToolbox might avoid confusion. But this is "premature optimization" [cit.]

Ganimed commented 7 years ago

Maybe it is better to call it Library instead of Toolbox, but these are details and can be discussed afterwards.

traversaro commented 7 years ago

To be backward compatible, I have put all old wbm-functions into the new folder old.

Please, don't do it. For history we will always have the git repository, and not maintained untested files floating over the repo are not going to help anyone. Documenting the changes in the Change Log is more then ok for our use.

The time results may be different on other machines. If all variables are static we gain around 0.5 close to 1 sec. of speed for 10000 iterations. That's not really much, but it will be count if the operations are used in a complex long-term simulation where the calculation time takes 10, 20 hours or days.

Speedup is typically given as a percentage of the original time. What is the time for 10000 iterations before and after?

If you call the mex-subroutine in a normal procedure like "timeTest.m", and afterwards unload it from the Matlab's memory manager with the clear all command, then the deallocation of the arrays sqj and sqj_dot in the ModelState class works flawlessly and the subroutine terminates in a clean way. But the strange thing is, the same doesn't work under the CTest. After freeing the robotWBIModel, the system is not able to free the following arrays sqj and sqj_dot. It doesn't matter if they are static or not. That's really a strange behaviour!

The lifecycle of static objects is a dangerous beast (one of the reason static classes should not be used in the first place. : ) ). My guess is that everything works correctly when doing clear all, but there is a problem when exiting matlab with exit(). If that is the case, it should be sufficient to add clear all before the two calls to exit in https://github.com/robotology/mex-wholebodymodel/blob/master/tests/runWBMTests.m#L20 . However, it also possible that the same problem occurs when you close Matlab, but you don't see it because Matlab is not launched from a terminal.

Ganimed commented 7 years ago

The total time for 10000 iterations was approx. 8 seconds for the old version. With the new version the total time was around 5.1 and 5.3 seconds.

Ganimed commented 7 years ago

Maybe the direct exit() command causes the problem. We must check it.

Ganimed commented 7 years ago

Yesterday I have found the bug which is causing the segmentation fault errors! In the meantime I have fixed this error and now the CTest is working flawlessly!

The bug was not in the C++ code of the mex-subroutine, neither a bug of Matlab. The problem was in the test files for Matlab!! If you change the name of the sub-routine, PLEASE change also the name in the Matlab-code!! Especially if you call "clear _name_of_mexsubroutine"!!! If the name is wrong written, then Matlab is trying to remove a subroutine from its memory management system which doesn't exist! Thus, Matlab's memory management is getting corrupted and we get all the evil side effects with the sporadic segmentation faults in any mex-subroutines of Matlab incl. our mex-subroutine!

traversaro commented 7 years ago

@Ganimed what was the problem in the end? In your comment there are ~10 exclamation marks, but no reference to the commit that actually fixed the problem. : )

Ganimed commented 7 years ago

The reason of these problems was, that in the code of kinEnergyConservationTestAllModels.m was called several times clear wholeBodyModel. But since months ago the mex-subroutine was renamed to mexWholeBodyModel. After I have fixed this line (see: https://github.com/robotology/mex-wholebodymodel/pull/62/commits/8c3d0ac2adcd72dd3a908dc5b277d4c93a32e17c#diff-5a15e3527e541b888c8ddfadc2b6d45e), the CTest was working.

Every time when the integration procedure was called in kinEnergyConservationTest.m, at first the "wrong" clear wholeBodyModel was executed, followed by the wbm and matlab-functions. So every time (see in kinEnergyConservationTestAllModels.m), the memory of the real mexWholeBodyModel subroutine was not cleared and was initialized again with a new URDF file for testing. I guess, that the continuous initialization with different robot models and with different DOFs has corrupted the memory of our mex-subroutine and in addition it had affected also the memory management system of Matlab.

Ganimed commented 7 years ago

I have improved the internal memory management of the mexWholeBodyModel subroutine and fixed in the same time a bug which will arise during the initialization of a new robot model. Now the mex-subroutine can be initialized with a sequence of models, without calling the clear command in between.

gabrielenava commented 7 years ago

I will close this pull request since WBM-Class has been already merged in updateRepo branch