sofa-framework / sofa

Real-time multi-physics simulation with an emphasis on medical simulation.
https://www.sofa-framework.org
GNU Lesser General Public License v2.1
871 stars 297 forks source link

[all] Fix the cause of failling test scenes (for a greener dashboard) #548

Closed damienmarchal closed 6 years ago

damienmarchal commented 6 years ago

This PR:

Reviewers will merge only if all these checks are true.

hugtalbot commented 6 years ago

I did not have a lot of time to progess here but still a reminder for later:

ErwanDouaille commented 6 years ago

@damienmarchal I was looking for errors with this scene :

./runSofa -g batch -s dag -n 100 /home/douaille/SOFA/fixTestScene/applications/plugins/Flexible/examples/demos/steak.scn

The segfault comes from : https://github.com/sofa-framework/sofa/commit/e65ef095fd96b69b3c9036b2e839f9429b4b022b#diff-179a1b721f69963e4d1fd5ee846ffe22R57

What does this line ? I tried remove it but exception_mode is used everywhere in Cimg.h and for its next call it crash.

damienmarchal commented 6 years ago

@ErwanDouaille I look at the line/commit you pointed...but un don't understand why you are pointing to a specific commit that didn't changed that specific line.

ErwanDouaille commented 6 years ago

@damienmarchal Because it crash at this line, no specific reason for commit sry, I was investigating ;) I don´t understand what does this line do. And why does it crash there

damienmarchal commented 6 years ago

Ok thank for the extra details. After a quick glance this line is changing the way CImg is handling error. The problem is that it tries to lock mutex which somehow cause the crash. My current guess is that if it could be related to the fact that there is either image & CImage plugin which are both using CImg...which have static initialized structure...just a guess...

EDIT: mmm I'm deeply thinking...in CImg

#if defined(cimg_module)
    Mutex_info& Mutex_attr();
#elif defined(cimg_main)
    Mutex_info& Mutex_attr() { static Mutex_info val; return val; }
#else
    inline Mutex_info& Mutex_attr() { static Mutex_info val; return val; }
#endif
hugtalbot commented 6 years ago

Just revert the commit on CG since PR #556 will do the job.

hugtalbot commented 6 years ago

Hi, as scheduled I just reverted the last commit already fix or redundant with #556

guparan commented 6 years ago

@damienmarchal, @fredroy: Why isn't CImg directly inside CImgPlugin? We would then just have to depend on CImgPlugin to use CImg.

damienmarchal commented 6 years ago

In my fix I initially didn't wanted to change too much existing things in Sofa but as It is in fact a complex issue I think that guillaume suggestion to have CImg in CImgPlugin is a very good one and I will give it a try.

hugtalbot commented 6 years ago

yes seems a good point, would you like to make a new PR for this ?

damienmarchal commented 6 years ago

[ci-build][with-scene-tests]

damienmarchal commented 6 years ago

Dashboard is showing good progress, only one scene crash on ubuntu & centos in an easy to fix issue. Still one unknown test-failure on centos (but it may be a configuration problem related to LANG). One test failure on windows is still because this machine does not have libjpeg/libtiff installed (or we need a full build ?). Three are test problem

The big dark spot is now the 26 remaining crashes ? Can someone with MacOS investigate the issues please ?

untereiner commented 6 years ago

A quick test of this branch on macos:

There is call to glGetIntegerv(GL_READ_FRAMEBUFFER_BINDING, &windowId); who ends to a segfault.

damienmarchal commented 6 years ago

Thanks for the feedback @untereiner maybe it is because in batch mode there is no opengl context.
If I remember correctly this crash macos's tests that way.

Can you providing line/file that cause problem ?

damienmarchal commented 6 years ago

[ci-build][with-scene-tests]

damienmarchal commented 6 years ago

[ci-build][with-scene-tests]

guparan commented 6 years ago

caduceus.scn crashes on both macos_default and macos_options on master when running in batch mode.
See https://ci.inria.fr/sofa-ci/job/mac_clang-3.4_default/1904/warnings2Result/category.94921639/package.-1174366838/

damienmarchal commented 6 years ago

@guparan Thank for the report. As pointed by @untereiner ...the problem is related to the batch mode & opengl function on macos.
My guess is that there is not opengl context and that there is code in sofa that use opengl functions.

untereiner commented 6 years ago

Below is the stack. I do not understand why the visual objects are created in the batch mode. PS: do a search of "batch" in sofa...

frame #0: 0x00007fff35526c44 libGL.dylibglGetIntegerv + 18 frame #1: 0x000000011ef3eac6 libSofaHelper_d.17.12.dev.dylibsofa::helper::gl::FrameBufferObject::getCurrentFramebufferID() at FrameBufferObject.cpp:79 frame #2: 0x00000001176c0882 libSofaOpenglVisual_d.17.12.dev.dylibsofa::component::visualmodel::Light::Light(this=0x000000012080c800, vtt=0x0000000117a0a620) at Light.cpp:86 frame #3: 0x00000001176c6b7b libSofaOpenglVisual_d.17.12.dev.dylibsofa::component::visualmodel::PositionalLight::PositionalLight(this=0x000000012080c800, vtt=0x0000000117a0a618) at Light.cpp:633 frame #4: 0x00000001176c84cf libSofaOpenglVisual_d.17.12.dev.dylibsofa::component::visualmodel::SpotLight::SpotLight(this=0x000000012080c800) at Light.cpp:698 frame #5: 0x00000001176d7ae9 libSofaOpenglVisual_d.17.12.dev.dylibsofa::core::objectmodel::New::New<>(this=0x00007ffeefbf59c8) at SPtr.h:57 frame #6: 0x00000001176d79e5 libSofaOpenglVisual_d.17.12.dev.dylibsofa::core::objectmodel::New<sofa::component::visualmodel::SpotLight>::New<>(this=0x00007ffeefbf59c8) at SPtr.h:57 frame #7: 0x00000001176d789f libSofaOpenglVisual_d.17.12.dev.dylibsofa::component::visualmodel::SpotLight::SPtr sofa::core::objectmodel::BaseObject::create((null)=0x0000000000000000, context=0x0000000122011a00, arg=0x0000000121261df0) at BaseObject.h:100 frame #8: 0x00000001176d7763 libSofaOpenglVisual_d.17.12.dev.dylibsofa::core::ObjectCreator<sofa::component::visualmodel::SpotLight>::createInstance(this=0x000000012071ba90, context=0x0000000122011a00, arg=0x0000000121261df0) at ObjectFactory.h:209 frame #9: 0x000000011bb62cb8 libSofaCore_d.17.12.dev.dylibsofa::core::ObjectFactory::createObject(this=0x000000011c7be9f0, context=0x0000000122011a00, arg=0x0000000121261df0) at ObjectFactory.cpp:186 frame #10: 0x000000011aca2c54 libSofaSimulationCommon_d.17.12.dev.dylibsofa::core::ObjectFactory::CreateObject(context=0x0000000122011a00, arg=0x0000000121261df0) at ObjectFactory.h:157 frame #11: 0x000000011aca0a65 libSofaSimulationCommon_d.17.12.dev.dylibsofa::simulation::xml::ObjectElement::initNode(this=0x0000000121261df0) at ObjectElement.cpp:77 frame #12: 0x000000011aca0177 libSofaSimulationCommon_d.17.12.dev.dylibsofa::simulation::xml::ObjectElement::init(this=0x0000000121261df0) at ObjectElement.cpp:60 frame #13: 0x000000011ac8b407 libSofaSimulationCommon_d.17.12.dev.dylibsofa::simulation::xml::BaseElement::init(this=0x000000012125f6e0) at BaseElement.cpp:149 frame #14: 0x000000011ac9de25 libSofaSimulationCommon_d.17.12.dev.dylibsofa::simulation::xml::NodeElement::init(this=0x000000012125f6e0) at NodeElement.cpp:78 frame #15: 0x000000011ac75d15 libSofaSimulationCommon_d.17.12.dev.dylibsofa::simulation::SceneLoaderXML::processXML(xml=0x000000012125f6e0, filename="../../sofa/examples/Demos/caduceus.scn") at SceneLoaderXML.cpp:117 frame #16: 0x000000011ac75479 libSofaSimulationCommon_d.17.12.dev.dylibsofa::simulation::SceneLoaderXML::load(this=0x00000001212063e0, filename="../../sofa/examples/Demos/caduceus.scn") at SceneLoaderXML.cpp:79 frame #17: 0x000000011aeafa79 libSofaSimulationCore_d.dylibsofa::simulation::Simulation::load(this=0x000000012200d800, filename="../../sofa/examples/Demos/caduceus.scn") at Simulation.cpp:470 frame #18: 0x00000001000213fc runSofa_dmain(argc=8, argv=0x00007ffeefbff928) at Main.cpp:381 frame #19: 0x00007fff536e1115 libdyld.dylibstart + 1 frame #20: 0x00007fff536e1115 libdyld.dylib`start + 1

damienmarchal commented 6 years ago

@untereiner I think that this is because batch mode does not "remove" element from the scene (it would be a nighmare). To me the underlying problem is that the batch mode should have a valid opengl context created (off-screen rendering) as we use it to test file with opengl based components.

Thanks for the stack trace... I will try to make something about it (even on macos).

untereiner commented 6 years ago

Launching an opengl context "by hand" is complicated if I remember correctly. I suggest that the batch mode uses GLEW to handle the opengl context. In the other options it is Qt that handles the context creation. But I do not understand why it works on linux.

damienmarchal commented 6 years ago

It is definitively hard to have an opengl context by hand...and having one for offscreen (without a windows) seems even more tricky (anyone's help is welcome). I know @ErwanDouaille search for that in his HeadLessRecorder

I assume it work on linux/windows...because they just don't crash/segfault when gl function are called without a buffer :) As said in: https://www.opengl.org/discussion_boards/showthread.php/158904-OpenGL-function-calls-without-available-contexts

Too bad... I tried a lot of approach based on QOffscreenSurface and friends, they work on my system but as soon as I move that to a CI machine (which does not have X or GLX or whatever)...the application crash. I reseted the branch but you can still see the garbage in the Dashboard. I will try a different approach tomorrow.

damienmarchal commented 6 years ago

[ci-build][with-scene-tests]

damienmarchal commented 6 years ago

Ho damn...two full build + scene that are green builds... and not one to cheers (deep sadness) https://www.sofa-framework.org/dash/?branch=pr/fixTestScene

damienmarchal commented 6 years ago

[ci-build][with-scene-tests]

guparan commented 6 years ago

Wow, such green, very stable, many thanks! ;-)

damienmarchal commented 6 years ago

More seriously...I don't know how to fix the MacOS version (and I think the batch mode is fundamentally broken :)).

Tomorrow I will clean the PR, possibly cutting it in pieces because I doubt guys here will accept to merge it "as it is" :)

guparan commented 6 years ago

@epernod Could you quick review the commit 47ef2e63a80c please? How could CubeTopology.scn have been passing scene-tests without this fix? (request from @damienmarchal)

guparan commented 6 years ago

This PR seems quite empty now. Shall we close it @damienmarchal ?

damienmarchal commented 6 years ago

Yes, Thank for the help in managing all that.

guparan commented 6 years ago

Thank you for your hard work! :-)