bulletphysics / bullet3

Bullet Physics SDK: real-time collision detection and multi-physics simulation for VR, games, visual effects, robotics, machine learning etc.
http://bulletphysics.org

App_Bullet3_OpenCL_Demos Segmentation fault #200

Closed sabotage3d closed 8 years ago

sabotage3d commented 10 years ago

Hello,

I am getting a segmentation fault when I try the OpenCL demos from this git commit: e317b646430e8601ffa0cdddf3dfd77a56a200c7

This is my log:

main startCreating context
Created GL 3.0 context
Direct GLX rendering context obtained
Making context current
GL_VENDOR=NVIDIA Corporation
GL_RENDERER=Quadro K4000/PCIe/SSE2
GL_VERSION=3.2.0 NVIDIA 319.60
GL_SHADING_LANGUAGE_VERSION=1.50 NVIDIA via Cg compiler
pthread_getconcurrency()=0
-----------------------------------------------------
Segmentation fault

odellus commented 10 years ago

I'm having a similar issue testing the build from https://github.com/bulletphysics/bullet3/commit/46bd05f4f769ae0c6229489bb515ea6e471f17b7.

_clew_gmake_x64_release
main startglewXInit dynamically loaded using dlopen/dlsym OK
glewXInit OK
Creating context
Created GL 3.0 context
Direct GLX rendering context obtained
Making context current
GL_VENDOR=ATI Technologies Inc.
GL_RENDERER=AMD Radeon HD 6900 Series 
GL_VERSION=3.2.12874 Core Profile/Debug Context 14.10.1006.1001
GL_SHADING_LANGUAGE_VERSION=4.30
pthread_getconcurrency()=0
-----------------------------------------------------
App_Bullet3_OpenCL_Demos_clew_gmake_x64_release: ../../btgui/OpenGLWindow/LoadShader.cpp:21: 
GLuint gltLoadShaderPair(const char*, const char*): Assertion `glGetError()==0' failed.
started GwenUserInterfaceAborted (core dumped)

The only other test in bullet3/bin that had any problems was bin/Test_OpenCL_Bullet3_gmake_x64_release. The output from running this test is located below:

https://gist.github.com/odellus/e8751fd7aafec6e8d73c

I'm looking more carefully into this issue at the moment, but I must admit I am brand new to Bullet so any help would be most well received.

erwincoumans commented 10 years ago

@odellus this seems to be a GLSL shader loading issue. We should make sure that the glGetError check also shows the GLSL shader compiler error. Separately: there are many possible combinations of driver, operating system and OpenCL compiler version, and each of them can cause issues with an OpenCL kernel. I don't have the bandwidth to provide support for all the buggy OpenCL drivers out there. At this point, the best you can do is learn to fix OpenCL kernels by yourself :-)

odellus commented 10 years ago

Thanks for the feedback. Will do. :-)

erwincoumans commented 10 years ago

@odellus Actually the assert happens right before loading a GLSL shader. What operating system and version are you using? Is it a beta GPU driver or a recent non-beta one? Is this the latest GitHub source code revision you are using (from http://github.com/bulletphysics/bullet3)?

odellus commented 10 years ago

I'm using Ubuntu 14.04 with ATI Catalyst 14.4 (non-beta). The problem might(?) come from the fact that I also have an nVidia card beside my ATI, so I have drivers from both nVidia and ATI installed. I can run OpenCL on both of them for most applications and this hasn't been an issue since getting both cards to work initially (which was non-trivial).

The source I was building came from yesterday's master branch.

I meant to ask you which set of drivers and OS are fully supported by bullet3 at this time. I understand not having the bandwidth to deal with everyone's buggy drivers. When I started using the catalyst drivers in Ubuntu a few years ago they were terrible, and they haven't really improved that much since then.

alda30 commented 10 years ago

@odellus I have the same issue. Could you please keep us updated if you find a solution? I am running Ubuntu 14.04 (64-bit) with a GTX Titan graphics card (the only one I have). Thanks

odellus commented 10 years ago

https://gist.github.com/odellus/c32a258d521aee982fb1

I updated the Catalyst drivers to the latest stable branch and now there is only one error.

From the gist:

compiling kernel mprPenetrationKernel ready.
../../test/OpenCL/AllBullet3Kernels/testExecuteBullet3NarrowphaseKernels.cpp:414: Failure
Value of: numContacts
Actual: 0
Expected: results[i]
Which is: 1

So it's only off by one! :smiley: So close. I'm pretty sure users just need to scrape the money together for a refurbished GTX 680 if they don't have a more modern card that Bullet3 is known to work with. This is what I am doing.

Nican commented 10 years ago

The problem comes from inside LoadShader.cpp:21. I do not quite understand why it checks for errors at the start of the function, but glGetError() returns GL_INVALID_ENUM (1280).

I have been searching, but cannot easily find it. Do you know where OpenGL is called before that?

My configuration:

Ubuntu 14.04 64-bit
GL_VENDOR=NVIDIA Corporation
GL_RENDERER=GeForce GTX 970/PCIe/SSE2
GL_VERSION=3.2.0 NVIDIA 343.22

UPDATE:

727     glewInit();
(gdb) next
730     gui = new GwenUserInterface();
(gdb) call glGetError()
$12 = 1280

The call for glewInit() seems to be doing it. ;o

UPDATE2: I found the source of the problem, this line is setting the error:

extStart = glGetString(GL_EXTENSIONS);

And here is a solution: http://stackoverflow.com/questions/19453439/solved-opengl-error-gl-invalid-enum-0x0500-while-glewinit#comment28878875_19466432

I set glewExperimental = true, and it worked for a minute or so, but then it crashed at GLInstancingRenderer.cpp:1835, again with glGetError() as GL_INVALID_ENUM.
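A commonly used companion to setting glewExperimental is to drain the OpenGL error queue right after glewInit(), so that a stale GL_INVALID_ENUM left by the glGetString(GL_EXTENSIONS) query does not trip a later `glGetError()==0` assertion. The following is only a sketch under assumptions, not Bullet's code: `drainGlErrors` is a hypothetical helper, and it takes the error getter as a parameter so that it can be compiled and exercised without a GL context. It would not explain the later failure at GLInstancingRenderer.cpp:1835, which would need its own investigation.

```cpp
#include <functional>
#include <vector>

// Hypothetical helper (not part of Bullet or GLEW): repeatedly calls getError
// until it reports 0 (GL_NO_ERROR) and returns the codes that were pending.
// maxReads bounds the loop in case the getter never returns 0.
std::vector<unsigned int> drainGlErrors(const std::function<unsigned int()>& getError,
                                        int maxReads = 32)
{
    std::vector<unsigned int> seen;
    for (int i = 0; i < maxReads; ++i)
    {
        const unsigned int code = getError();
        if (code == 0) break;  // queue is empty
        seen.push_back(code);
    }
    return seen;
}

// Intended use with a current GL context (left as a comment because it needs
// GLEW and OpenGL to build):
//   glewExperimental = GL_TRUE;
//   glewInit();
//   drainGlErrors([] { return static_cast<unsigned int>(glGetError()); });
```

Returning the drained codes rather than discarding them lets the caller log what was swallowed, which helps to tell the known glewInit() side effect apart from a real error.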

UPDATE3: I am not sure what to make of this anymore. It works fine most of the time, but I seem to be getting random assertion failures:

App_Bullet3_OpenCL_Demos_clew_gmake_x64_debug: ../../src/Bullet3OpenCL/ParallelPrimitives/b3OpenCLArray.h:283: void b3OpenCLArray::copyToHostPointer(T*, size_t, size_t, bool) const [with T = b3Vector3; size_t = long unsigned int]: Assertion `status==0' failed.

Any help?

UPDATE4: It seems to be working fine most of the time, except for a random crash here and there. I also learned a lot about OpenGL along the way. I recommend reading https://developer.nvidia.com/sites/default/files/akamai/gamedev/docs/Porting%20Source%20to%20Linux.pdf and looking into ARB_debug_output.

Flix01 commented 9 years ago

Just wanted to report that on Ubuntu 15.04 64-bit, I experienced the same (or a similar) issue:

./App_Bullet3*
main startCreating context
Created GL 3.0 context
Direct GLX rendering context obtained
Making context current
GL_VENDOR=NVIDIA Corporation
GL_RENDERER=GeForce GT 630/PCIe/SSE2
GL_VERSION=3.2.0 NVIDIA 346.59
GL_SHADING_LANGUAGE_VERSION=1.50 NVIDIA via Cg compiler
pthread_getconcurrency()=0
App_Bullet3_OpenCL_Demos_clew_gmake_x64_release: ../../btgui/OpenGLWindow/LoadShader.cpp:21: GLuint gltLoadShaderPair(const char*, const char*): Assertion `glGetError()==0' failed.
started GwenUserInterfaceAborted

I just wanted to report this (because I'm just interested in Bullet2 actually...).

However, Test_OpenCL_Bullet3_gmake_x64_release, Test_OpenCL_primitives, etc. all run correctly, and even Test_Gwen_OpenGL_gmake_x64_release works, although I get some strange output when I close the window:

visual 0x24 selected
GL_VENDOR=NVIDIA Corporation
GL_RENDERER=GeForce GT 630/PCIe/SSE2
GL_VERSION=4.5.0 NVIDIA 346.59
GL_SHADING_LANGUAGE_VERSION=4.50 NVIDIA
pthread_getconcurrency()=0
XIO: fatal IO error 11 (Resource temporarily unavailable) on X server ":0" after 5557 requests (5557 known processed) with 0 events remaining.

erwincoumans commented 9 years ago

The demos have all been refactored into 'examples/ExampleBrowser', which sets 'glewExperimental=true;'. Can you please give the latest revision at http://github.com/bulletphysics/bullet3 a try and report back?

erwincoumans commented 9 years ago

By the way, I disabled OpenCL support in the ExampleBrowser by default, but you can enable it using a command-line flag:

App_ExampleBrowser_gmake --enable_experimental_opencl

I also fixed another NVIDIA/OpenCL issue: one kernel was reading an array out of bounds.

Flix01 commented 9 years ago

Yes, the ExampleBrowser works (I've just tested the bullet3-2.83 release package).

But now I can't find App_Bullet3_OpenCL_Demos_clew_gmake_x64_release anymore. The demos inside "ExampleBrowser" all seem to be related to Bullet2, don't they? So now the Bullet3 (= GPU support) demos are only accessible through the --enable_experimental_opencl command-line flag (i.e. Box-Box and Pair Bench). Is that correct?

[Edit:] Yes. I've just found what you wrote in another post:

I'm refactoring the demos/examples at the moment, and considering what to do with the experimental Bullet3/OpenCL work. My focus is currently higher-quality rigid body dynamics for robotics. There isn't anyone working on the OpenCL/GPU acceleration at the moment, so I will likely mark it as 'experimental'. One of the ongoing issues with OpenCL is the lack of stable drivers and proper debugging tools. I plan to merge some of the OpenCL demos into the new 'ExampleBrowser', and disable their compilation by default (you will be able to enable the 'OpenCL' examples in the build system, premake and cmake).

And since I'm happy with Bullet2, I'm glad it keeps improving...

erwincoumans commented 9 years ago

I will likely re-enable other OpenCL demos too. I just wanted to get 'some' 2.83 release tagged, and cleaned up some of the mess into a single Example Browser.

If you want, you can try to enable a few other OpenCL demos. In fact, it should be almost trivial to enable a few more demos in the examples/OpenCL/rigidbody/GpuConvexScene.cpp file (next to box-box). The other files in examples/OpenCL/rigidbody/ need a bit more conversion work, not too complex.

Using 'clew' we dynamically load the OpenCL runtime, which is why I enabled compiling the OpenCL demos by default: users don't need to install an OpenCL SDK to compile the Bullet demos. I just disabled running them by default, otherwise too many people start complaining that those OpenCL demos don't work, or don't run fast, etc. Reports about buggy OpenCL compilers/drivers should be sent to AMD, NVIDIA etc., not to me :-)

Thanks!

erwincoumans commented 8 years ago

If you use the premake build system (bullet/build2) then the example browser has a command-line flag --enable_experimental_opencl that lets you experiment with a few OpenCL examples. This is not enabled in CMake, only in premake. Not all OpenCL examples have been ported (yet). Let's close this for now. If you have time and enable some more OpenCL examples in the example browser, please create a new pull request and use premake.