RenderKit / ospray

An Open, Scalable, Portable, Ray Tracing Based Rendering Engine for High-Fidelity Visualization
http://ospray.org
Apache License 2.0

OSPRay 2.1.1, Termination-time crash w/ shared ptrs, similar to issue #355 #420

Closed tachyon-john closed 3 years ago

tachyon-john commented 4 years ago

Compiling and testing VMD on CentOS 8 with the stock GCC 8.3.1, I'm seeing termination-time crashes similar to what was reported in issue #355, but in my case the offender is in libopenvkl.so:

Thread 1 "vmd_LINUXAMD64" received signal SIGSEGV, Segmentation fault.
0x00007f4da2ce357c in std::_Sp_counted_ptr<openvkl::api::Driver*, (__gnu_cxx::_Lock_policy)2>::_M_dispose() () from /usr/local/lib/vmdtest2/libopenvkl.so.0
(gdb) where
#0  0x00007f4da2ce357c in std::_Sp_counted_ptr<openvkl::api::Driver*, (__gnu_cxx::_Lock_policy)2>::_M_dispose() () from /usr/local/lib/vmdtest2/libopenvkl.so.0
#1  0x00007f4da2ce86d9 in std::shared_ptr::~shared_ptr() () from /usr/local/lib/vmdtest2/libopenvkl.so.0
#2  0x00007f4dec00313c in __run_exit_handlers () from /lib64/libc.so.6
#3  0x00007f4dec003270 in exit () from /lib64/libc.so.6
#4  0x00007f4debfec87a in __libc_start_main () from /lib64/libc.so.6
#5  0x0000000000505c5e in _start ()

johguenther commented 4 years ago

Thanks for reporting. We will look into it; it should be fixed in the next release.

tachyon-john commented 4 years ago

Let me know if you need any assistance reproducing this. In my testing, I had to do more than just instantiate an OSPRay 2.x context and create an empty OSPRenderer to trigger the crash. I'm not sure exactly how many more API calls it takes, but my guess is that I'd have to make at least a couple of ospCommit()/ospRelease() calls for it to fail. My assumption is that you should be able to reproduce this with the example/tutorial apps very quickly, but if not, I'll work with you to make sure you can reproduce it yourself.

tachyon-john commented 4 years ago

Update: OSPRay 2.2.0 has a similar crash in libopenvkl:

Thread 1 "vmd_LINUXAMD64" received signal SIGSEGV, Segmentation fault.
0x00007f5c41527393 in rkcommon::memory::IntrusivePtr::~IntrusivePtr() () from /usr/local/lib/vmdtest2/libopenvkl.so.0
(gdb) where
#0  0x00007f5c41527393 in rkcommon::memory::IntrusivePtr::~IntrusivePtr() () from /usr/local/lib/vmdtest2/libopenvkl.so.0
#1  0x00007f5c8346013c in __run_exit_handlers () from /lib64/libc.so.6
#2  0x00007f5c83460270 in exit () from /lib64/libc.so.6
#3  0x00007f5c8344987a in __libc_start_main () from /lib64/libc.so.6
#4  0x000000000052bbce in _start ()

johguenther commented 4 years ago

Is VMD shutting down OSPRay before exiting? We see this happening only if the app just quits.

tachyon-john commented 3 years ago

The same behavior occurs with or without a call to ospShutdown() prior to exit. Running with AddressSanitizer, all I see is the following (ospShutdown() is called within the OSPRay_Global_Shutdown() routine):

vmd > quit
Info) VMD for LINUXAMD64, version 1.9.4a46 (August 15, 2020)
Info) Exiting normally.
OSPRay2Renderer) ~OSPRay2Renderer
OSPRay2Renderer) destroy_scene
OSPRay2Renderer) ~OSPRay2Renderer
OSPRay2Renderer) destroy_scene
OSPRay2Renderer) OSPRay_Global_Shutdown
AddressSanitizer:DEADLYSIGNAL
=================================================================
==121035==ERROR: AddressSanitizer: SEGV on unknown address 0x7f9cf694b1f8 (pc 0x7f9cff127393 bp 0x000000000000 sp 0x7ffe48c25418 T0)
==121035==The signal is caused by a READ memory access.
    #0 0x7f9cff127392 in rkcommon::memory::IntrusivePtr<openvkl::api::Driver>::~IntrusivePtr() (/usr/local/lib/vmdtest2/libopenvkl.so.0+0x1d392)
    #1 0x7f9d1c4d9e9b in __run_exit_handlers (/lib64/libc.so.6+0x39e9b)
    #2 0x7f9d1c4d9fcf in exit (/lib64/libc.so.6+0x39fcf)
    #3 0x7f9d1c4c36a9 in __libc_start_main (/lib64/libc.so.6+0x236a9)
    #4 0x5f23cd in _start (/d1/users/johns/vmd/LINUXAMD64/vmd_LINUXAMD64+0x5f23cd)

AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV (/usr/local/lib/vmdtest2/libopenvkl.so.0+0x1d392) in rkcommon::memory::IntrusivePtr<openvkl::api::Driver>::~IntrusivePtr()
==121035==ABORTING
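
For readers unfamiliar with intrusive reference counting, the sketch below may help explain why frame #0 sits in a smart-pointer destructor and why the fault is a READ. It is a simplified model, not rkcommon's actual code: `RefCounted`, `IntrusiveHandle`, and this `Driver` are hypothetical names chosen for illustration. The point it tries to show is that the handle's destructor has to dereference the object it points to in order to decrement the count, so a handle destroyed during exit handlers will fault if the memory behind it is no longer valid at that point. Whether that is what actually happened here is not established in this thread.

```cpp
#include <atomic>
#include <cassert>
#include <utility>

// Hypothetical base class: the reference count lives inside the object itself.
class RefCounted {
 public:
  RefCounted() = default;
  RefCounted(const RefCounted &) = delete;
  RefCounted &operator=(const RefCounted &) = delete;
  virtual ~RefCounted() = default;

  void refInc() { count_.fetch_add(1, std::memory_order_relaxed); }

  // Drops one reference and deletes the object when the last one goes away.
  void refDec() {
    const long previous = count_.fetch_sub(1, std::memory_order_acq_rel);
    assert(previous > 0);
    if (previous == 1) delete this;
  }

  long useCount() const { return count_.load(std::memory_order_relaxed); }

 private:
  std::atomic<long> count_{0};
};

// Hypothetical handle type modelling an intrusive smart pointer.
template <typename T>
class IntrusiveHandle {
 public:
  IntrusiveHandle() = default;
  explicit IntrusiveHandle(T *p) : ptr_(p) { if (ptr_) ptr_->refInc(); }
  IntrusiveHandle(const IntrusiveHandle &other) : ptr_(other.ptr_) {
    if (ptr_) ptr_->refInc();
  }
  // Copy-and-swap: the by-value parameter releases the old pointee on return.
  IntrusiveHandle &operator=(IntrusiveHandle other) {
    std::swap(ptr_, other.ptr_);
    return *this;
  }
  // The destructor reads through ptr_ to reach the counter. If the pointee's
  // memory is already invalid, this read is where the program faults.
  ~IntrusiveHandle() { if (ptr_) ptr_->refDec(); }

  T *get() const { return ptr_; }

 private:
  T *ptr_ = nullptr;
};

// Hypothetical pointee that reports its lifetime through an external counter.
struct Driver : RefCounted {
  explicit Driver(int *live) : live_(live) { ++*live_; }
  ~Driver() override { --*live_; }
  int *live_;
};
```

With this model, a handle stored in a global or static variable is destroyed from `__run_exit_handlers`, the same frame that appears under the destructor in the traces above.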

tachyon-john commented 3 years ago

Since I can run the various pre-built OSPRay test binaries in bin/ without encountering this behavior, there must be something about how VMD uses OSPRay, or some other clash, that is causing it. However, I note that I don't have this problem with OSPRay 1.x.

jeffamstutz commented 3 years ago

@tachyon-john given our online discussion about this with regard to VMD device lifetimes, we can close this officially. :)
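
The thread does not record what the lifetime fix was, so the following is only a general sketch of one way an application can keep teardown order explicit, not a description of what VMD or OSPRay ended up doing. `FakeLibrary` and `FakeRenderer` are hypothetical stand-ins (they play the roles that ospShutdown() and ospRelease() play in the comments above) and no real OSPRay calls are made. The idea is to tie library and object lifetimes to a scope that ends before `main` returns, so that nothing is left for static destructors and exit handlers to clean up.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Event log so the teardown order can be inspected afterwards.
using EventLog = std::vector<std::string>;

// Hypothetical stand-in for a library with explicit init and shutdown.
class FakeLibrary {
 public:
  explicit FakeLibrary(EventLog &log) : log_(log) { log_.push_back("init"); }
  ~FakeLibrary() {
    alive_ = false;
    log_.push_back("shutdown");
  }
  FakeLibrary(const FakeLibrary &) = delete;
  FakeLibrary &operator=(const FakeLibrary &) = delete;

  bool alive() const { return alive_; }
  void note(const std::string &event) { log_.push_back(event); }

 private:
  EventLog &log_;
  bool alive_ = true;
};

// Hypothetical stand-in for an object owned by the library.
class FakeRenderer {
 public:
  explicit FakeRenderer(FakeLibrary &lib) : lib_(lib) {
    lib_.note("renderer created");
  }
  ~FakeRenderer() {
    // In this sketch the library is always still up when a renderer is released.
    assert(lib_.alive());
    lib_.note("renderer released");
  }
  FakeRenderer(const FakeRenderer &) = delete;
  FakeRenderer &operator=(const FakeRenderer &) = delete;

 private:
  FakeLibrary &lib_;
};

// Locals are destroyed in reverse order of declaration, so both renderers
// are released before the library shuts down, all inside this scope.
EventLog run_session() {
  EventLog log;
  {
    FakeLibrary lib(log);
    FakeRenderer first(lib);
    FakeRenderer second(lib);
  }
  return log;
}
```

Calling `run_session()` yields the events in the order init, two creations, two releases, shutdown. The design choice is simply that scoped lifetimes make the order visible in the source, whereas the order of static destructors across shared libraries is much harder to reason about.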