demarle closed this issue 4 years ago
I think the easiest strategy is for us to use ospcommon ref-counted pointers (the same ones we use for other OSPRay objects) instead of std::shared_ptr<>. We don't need any of the atomic guarantees that std::shared_ptr<> implementations provide here (and which may be causing the issue?).
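For readers unfamiliar with the idea, here is a minimal sketch of an intrusive, non-atomic ref-counted pointer. It is only an illustration of the general technique: the names `RefCounted`, `IntrusiveRef`, and `DemoObject` are hypothetical and are not the ospcommon API, and the count is deliberately a plain integer, so this sketch is only safe when all references are touched from one thread.

```cpp
#include <cassert>
#include <utility>

// Hypothetical base class: each object carries its own reference count.
// The count is a plain long (no atomics), so this is NOT thread-safe.
class RefCounted {
 public:
  RefCounted() = default;
  RefCounted(const RefCounted&) = delete;
  RefCounted& operator=(const RefCounted&) = delete;
  virtual ~RefCounted() = default;

  void refInc() { ++count_; }
  void refDec() {
    assert(count_ > 0);
    if (--count_ == 0) delete this;  // last reference frees the object
  }
  long useCount() const { return count_; }

 private:
  long count_ = 0;
};

// Hypothetical smart pointer that bumps the count stored in the object,
// instead of allocating a separate control block like std::shared_ptr.
template <typename T>
class IntrusiveRef {
 public:
  IntrusiveRef() = default;
  explicit IntrusiveRef(T* p) : ptr_(p) { if (ptr_) ptr_->refInc(); }
  IntrusiveRef(const IntrusiveRef& o) : ptr_(o.ptr_) { if (ptr_) ptr_->refInc(); }
  IntrusiveRef(IntrusiveRef&& o) noexcept : ptr_(o.ptr_) { o.ptr_ = nullptr; }
  // Copy-and-swap: the by-value parameter releases the old pointee on exit.
  IntrusiveRef& operator=(IntrusiveRef o) noexcept {
    std::swap(ptr_, o.ptr_);
    return *this;
  }
  ~IntrusiveRef() { if (ptr_) ptr_->refDec(); }

  T* get() const { return ptr_; }
  T* operator->() const { return ptr_; }
  explicit operator bool() const { return ptr_ != nullptr; }

 private:
  T* ptr_ = nullptr;
};

// Hypothetical payload type used only to observe construction/destruction.
struct DemoObject : RefCounted {
  static int alive;
  DemoObject() { ++alive; }
  ~DemoObject() override { --alive; }
};
int DemoObject::alive = 0;
```

Because the count lives inside the object, there is no separately allocated control block whose disposal runs through the standard library's locking policy, which is the part of std::shared_ptr<> suspected above.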
I can get this on release-2.0.x today.
I switched us over to the ospcommon pointer types in 01a4e8490, which should solve this (I wasn't able to reproduce this locally). Please let us know if you continue to experience any issues.
BTW, that's on release-2.0.x.
I'm seeing a similar termination-time crash in VMD with OSPRay 2.1.1, using the Intel precompiled libs, compiled on CentOS 8:
Thread 1 "vmd_LINUXAMD64" received signal SIGSEGV, Segmentation fault.
0x00007f4da2ce357c in std::_Sp_counted_ptr<openvkl::api::Driver*, (__gnu_cxx::_Lock_policy)2>::_M_dispose() () from /usr/local/lib/vmdtest2/libopenvkl.so.0
(gdb) where
() from /usr/local/lib/vmdtest2/libopenvkl.so.0
We are seeing intermittent crashes on application exit, within OSPRay cleanup, that look like this:
pvpython: ../nptl/pthread_mutex_lock.c:433: __pthread_mutex_lock_full: Assertion `INTERNAL_SYSCALL_ERRNO (e, __err) != ESRCH || !robust' failed.
Loguru caught a signal: SIGABRT
Stack trace:
12 0x7f32a076d437 /home/kitware/misc/root/ospray-1.8.4/lib64/libospray.so.0(+0x13437) [0x7f32a076d437]
11 0x7f32a7582c07 __cxa_finalize + 247
10 0x7f32a077cd6f std::shared_ptr::~shared_ptr() + 79
9 0x7f328a944009 ospray::api::ISPCDevice::~ISPCDevice() + 9
8 0x7f328a943e6e ospray::api::ISPCDevice::~ISPCDevice() + 46
7 0x7f3288f4435e rtcReleaseDevice + 30
6 0x7f328a0d95be /home/kitware/misc/root/ospray-1.8.4/lib64/libembree3.so.3(+0x12295be) [0x7f328a0d95be]
5 0x7f32a29afd12 /lib64/libpthread.so.0(+0xad12) [0x7f32a29afd12]
4 0x7f32a7578566 /lib64/libc.so.6(+0x30566) [0x7f32a7578566]
3 0x7f32a756a769 /lib64/libc.so.6(+0x22769) [0x7f32a756a769]
2 0x7f32a756a895 abort + 295
1 0x7f32a757fe75 gsignal + 325
0 0x7f32a757ff00 /lib64/libc.so.6(+0x37f00) [0x7f32a757ff00]
( 1.284s) [main thread ] :0 FATL| Signal: SIGABRT
@utkarshayachit is pretty sure that the issue is the use of a static shared pointer to a derived class instance here: https://github.com/ospray/ospray/blob/master/ospray/api/Device.h#L36, since he's seen the same problem in other code.
The compiler in question is gcc (GCC) 9.1.1 20190503 (Red Hat 9.1.1-1), which is on ParaView's vall regression test machine.
I am looking into other ways to implement the current device ptr and am open to any tips or suggestions.
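One direction that might be worth considering, sketched below under stated assumptions: keep the "current device" slot in a function-local static that is heap-allocated and never destroyed, and release the device through an explicit shutdown call instead of a static destructor run from __cxa_finalize. All names here (`Device`, `DemoDevice`, `currentDeviceSlot`, `setCurrentDevice`, `shutdownCurrentDevice`) are hypothetical stand-ins, not OSPRay's actual classes or API, and this has not been tested against the crash above.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical stand-in for an abstract device interface.
struct Device {
  virtual ~Device() = default;
};

// Hypothetical derived device; the counter only tracks live instances.
struct DemoDevice : Device {
  static int alive;
  DemoDevice() { ++alive; }
  ~DemoDevice() override { --alive; }
};
int DemoDevice::alive = 0;

// The slot is created on first use and intentionally never deleted, so no
// destructor for it runs during static teardown at process exit.
std::shared_ptr<Device>& currentDeviceSlot() {
  static auto* slot = new std::shared_ptr<Device>();
  return *slot;
}

void setCurrentDevice(std::shared_ptr<Device> d) {
  currentDeviceSlot() = std::move(d);
}

Device* currentDevice() { return currentDeviceSlot().get(); }

// The application calls this explicitly, while dependent libraries are
// still fully alive, rather than relying on exit-time destruction order.
void shutdownCurrentDevice() { currentDeviceSlot().reset(); }
```

The trade-off is that a device which is never explicitly shut down is simply left for the OS to reclaim at exit; whether that is acceptable depends on what the device destructor has to do.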