odin-detector / odin-data

DAQ software libraries for capturing and storing data from parallel detector systems
https://odin-detector.github.io/odin-data/
Apache License 2.0

FrameReceiver hangs on startup if its shared memory buffer exists, but it does not have permissions to use it #332

Open GDYendell opened 8 months ago

GDYendell commented 8 months ago

To reproduce:

This is the backtrace when the FrameReceiver is run in gdb and killed while it is hanging:

(gdb) bt
#0  0x00007ffbca533efd in open64 () from /lib64/libpthread.so.0
#1  0x00007ffbc95b85a2 in shm_open () from /lib64/librt.so.1
#2  0x00007ffbcb5a0135 in boost::interprocess::shared_memory_object::priv_open_or_create (this=0x186c4a8,
    type=boost::interprocess::ipcdetail::DoOpenOrCreate, filename=0x1812c58 "odin_buf_1", mode=boost::interprocess::read_write, perm=...)
    at /usr/include/boost/interprocess/shared_memory_object.hpp:325
#3  0x00007ffbcb59fec1 in boost::interprocess::shared_memory_object::shared_memory_object (this=0x186c4a8, name=0x1812c58 "odin_buf_1",
    mode=boost::interprocess::read_write, perm=...) at /usr/include/boost/interprocess/shared_memory_object.hpp:72
#4  0x00007ffbcb59f164 in OdinData::SharedBufferManager::SharedBufferManager (this=0x186c490, shared_mem_name="odin_buf_1", shared_mem_size=32000000000,
    buffer_size=17943136, remove_when_deleted=true) at /dls_sw/work/odin/eiger-watchdog-warning/odin-data/cpp/common/src/SharedBufferManager.cpp:21
#5  0x00000000004950a1 in FrameReceiver::FrameReceiverController::configure_buffer_manager (this=0x1812500, config_msg=...)
    at /dls_sw/work/odin/eiger-watchdog-warning/odin-data/cpp/frameReceiver/src/FrameReceiverController.cpp:626
#6  0x0000000000491fbb in FrameReceiver::FrameReceiverController::configure (this=0x1812500, config_msg=..., config_reply=...)
    at /dls_sw/work/odin/eiger-watchdog-warning/odin-data/cpp/frameReceiver/src/FrameReceiverController.cpp:104
#7  0x0000000000469c4f in FrameReceiver::FrameReceiverApp::run (this=0x7ffd6b03d3e0)
    at /dls_sw/work/odin/eiger-watchdog-warning/odin-data/cpp/frameReceiver/src/FrameReceiverApp.cpp:382
#8  0x000000000046a45d in main (argc=12, argv=0x7ffd6b03d588)
    at /dls_sw/work/odin/eiger-watchdog-warning/odin-data/cpp/frameReceiver/src/FrameReceiverApp.cpp:456

A quick search doesn't give many results on this problem.

wnichols1 commented 8 months ago

I would like to add that this actually happened when a different user had not closed the FR properly, so the shared memory was still owned by that user and I could not release it. I think the FR needs to terminate with an error if it cannot allocate and own the shared memory with the specified name.
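
A minimal sketch of the fail-fast check suggested above, assuming a plain POSIX probe is acceptable before the Boost `shared_memory_object` is constructed. This is not taken from odin-data: `ShmAccess`, `classify_shm_errno` and `probe_shared_memory` are hypothetical names used only for illustration. The probe opens the segment read-write without `O_CREAT`, so it never creates anything, and it reports `EACCES` as a distinct status that the caller could turn into a fatal error.

```cpp
#include <cerrno>
#include <fcntl.h>     // O_RDWR
#include <string>
#include <sys/mman.h>  // shm_open
#include <unistd.h>    // close

// Hypothetical status type, not part of odin-data.
enum class ShmAccess { Accessible, NotFound, Denied, OtherError };

// Map the errno left by a failed shm_open() to a coarse status.
ShmAccess classify_shm_errno(int err)
{
    switch (err) {
        case ENOENT: return ShmAccess::NotFound;   // no segment with this name
        case EACCES: return ShmAccess::Denied;     // exists, but not usable by us
        default:     return ShmAccess::OtherError;
    }
}

// Probe a named segment read-write without creating it.
ShmAccess probe_shared_memory(const std::string& name)
{
    // POSIX shared memory names conventionally start with a slash.
    const std::string posix_name =
        (!name.empty() && name[0] == '/') ? name : "/" + name;

    const int fd = shm_open(posix_name.c_str(), O_RDWR, 0);
    if (fd >= 0) {
        close(fd);  // only probing, so release the descriptor immediately
        return ShmAccess::Accessible;
    }
    return classify_shm_errno(errno);
}
```

With something like this, the FR could call `probe_shared_memory("odin_buf_1")` before configuring the buffer manager and exit with a clear message on `ShmAccess::Denied`, while treating `NotFound` as the normal case where the buffer still has to be created. On older glibc versions the program may need to be linked with `-lrt`.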