epics-base / pvAccessCPP

pvAccessCPP is an EPICS V4 C++ module
https://epics-base.github.io/pvAccessCPP/

Crash in BeaconResponseHandler #155

Open mdavidsaver opened 5 years ago

mdavidsaver commented 5 years ago

@bhill-slac reports a crash in the Beacon handling code. This was observed in a gateway process, but does not appear to be gateway specific (though I'm not certain of this).

In the captured stack trace, three threads are not idle.


Thread 1 (LWP 1214):
#0  0x00007f7e14d32207 in setlocale () from /lib64/libc.so.6
#1  0x00007f7cfc001930 in ?? ()
#2  0x00007f7db4000d00 in ?? ()
#3  0x00007f7dfc515ab0 in ?? ()
#4  0x00007f7e0ce42f47 in __cxxabiv1::__terminate (handler=<optimized out>)
    at /home/nwani/m3/conda-bld/compilers_linux-64_1560109574129/work/.build/x86_64-conda_cos6-linux-gnu/src/gcc/libstdc++-v3/libsupc++/eh_terminate.cc:47
#5  0x00007f7e0ce42f7d in std::terminate ()
    at /home/nwani/m3/conda-bld/compilers_linux-64_1560109574129/work/.build/x86_64-conda_cos6-linux-gnu/src/gcc/libstdc++-v3/libsupc++/eh_terminate.cc:57
#6  0x00007f7e0ce43988 in __cxxabiv1::__cxa_pure_virtual ()
    at /home/nwani/m3/conda-bld/compilers_linux-64_1560109574129/work/.build/x86_64-conda_cos6-linux-gnu/src/gcc/libstdc++-v3/libsupc++/pure.cc:50
#7  0x00007f7e0d2f4589 in epics::pvData::compare (a=..., b=...) at ../../src/factory/Compare.cpp:86
#8  0x00007f7e0d2b2cfa in epics::pvData::FieldCreate::Helper::cache<epics::pvData::Structure> (
    create=create@entry=0x55992b468310, ent=...) at ../../src/factory/FieldCreateFactory.cpp:63
#9  0x00007f7e0d2abf8a in epics::pvData::FieldCreate::createStructure (this=this@entry=0x55992b468310, fieldNames=..., 
    fields=...) at ../../src/factory/FieldCreateFactory.cpp:1305
#10 0x00007f7e0d2af789 in deserializeStructureField (control=0x55992b987d80, buffer=0x55992b987ea0, 
    fieldCreate=0x55992b468310) at ../../src/factory/FieldCreateFactory.cpp:251
#11 epics::pvData::FieldCreate::deserialize (this=0x55992b468310, buffer=buffer@entry=0x55992b987ea0, 
    control=control@entry=0x55992b987d80) at ../../src/factory/FieldCreateFactory.cpp:1468
#12 0x00007f7e0d664039 in (anonymous namespace)::BeaconResponseHandler::handleResponse (this=<optimized out>, 
    responseFrom=0x7f7dfc515d90, transport=..., version=<optimized out>, command=<optimized out>, 
    payloadSize=<optimized out>, payloadBuffer=0x55992b987ea0) at ../../src/remoteClient/clientContextImpl.cpp:2776
#13 0x00007f7e0d65eb8a in (anonymous namespace)::ClientResponseHandler::handleResponse (this=<optimized out>, 
    responseFrom=<optimized out>, transport=..., version=<optimized out>, command=<optimized out>, 
    payloadSize=<optimized out>, payloadBuffer=0x55992b987ea0) at ../../src/remoteClient/clientContextImpl.cpp:3024
#14 0x00007f7e0d63b51f in epics::pvAccess::BlockingUDPTransport::processBuffer (this=this@entry=0x55992b987d80, 
    transport=..., fromAddress=..., receiveBuffer=receiveBuffer@entry=0x55992b987ea0)
    at ../../src/remote/blockingUDPTransport.cpp:422
#15 0x00007f7e0d63c58b in epics::pvAccess::BlockingUDPTransport::run (this=0x55992b987d80)
    at ../../src/remote/blockingUDPTransport.cpp:271
#16 0x00007f7e0cfd2949 in epicsThreadCallEntryPoint (pPvt=0x55992b9880e0) at ../../src/osi/epicsThread.cpp:83
#17 0x00007f7e0cfd88dc in start_routine (arg=0x55992b988810) at ../../src/osi/os/posix/osdThread.c:403
#18 0x00007f7e150d0dd5 in start_thread () from /lib64/libpthread.so.0
#19 0x00007f7e14df9ead in tdestroy_recurse () from /lib64/libc.so.6
#20 0x0000000000000000 in ?? ()

Thread 70 (LWP 1212):
#0  0x00007f7e150d74ed in __lll_timedwait_tid () from /lib64/libpthread.so.0
#1  0x0000000037efb61d in ?? ()
#2  0x00000000317c54fa in ?? ()
#3  0x00007f7dfc917a30 in ?? ()
#4  0x00007f7dfc9179c0 in ?? ()
#5  0x00007f7dfc917a30 in ?? ()
#6  0x00007f7e0cfda3a5 in epicsThreadOnce (id=0x55992b4683a0, func=
    0x7f7e0cfd5979 <epicsTimeFromTimespec(epicsTimeStamp*, timespec const*)+25>, arg=0x7f7e150d2de6 <_L_lock_870+15>)
    at ../../src/osi/os/posix/osdThread.c:510
#7  0x00007f7e0d2a578b in epics::pvData::FieldCreate::getFieldCreate () at ../../src/factory/FieldCreateFactory.cpp:1592
#8  0x00007f7e0d2a835f in Lock (m=..., this=<synthetic pointer>) at ../../src/misc/pv/lock.h:45
#9  epics::pvData::Field::~Field (this=0x7f7e0cfd3acc <epicsMutex::lock()+12>, __vtt_parm=<optimized out>, 
    __in_chrg=<optimized out>) at ../../src/factory/FieldCreateFactory.cpp:92
#10 0x00007f7e0d2a9a9e in epics::pvData::Structure::~Structure (this=this@entry=0x7f7cfc001930, 
    __in_chrg=<optimized out>, __vtt_parm=<optimized out>) at ../../src/factory/FieldCreateFactory.cpp:527
#11 0x00007f7e0d2a9b09 in epics::pvData::Structure::~Structure (this=0x7f7cfc001930, __in_chrg=<optimized out>, 
    __vtt_parm=<optimized out>) at ../../src/factory/FieldCreateFactory.cpp:527
#12 0x00007f7e0d2b4a5a in _M_release (this=0x7f7cfc00f8a0) at /usr/include/c++/4.8.2/tr1/shared_ptr.h:141
#13 ~__shared_count (this=0x7f7cfc00edf0, __in_chrg=<optimized out>) at /usr/include/c++/4.8.2/tr1/shared_ptr.h:341
#14 ~__shared_ptr (this=0x7f7cfc00ede8, __in_chrg=<optimized out>) at /usr/include/c++/4.8.2/tr1/shared_ptr.h:541
#15 ~shared_ptr (this=0x7f7cfc00ede8, __in_chrg=<optimized out>) at /usr/include/c++/4.8.2/tr1/shared_ptr.h:985
#16 epics::pvData::PVField::~PVField (this=0x7f7cfc00edc0, __vtt_parm=<optimized out>, __in_chrg=<optimized out>)
    at ../../src/factory/PVField.cpp:38
#17 0x00007f7e0d2b6d28 in epics::pvData::PVStructure::~PVStructure (this=this@entry=0x7f7cfc00edc0, 
    __in_chrg=<optimized out>, __vtt_parm=<optimized out>) at ../../src/factory/PVStructure.cpp:63
#18 0x00007f7e0d2b6ea9 in epics::pvData::PVStructure::~PVStructure (this=0x7f7cfc00edc0, __in_chrg=<optimized out>, 
    __vtt_parm=<optimized out>) at ../../src/factory/PVStructure.cpp:63
#19 0x00007f7e0d948f89 in std::tr1::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x7f7cfc000b70)
    at /usr/include/c++/4.8.2/tr1/shared_ptr.h:141
#20 0x00007f7e0d664111 in ~__shared_count (this=0x7f7dfc917bf8, __in_chrg=<optimized out>)
    at /usr/include/c++/4.8.2/tr1/shared_ptr.h:341
#21 ~__shared_ptr (this=0x7f7dfc917bf0, __in_chrg=<optimized out>) at /usr/include/c++/4.8.2/tr1/shared_ptr.h:541
#22 ~shared_ptr (this=0x7f7dfc917bf0, __in_chrg=<optimized out>) at /usr/include/c++/4.8.2/tr1/shared_ptr.h:985
#23 (anonymous namespace)::BeaconResponseHandler::handleResponse (this=<optimized out>, responseFrom=0x7f7dfc917d90, 
    transport=..., version=<optimized out>, command=<optimized out>, payloadSize=<optimized out>, 
    payloadBuffer=0x55992b986c20) at ../../src/remoteClient/clientContextImpl.cpp:2775
#24 0x00007f7e0d65eb8a in (anonymous namespace)::ClientResponseHandler::handleResponse (this=<optimized out>, 
    responseFrom=<optimized out>, transport=..., version=<optimized out>, command=<optimized out>, 
    payloadSize=<optimized out>, payloadBuffer=0x55992b986c20) at ../../src/remoteClient/clientContextImpl.cpp:3024
#25 0x00007f7e0d63b51f in epics::pvAccess::BlockingUDPTransport::processBuffer (this=this@entry=0x55992b986b00, 
    transport=..., fromAddress=..., receiveBuffer=receiveBuffer@entry=0x55992b986c20)
    at ../../src/remote/blockingUDPTransport.cpp:422
#26 0x00007f7e0d63c58b in epics::pvAccess::BlockingUDPTransport::run (this=0x55992b986b00)
    at ../../src/remote/blockingUDPTransport.cpp:271
#27 0x00007f7e0cfd2949 in epicsThreadCallEntryPoint (pPvt=0x55992b986ec0) at ../../src/osi/epicsThread.cpp:83
#28 0x00007f7e0cfd88dc in start_routine (arg=0x55992b987610) at ../../src/osi/os/posix/osdThread.c:403
#29 0x00007f7e150d0dd5 in start_thread () from /lib64/libpthread.so.0
#30 0x00007f7e14df9ead in tdestroy_recurse () from /lib64/libc.so.6
#31 0x0000000000000000 in ?? ()

Thread 7 (LWP 1264):
#0  0x00005599291d4ccf in address_in_range () at /tmp/build/80754af9/python_1565725737370/work/Objects/obmalloc.c:1365
#1  pymalloc_free.isra.0 (p=0x7f7dfedd6170) at /tmp/build/80754af9/python_1565725737370/work/Objects/obmalloc.c:1635
#2  _PyObject_Free () at /tmp/build/80754af9/python_1565725737370/work/Objects/obmalloc.c:1840
#3  0x00005599291eb7b2 in list_dealloc (op=0x7f7de00c1af0)
    at /tmp/build/80754af9/python_1565725737370/work/Objects/listobject.c:324
#4  0x00005599292ae2ec in _PyEval_EvalFrameDefault ()
    at /tmp/build/80754af9/python_1565725737370/work/Python/ceval.c:1314
#5  0x00007f7dfee43020 in __Pyx_PyFunction_FastCallNoKw (co=co@entry=0x7f7e0a490270, args=<optimized out>, 
    args@entry=0x7f7de1231990, na=na@entry=3, globals=globals@entry=0x7f7e1546e370) at _gw.cpp:9157
#6  0x00007f7dfee4399f in __Pyx_PyFunction_FastCallDict (func=0x7f7dfedd0b00, args=0x7f7de1231990, nargs=3, kwargs=0x0)
    at _gw.cpp:9189
#7  0x00007f7dfee4878f in GWProvider_testChannel (__pyx_v_provider=<optimized out>, __pyx_v_name=<optimized out>, 
    __pyx_v_peer=<optimized out>) at _gw.cpp:6368
#8  0x00007f7dfee5b628 in GWProvider::channelFind (this=0x55992beb2310, name=..., requester=...)
    at ../gwchannel.cpp:1109
#9  0x00007f7e0d693443 in epics::pvAccess::ServerSearchHandler::handleResponse (this=0x55992beb8718, 
    responseFrom=<optimized out>, transport=..., version=<optimized out>, command=<optimized out>, 
    payloadSize=<optimized out>, payloadBuffer=0x55992befa1b0) at ../../src/server/responseHandlers.cpp:361
#10 0x00007f7e0d683b92 in epics::pvAccess::ServerResponseHandler::handleResponse (this=<optimized out>, 
    responseFrom=<optimized out>, transport=..., version=<optimized out>, command=<optimized out>, 
    payloadSize=<optimized out>, payloadBuffer=0x55992befa1b0) at ../../src/server/responseHandlers.cpp:171
#11 0x00007f7e0d63b51f in epics::pvAccess::BlockingUDPTransport::processBuffer (this=this@entry=0x55992befa090, 
    transport=..., fromAddress=..., receiveBuffer=receiveBuffer@entry=0x55992befa1b0)
    at ../../src/remote/blockingUDPTransport.cpp:422
#12 0x00007f7e0d63c58b in epics::pvAccess::BlockingUDPTransport::run (this=0x55992befa090)
    at ../../src/remote/blockingUDPTransport.cpp:271
#13 0x00007f7e0cfd2949 in epicsThreadCallEntryPoint (pPvt=0x55992bf1a420) at ../../src/osi/epicsThread.cpp:83
#14 0x00007f7e0cfd88dc in start_routine (arg=0x55992bf1aba0) at ../../src/osi/os/posix/osdThread.c:403
#15 0x00007f7e150d0dd5 in start_thread () from /lib64/libpthread.so.0
#16 0x00007f7e14df9ead in tdestroy_recurse () from /lib64/libc.so.6
#17 0x0000000000000000 in ?? ()

mdavidsaver commented 5 years ago

The crash appears to happen when two different Beacon receiver threads are processing the PVField payload at the end of Beacon messages.

This payload is a full, un-cached serialization, including the Structure. This breaks an assumption I made in epics-base/pvDataCPP#55, namely that Structure creation is relatively rare. In fact, it happens for every Beacon.

Even so, this should only manifest as a performance hit. This crash suggests that there is also a problem with locking.
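
To make the locking concern more concrete: in Thread 1, `__cxa_pure_virtual` is reached from `compare()` during the cache lookup, while Thread 70 is inside `~Structure`/`~Field`. One possible reading, which is an assumption and not a confirmed diagnosis, is that the lookup compared against a cached entry whose destructor had already started. The following is a minimal generic sketch of an interning cache that avoids this class of race by storing `std::weak_ptr` entries and only using an entry after `lock()` succeeds. It is not the pvDataCPP implementation; `Desc`, `DescCache`, and `stressIntern` are hypothetical names invented for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for an immutable type description.
struct Desc {
    explicit Desc(const std::string& k) : key(k) {}
    const std::string key;
};

// Interning cache: equal keys share one live Desc instance.
// Entries are weak, so the cache never keeps a Desc alive, and a
// Desc that is being destroyed is never dereferenced by a lookup.
class DescCache {
public:
    std::shared_ptr<const Desc> intern(const std::string& key) {
        std::lock_guard<std::mutex> guard(mutex_);
        // Lookup compares the stored key string, not the cached object.
        std::weak_ptr<const Desc>& slot = entries_[key];
        // lock() yields an empty pointer once the last owner is gone,
        // even if the destructor is still running on another thread.
        std::shared_ptr<const Desc> found = slot.lock();
        if (found)
            return found;
        std::shared_ptr<const Desc> fresh = std::make_shared<const Desc>(key);
        slot = fresh;  // replace the expired (or empty) entry
        return fresh;
    }

private:
    std::mutex mutex_;
    // Expired entries are not pruned in this sketch.
    std::map<std::string, std::weak_ptr<const Desc> > entries_;
};

// Several threads repeatedly intern and drop the same key, roughly like
// several receiver threads each building a per-message temporary.
// Returns the number of bad results observed.
int stressIntern(DescCache& cache, int nThreads, int nIters) {
    std::atomic<int> mismatches(0);
    std::vector<std::thread> workers;
    for (int t = 0; t < nThreads; ++t) {
        workers.emplace_back([&cache, &mismatches, nIters]() {
            for (int i = 0; i < nIters; ++i) {
                std::shared_ptr<const Desc> d = cache.intern("beacon-status");
                if (!d || d->key != "beacon-status")
                    ++mismatches;
            }  // d is released at the end of each iteration
        });
    }
    for (std::thread& w : workers)
        w.join();
    return mismatches.load();
}
```

The design choice here is that removal from the cache is implicit: an expired `weak_ptr` is simply overwritten on the next lookup, so no destructor needs to take the cache lock. Whether this fits the actual cache in pvDataCPP would need to be checked against that code.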

mdavidsaver commented 5 years ago

Although it is deserialized, the Beacon payload is currently uninteresting and unused.

https://github.com/epics-base/pvAccessCPP/blob/866b75a36de52a7126204c0f8a45e7f67ff51a01/src/server/beaconServerStatusProvider.cpp#L30-L34