psi46 / pxar

Life is too short for perfection

Cleanly Disconnect from DTB Upon Keyboard Interrupt #387

Open cfangmeier opened 9 years ago

cfangmeier commented 9 years ago

The behavior that I have observed with pXar is that if one issues a keyboard interrupt (Ctrl-C), pXar quits without properly closing the connection to the DTB. This is a problem because it leaves the DTB unresponsive until it is power cycled. This is especially problematic if the DTB is in a remote location.

It would be preferable to catch the exception and close the connection before exiting. The Python "cmdline" seems to already be doing this.

simonspa commented 9 years ago

Adding a signal handler that deletes the API object should be fine: https://stackoverflow.com/questions/1641182/how-can-i-catch-a-ctrl-c-event-c

simonspa commented 9 years ago

Some testing might be required to get this compiling and working on all three platforms; WIN32 is not POSIX-conformant.
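As a rough illustration of the portability concern, here is a minimal sketch of installing a SIGINT handler with `sigaction` on POSIX systems and falling back to `std::signal` on WIN32. The names `g_interrupted`, `on_interrupt`, and `install_interrupt_handler` are hypothetical and not part of pxar. The handler only records the request in a flag; it does not touch the API object.

```cpp
#include <csignal>
#include <cstring>
#if !defined(_WIN32)
#include <signal.h>  // sigaction, sigemptyset (POSIX only)
#endif

// Hypothetical flag, not part of pxar. sig_atomic_t is the object type the
// standard guarantees can be written safely from a signal handler.
static volatile std::sig_atomic_t g_interrupted = 0;

// The handler only records that Ctrl-C was pressed.
extern "C" void on_interrupt(int) { g_interrupted = 1; }

// Hypothetical helper: installs on_interrupt for SIGINT, returns true on success.
bool install_interrupt_handler() {
#if defined(_WIN32)
    // No sigaction on Windows; std::signal comes from the C standard library.
    return std::signal(SIGINT, on_interrupt) != SIG_ERR;
#else
    struct sigaction sa;
    std::memset(&sa, 0, sizeof(sa));
    sigemptyset(&sa.sa_mask);
    sa.sa_handler = on_interrupt;
    return sigaction(SIGINT, &sa, 0) == 0;
#endif
}
```

Calling `install_interrupt_handler()` once at startup would be enough; the rest of the program can then inspect `g_interrupted` at a point where it is safe to shut down.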

ursl commented 9 years ago

Hi Caleb,

in my experience this varies from setup to setup. On the Ubuntu box I can hit C-c and restart without problems in >90% of all cases. On my Mac I have to unplug the USB cable...

I'll add the sigaction for *nix.

Cheers, --U.

cfangmeier commented 9 years ago

Hi Urs,

Thanks! FYI, I'm currently using pXar on OSX where I am experiencing this problem. We also have an Ubuntu setup. I don't recall if we observed this issue there or not.

ursl commented 9 years ago

Hi Caleb,

It's not clear to me that this is easy, especially if the DTB is doing something when you hit C-c. At least on OS X I get

libc++abi.dylib: terminate called throwing an exception

when trying to delete pxarCore. The stack trace looks as follows:

9   libpxar.dylib                 0x000000010491bc91 CTestboard::HVoff() + 545 (rpc_calls.cpp:981)
10  libpxar.dylib                 0x00000001048e913d pxar::hal::~hal() + 29 (hal.cc:94)
11  libpxar.dylib                 0x0000000104897713 pxar::pxarCore::~pxarCore() + 51 (api.cc:40)
12  pXar                           0x00000001048850e3 sig_handler(int) + 259 (pXar.cc:415)
13  libsystem_c.dylib             0x00007fff8bb9e90a _sigtramp + 26
14  libsystem_c.dylib             0x00007fff8bbd8365 __error + 1
15  libsystem_c.dylib             0x00007fff8bbb809f cthread_set_errno_self + 20
16  libsystem_kernel.dylib         0x00007fff8b9c04fc cerror_nocancel + 40
17  libsystem_c.dylib             0x00007fff8bc3a6df usleep + 54
18  libftd2xx.1.2.2.dylib         0x0000000107d92da4 FT_Read + 250
19  libpxar.dylib                 0x000000010493191c CUSB::FillBuffer(unsigned int) + 140 (USBInterface.libftd2xx.cc:264)
20  libpxar.dylib                 0x0000000104931f51 CUSB::Read(unsigned int, void*, unsigned int&) + 417 (USBInterface.libftd2xx.cc:301)
21  libpxar.dylib                 0x00000001049324a9 CUSB::Read(void*, unsigned int) + 25 (USBInterface.h:84)
22  libpxar.dylib                 0x000000010492fc75 rpcMessage::Receive(CRpcIo&) + 37 (rpc.cpp:29)
23  libpxar.dylib                 0x000000010492a12b CTestboard::LoopSingleRocAllPixelsCalibrate(unsigned char, unsigned short, unsigned short) + 603 (rpc.h:118)
24  libpxar.dylib                 0x00000001048f79b7 pxar::hal::SingleRocAllPixelsCalibrate(unsigned char, std::vector<int, std::allocator<int> >) + 1287 (hal.cc:892)
25  libpxar.dylib                 0x00000001048a8f6b pxar::pxarCore::expandLoop(std::vector<pxar::Event*, std::allocator<pxar::Event*> > (pxar::hal::*)(unsigned char, unsigned char, unsigned char, std::vector<int, std::allocator<int> >), std::vector<pxar::Event*, std::allocator<pxar::Event*> > (pxar::hal::*)(std::vector<unsigned char, std::allocator<unsigned char> >, unsigned char, unsigned char, std::vector<int, std::allocator<int> >), std::vector<pxar::Event*, std::allocator<pxar::Event*> > (pxar::hal::*)(unsigned char, std::vector<int, std::allocator<int> >), std::vector<pxar::Event*, std::allocator<pxar::Event*> > (pxar::hal::*)(std::vector<unsigned char, std::allocator<unsigned char> >, std::vector<int, std::allocator<int> >), std::vector<int, std::allocator<int> >, unsigned short) + 5003 (stl_vector.h:123)
26  libpxar.dylib                 0x00000001048b39f2 pxar::pxarCore::getEfficiencyMap(unsigned short, unsigned short) + 482 (stl_vector.h:123)
27  libpxartests.dylib             0x0000000107dd00b6 PixTest::efficiencyMaps(std::string, unsigned short, unsigned short) + 614 (PixTest.cc:373)
28  libpxartests.dylib             0x0000000107e12486 PixTestAlive::aliveTest() + 486 (PixTestAlive.cc:148)
29  libpxartests.dylib             0x0000000107e108cc PixTestAlive::runCommand(std::string) + 668 (PixTestAlive.cc:63)
30  libpxargui.dylib               0x00000001081adfba PixTab::buttonClicked() + 506 (basic_string.h:279)

Cheers, --U.

simonspa commented 9 years ago

The problem here is that pxarCore is not thread safe: if something is running, you must not call another function or delete the object.

In some other code interfacing pxarCore I added a mutex to prevent this from happening:

{
    // Acquire lock for pxarCore object access:
    std::lock_guard<std::mutex> lck(m_mutex);

    try {
        pxar::rawEvent daqEvent = m_api->daqGetRawEvent();
    } catch (pxar::DataNoEvent &) { }
}

from here: https://github.com/eudaq/eudaq/blob/v1.5-dev/producers/cmspixel/src/CMSPixelProducer.cxx

However, std::lock_guard and std::mutex require C++11.
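For a pre-C++11 build, one possible stand-in is a small RAII wrapper around a POSIX mutex. This is only a sketch: `ScopedLock` is a hypothetical name, it relies on pthreads and so would not cover WIN32, and it is not taken from pxar or EUDAQ. It guards against concurrent access from other threads only. It is not something to call from a signal handler, since `pthread_mutex_lock` is not async-signal-safe.

```cpp
#include <pthread.h>

// Hypothetical pre-C++11 replacement for std::lock_guard on POSIX systems.
// Locks the mutex on construction and unlocks it when the scope ends.
class ScopedLock {
public:
    explicit ScopedLock(pthread_mutex_t& mutex) : mutex_(mutex) {
        pthread_mutex_lock(&mutex_);
    }
    ~ScopedLock() { pthread_mutex_unlock(&mutex_); }

private:
    // Declared but not defined: the C++03 idiom for a non-copyable class.
    ScopedLock(const ScopedLock&);
    ScopedLock& operator=(const ScopedLock&);

    pthread_mutex_t& mutex_;
};
```

Usage would mirror the snippet above: declare `ScopedLock lck(m_mutex);` at the top of the block that touches the pxarCore object, with `m_mutex` being a `pthread_mutex_t`.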

ursl commented 9 years ago

To be honest, I have no intention of making pxar thread safe at this point in time.

Cheers, --U.

cfangmeier commented 9 years ago

Perhaps it would be possible to simply set some global flag from within the interrupt routine. The tests would then periodically check this flag and exit if it has been set. Is something similar already being done with the "stop" button in the GUI?

Additionally, it may be worth leaving Ctrl-C as a way to force-kill pXar and setting up some other interrupt as a "soft" kill. Escape, maybe?
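The flag idea could look roughly like the sketch below. Everything in it is a stand-in: `FakeBoard` plays the role of the DTB connection, and `g_stop_requested`, `request_stop`, `install_stop_handler`, and `run_test` are hypothetical names, not pxar functions. The handler only sets a flag. The test loop checks the flag between steps, so the board object is destroyed through normal scope exit and never from inside the handler while a read is in flight.

```cpp
#include <csignal>

// Hypothetical global flag, written only by the signal handler.
static volatile std::sig_atomic_t g_stop_requested = 0;

extern "C" void request_stop(int) { g_stop_requested = 1; }

void install_stop_handler() { std::signal(SIGINT, request_stop); }

// Stand-in for the DTB connection. The destructor represents the clean
// disconnect; *closed_ lets a caller observe that it ran.
class FakeBoard {
public:
    FakeBoard(bool* closed, int interrupt_at_step)
        : closed_(closed), steps_(0), interrupt_at_(interrupt_at_step) {
        *closed_ = false;
    }
    ~FakeBoard() { *closed_ = true; }

    // One unit of work. To simulate Ctrl-C arriving mid-test, the chosen
    // step raises SIGINT itself.
    void doStep() {
        ++steps_;
        if (steps_ == interrupt_at_) std::raise(SIGINT);
    }

private:
    bool* closed_;
    int steps_;
    int interrupt_at_;
};

// Runs up to max_steps, polling the flag between steps.
// Returns the number of steps actually completed.
int run_test(FakeBoard& board, int max_steps) {
    int done = 0;
    while (done < max_steps && !g_stop_requested) {
        board.doStep();
        ++done;
    }
    return done;
}
```

One limitation of this approach is that the flag is only seen between steps, so a step that blocks for a long time delays the shutdown by the same amount. That is one argument for keeping a separate hard-kill path.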