ChannelFinder / recsync

EPICS Record Synchronizer

RecCeiver Service Unhandled Error #20

Open dxmaxwell opened 7 years ago

dxmaxwell commented 7 years ago

The RecCeiver service crashes after a few days of continuous operation with the following exception:

2016-12-30 03:51:07-0500 [-] Unhandled Error
Traceback (most recent call last):
 File "/usr/lib/python2.7/dist-packages/twisted/application/app.py", line 392, in startReactor
   self.config, oldstdout, oldstderr, self.profiler, reactor)
 File "/usr/lib/python2.7/dist-packages/twisted/application/app.py", line 313, in runReactorWithLogging
   reactor.run()
 File "/usr/lib/python2.7/dist-packages/twisted/internet/base.py", line 1192, in run
   self.mainLoop()
 File "/usr/lib/python2.7/dist-packages/twisted/internet/base.py", line 1201, in mainLoop
   self.runUntilCurrent()
--- <exception caught here> ---
 File "/usr/lib/python2.7/dist-packages/twisted/internet/base.py", line 797, in runUntilCurrent
   f(*a, **kw)
 File "/usr/lib/python2.7/dist-packages/twisted/internet/defer.py", line 382, in callback
   self._startRunCallbacks(result)
 File "/usr/lib/python2.7/dist-packages/twisted/internet/defer.py", line 483, in _startRunCallbacks
   raise AlreadyCalledError
twisted.internet.defer.AlreadyCalledError: 

@dchabot @berryma4

dchabot commented 7 years ago

@shroffk: can you run this with some debugging activated to get a more useful trace?

There are only a couple of places in the code base using Deferreds - shutdown (maybe a Processor threw an error?) and DB commits...

shroffk commented 7 years ago

I have not been able to reproduce the crash. It can take days to occur. Enabling the default debugging produces a huge volume of logs, so I will look into adding some smart filters and retry.
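One possible shape for such a filter, as a rough sketch only: it assumes the debug messages pass through Python's standard `logging` module, which may not match how the service is actually configured. The `KeywordFilter` class, the logger name, the keyword list, and the sample messages are all made up for illustration.

```python
import io
import logging


class KeywordFilter(logging.Filter):
    """Keep WARNING and above; keep lower levels only if a keyword matches."""

    def __init__(self, keywords, always_level=logging.WARNING):
        super().__init__()
        self.keywords = tuple(keywords)
        self.always_level = always_level

    def filter(self, record):
        if record.levelno >= self.always_level:
            return True
        message = record.getMessage()
        return any(keyword in message for keyword in self.keywords)


# Capture into a string buffer so the effect of the filter is easy to inspect.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.addFilter(KeywordFilter(["Deferred", "commit", "shutdown"]))

log = logging.getLogger("recceiver_debug_sketch")  # hypothetical logger name
log.setLevel(logging.DEBUG)
log.propagate = False
log.addHandler(handler)

log.debug("polling records")   # dropped: DEBUG with no keyword
log.debug("commit started")    # kept: mentions "commit"
log.error("Unhandled Error")   # kept: ERROR is above the threshold

captured = stream.getvalue().splitlines()
```

Filtering on the handler rather than lowering the logger level keeps the debug calls executing, so the timing change is similar to full debugging but the log volume is much smaller.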

dchabot commented 7 years ago

If this error is due to a race condition, activating debugging may alter the timing, perhaps inadvertently hiding the problem(s).

Not sure what the remedy is for that...