taurus-org / taurus

Moved to https://gitlab.com/taurus-org/taurus
http://taurus-scada.org
43 stars 46 forks source link

TaurusForm not recovering after tango device restart #1060

Open cpascual opened 4 years ago

cpascual commented 4 years ago

I create this issue to report a bug found by @cmft Not 100% reproducible but most of the times we can reproduce it as follows:

  1. Use taurus develop and tango 9 (tested with 9.2 and 9.3)
  2. enable tango polling for attr sys/tg_test/1/float_scalar
  3. taurus --log-level=Debug form sys/tg_test/1/float_scalar
  4. from astor, restart the Test DS

Expected behaviour: the form may momentarily be greyed and then it should recover. In the console, the traces should show that a connection error occurs, then the polling is activated and then the polling is stopped when the server is up again and events flow again.

Observed behaviour: the form is greyed and it never recovers. polling is never stopped

Notes:

cpascual commented 4 years ago

Here is a way of reproducing from scratch:

TERMINAL1: Prepare a Tango system It can be done with:

docker-compose -f ~/src/taurus/ci/tango_docker-compose.yml up -d

TERMINAL2 launch the [FooBar DS]

wget https://gist.github.com/cpascual/c80dfd07a6220471bd2d8a3ed2ed559d/raw/f2b2b97abb5e0366c662dc5d77fec9a15206831b/FooBar.py
python3 FooBar.py

TERMiNAL3:

Run a taurusform

taurus --log-level=Debug form test/foobar/1/foo

Now, kill the DS (CTRL+C in TERM1) and relaunch it immediately

The form will be frozen and in TERM3 you will get something like :

(py3qt5) carlos@shelob:~$ taurus --log-level=Debug form test/foobar/1/foo
MainThread     DEBUG    2020-02-06 18:15:55,619 TaurusApplication: Failed to load sardana extensions
Traceback (most recent call last):
  File "/home/carlos/src/taurus/lib/taurus/qt/qtgui/application/taurusapplication.py", line 242, in __registerExtensions
    import sardana.taurus.qt.qtcore.tango.sardana
ModuleNotFoundError: No module named 'sardana'
MainThread     DEBUG    2020-02-06 18:15:55,620 TaurusManager.TaurusTP: Starting TaurusTP.W001
MainThread     DEBUG    2020-02-06 18:15:55,620 TaurusManager.TaurusTP: Starting TaurusTP.W002
MainThread     DEBUG    2020-02-06 18:15:55,621 TaurusManager.TaurusTP: Starting TaurusTP.W003
MainThread     DEBUG    2020-02-06 18:15:55,621 TaurusManager.TaurusTP: Starting TaurusTP.W004
MainThread     DEBUG    2020-02-06 18:15:55,621 TaurusManager.TaurusTP: Starting TaurusTP.W005
MainThread     DEBUG    2020-02-06 18:15:55,621 TaurusManager.TaurusTSP: Starting TaurusTSP.W001
MainThread     DEBUG    2020-02-06 18:15:55,622 TaurusManager: Found plugin ResourcesFactory
MainThread     DEBUG    2020-02-06 18:15:55,624 TaurusManager: Found plugin EvaluationFactory
MainThread     DEBUG    2020-02-06 18:15:55,624 TaurusManager: Found plugin TangoFactory
MainThread     DEBUG    2020-02-06 18:15:55,630 TaurusManager: Found plugin EpicsFactory
MainThread     DEBUG    2020-02-06 18:15:55,779 DefaultReadWidgetLabel: Invalid format '{:~.{bc.modelObj.precision}f}' for <Quantity(3, 'dimensionless')>. Using '{0}'
MainThread     DEBUG    2020-02-06 18:15:55,780 DefaultReadWidgetLabel: Invalid format '{:~.{bc.modelObj.precision}f}' for <Quantity(3, 'dimensionless')>. Using '{0}'
Dummy-1        INFO     2020-02-06 18:16:07,724 shelob.mordor:10000.test/foobar/1.foo: Activating polling. Reason: API_EventTimeout
Dummy-1        DEBUG    2020-02-06 18:16:07,725 TaurusPollingTimer[3000].Timer on _pollAttributes: Timer::start()
TimerLoop 1    DEBUG    2020-02-06 18:16:07,726 TaurusPollingTimer[3000].Timer on _pollAttributes: Timer thread starting
Dummy-1        DEBUG    2020-02-06 18:16:07,728 TaurusPollingTimer[3000].Timer on _pollAttributes: Timer::stop()
Thu Feb  6 18:16:08 2020: Tango::ZmqEventConsumer::push_zmq_event() timeout on callback monitor of tango://shelob.mordor:10000/test/foobar/1/foo.idl5_change
Thu Feb  6 18:16:09 2020: Tango::ZmqEventConsumer::push_zmq_event() timeout on callback monitor of tango://shelob.mordor:10000/test/foobar/1/foo.idl5_change
Thu Feb  6 18:16:10 2020: Tango::ZmqEventConsumer::push_zmq_event() timeout on callback monitor of tango://shelob.mordor:10000/test/foobar/1/foo.idl5_change
reszelaz commented 4 years ago

We observed that when the server is restarted, some change events are emitted with name "idl5_change" instead of "change". If we modify TangoAttribute._pushAttrEvent to ignore such events, the problem disapears. This may be related with tango-controls/cppTango#492

Is there something to be done on this? In Sardana we just changed to Device_5Impl and I wonder if this could couse troubles?