kbr / fritzconnection

Python-Tool to communicate with the AVM Fritz!Box by the TR-064 protocol and the AHA-HTTP-Interface
MIT License

test_terminate_thread_on_failed_reconnection and test_restart_failed_monitor flaky #154

Closed: pgajdos closed this issue 2 years ago

pgajdos commented 2 years ago

These two tests fail on x86_64, but only sometimes.

[   69s] ________ test_terminate_thread_on_failed_reconnection[data7-5-5-False] _________
[   69s] 
[   69s] data = ['first\n', '', 'second\n'], timeouts = 5, tries = 5, success = False
[   69s] 
[   69s]     @pytest.mark.parametrize(
[   69s]         "data, timeouts, tries, success",
[   69s]         [
[   69s]             (["first\n", "second\n"], 0, 0, True),
[   69s]             (["first\n", "", "second\n"], 1, 0, False),
[   69s]             (["first\n", "", "second\n"], 0, 1, True),
[   69s]             (["first\n", "", "second\n"], 1, 1, False),
[   69s]             (["first\n", "", "second\n"], 1, 2, True),
[   69s]             # default for tries: 5
[   69s]             (["first\n", "", "second\n"], 3, 5, True),
[   69s]             (["first\n", "", "second\n"], 4, 5, True),
[   69s]             (["first\n", "", "second\n"], 5, 5, False),
[   69s]         ],
[   69s]     )
[   69s]     def test_terminate_thread_on_failed_reconnection(data, timeouts, tries, success):
[   69s]         """
[   69s]         Check for thread-termination in case reconnection fails.
[   69s]         """
[   69s]         mock_socket = MockReconnectFailSocket(data, timeouts=timeouts)
[   69s]         fm = FritzMonitor()
[   69s]         fm.start(sock=mock_socket, reconnect_delay=0.001, reconnect_tries=tries)
[   69s]         # give thread some time:
[   69s]         time.sleep(0.01)
[   69s]         if success:
[   69s]             assert fm.is_alive is True
[   69s]         else:
[   69s] >           assert fm.is_alive is False
[   69s] E           assert True is False
[   69s] E            +  where True = <fritzconnection.core.fritzmonitor.FritzMonitor object at 0x7facb1d8f790>.is_alive
[   69s] 
[   69s] data       = ['first\n', '', 'second\n']
[   69s] fm         = <fritzconnection.core.fritzmonitor.FritzMonitor object at 0x7facb1d8f790>
[   69s] mock_socket = <fritzconnection.tests.test_fritzmonitor.MockReconnectFailSocket object at 0x7facb1d8f1f0>
[   69s] success    = False
[   69s] timeouts   = 5
[   69s] tries      = 5
[   69s] 
[   69s] /home/abuild/rpmbuild/BUILD/fritzconnection-1.9.1/fritzconnection/tests/test_fritzmonitor.py:340: AssertionError
[   69s] _________________________ test_restart_failed_monitor __________________________
[   69s] 
[   69s]     def test_restart_failed_monitor():
[   69s]         """
[   69s]         Check whether a fritzmonitor instance with a lost connection can get started again.
[   69s]         Starting the same instance twice does (and should) not work.
[   69s]         See test_start_twice().
[   69s]         But after a failed reconnect (a lost connection) the same instance without calling stop()
[   69s]         """
[   69s]         socket = MockReconnectFailSocket(
[   69s]             mock_data=["first\n", "", "second\n"], timeouts=16
[   69s]         )  # just some timeouts
[   69s]         fm = FritzMonitor()
[   69s]         fm.start(
[   69s]             sock=socket, reconnect_delay=0.001, reconnect_tries=5
[   69s]         )  # set default explicit for clarity
[   69s]         # give socket some time to lose connection:
[   69s]         time.sleep(0.01)
[   69s] >       assert fm.is_alive is False
[   69s] E       assert True is False
[   69s] E        +  where True = <fritzconnection.core.fritzmonitor.FritzMonitor object at 0x7facb1dc46a0>.is_alive
[   69s] 
[   69s] fm         = <fritzconnection.core.fritzmonitor.FritzMonitor object at 0x7facb1dc46a0>
[   69s] socket     = <fritzconnection.tests.test_fritzmonitor.MockReconnectFailSocket object at 0x7facb1dc45e0>
[   69s] 
[   69s] /home/abuild/rpmbuild/BUILD/fritzconnection-1.9.1/fritzconnection/tests/test_fritzmonitor.py:361: AssertionError
[   69s] =========================== short test summary info ============================
[   69s] FAILED fritzconnection/tests/test_fritzmonitor.py::test_terminate_thread_on_failed_reconnection[data7-5-5-False]
[   69s] FAILED fritzconnection/tests/test_fritzmonitor.py::test_restart_failed_monitor
[   69s] ======================== 2 failed, 215 passed in 3.16s =========================
kbr commented 2 years ago

If the tests fail only from time to time, it's unclear what is really happening. Testing threads is hard, by the way. The background system load may be too high for the given sleep time, but that's a guess, so more information is needed. Can you share some details about the system, and can you reproduce the error when nothing but the OS and the tests are running?

pgajdos commented 2 years ago

Thanks for the reply.

Look at the following build log: https://build.opensuse.org/package/live_build_log/home:pgajdos:python/python-fritzconnection/openSUSE_Tumbleweed/x86_64

You can search for free and/or cpuinfo. If you need more info, tell me.

I understand this is a marginal issue; even on the build service it is hard to reproduce. I can imagine it happening, for example, when the whole project is being rebuilt together with Python. So if you do not want to dig into it, I will just leave these two tests disabled to keep the build reliable.

kbr commented 2 years ago

It makes no sense to put a lot of effort into digging into this. The best guess at the moment is the sleep time in combination with heavy system load. The 0.01 seconds is an arbitrary value that works on my system (sic). I can change it to, say, 0.05 in the next release. Maybe the error will not pop up again.
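As an alternative to picking a larger fixed sleep, a test can poll for the expected state up to a generous deadline. The following is only a minimal sketch, not code from fritzconnection: `wait_until` is a hypothetical helper, and a plain `threading.Thread` stands in for the monitor thread.

```python
import threading
import time


def wait_until(predicate, timeout=1.0, interval=0.005):
    """Hypothetical helper: poll predicate() until it is true or timeout elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if predicate():
            return True
        time.sleep(interval)
    # Final check, in case the state changed during the last sleep.
    return predicate()


# Stand-in for the monitor thread (assumption): a worker that
# terminates after a short delay that may vary with system load.
worker = threading.Thread(target=time.sleep, args=(0.05,))
worker.start()

# Instead of time.sleep(0.01) followed directly by an assert:
terminated = wait_until(lambda: not worker.is_alive(), timeout=2.0)
assert terminated
```

In the failing tests this would presumably take the form `assert wait_until(lambda: fm.is_alive is False)`. The test returns as soon as the thread has terminated and only waits for the full timeout when it really stays alive. The cases that expect the thread to remain alive would still need a fixed waiting period.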

pgajdos commented 2 years ago

I think we can let it be for now. Thanks!