CESNET / netopeer2

NETCONF toolset
BSD 3-Clause "New" or "Revised" License

`test_sub_ntf` is unstable #1567

Open jktjkt opened 2 months ago

jktjkt commented 2 months ago

We just got this failure in the CI using this morning's latest devel of everything:

11/17 Test #13: test_sub_ntf .....................Subprocess aborted***Exception:  17.30 sec
[==========] tests: Running 9 test(s).
[ RUN      ] test_invalid_start_time
[       OK ] test_invalid_start_time
[ RUN      ] test_invalid_stop_time
[       OK ] test_invalid_stop_time
[ RUN      ] test_invalid_start_stop_time
[       OK ] test_invalid_start_stop_time
[ RUN      ] test_basic_sub
[       OK ] test_basic_sub
[ RUN      ] test_replay_sub
[       OK ] test_replay_sub
[ RUN      ] test_replay_real_time
"<n1 xmlns="n1">
  <first>First</first>
</n1>
" != "<replay-completed xmlns="urn:ietf:params:xml:ns:yang:ietf-subscribed-notifications">
  <id>3</id>
</replay-completed>
"
[   LINE   ] --- /home/ci/src/cesnet-gerrit-public/CzechLight/dependencies/Netopeer2/tests/test_sub_ntf.c:312: error: Failure!

Two runs succeeded and one failed. The failing run was built with TSAN (ThreadSanitizer), which imposes a "slightly slower effective CPU speed". Perhaps there is a sleep somewhere in the test that is too optimistic?
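To illustrate the concern about optimistic sleeps: a fixed sleep assumes the event has happened by the time it returns, which is easy to violate under a sanitizer. A common alternative is to poll a condition until a deadline. The following is only a minimal sketch of that general pattern, not code from `test_sub_ntf.c`; `wait_until`, `cond_fn`, and `countdown_done` are hypothetical names introduced here, and nothing in this thread confirms that the test sleeps at all.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical predicate type: returns true once the awaited event has happened. */
typedef bool (*cond_fn)(void *arg);

/* Whole milliseconds elapsed between two CLOCK_MONOTONIC readings. */
static long
elapsed_ms(const struct timespec *from, const struct timespec *to)
{
    long long ns = (long long)(to->tv_sec - from->tv_sec) * 1000000000LL +
            (to->tv_nsec - from->tv_nsec);

    return (long)(ns / 1000000LL);
}

/* Poll cond every 10 ms; return true as soon as it holds, false on timeout. */
static bool
wait_until(cond_fn cond, void *arg, long timeout_ms)
{
    struct timespec start, now;
    const struct timespec step = {0, 10 * 1000000L};

    assert(cond && (timeout_ms >= 0));
    clock_gettime(CLOCK_MONOTONIC, &start);
    for (;;) {
        if (cond(arg)) {
            return true;
        }
        clock_gettime(CLOCK_MONOTONIC, &now);
        if (elapsed_ms(&start, &now) >= timeout_ms) {
            return false;
        }
        nanosleep(&step, NULL);
    }
}

/* Example predicate (illustration only): true once the counter reaches zero. */
static bool
countdown_done(void *arg)
{
    int *n = arg;

    if (*n > 0) {
        --(*n);
        return false;
    }
    return true;
}
```

With this shape, a slow run just takes more polling iterations, and only a genuinely missing event hits the timeout.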

michalvasko commented 2 months ago

I think this is the test that I have already seen fail a few times, but the failure is so rare that I have not even attempted to fix it and am hoping for a better use-case. There is nothing fishy in the test: a notification should be replayed, and in this failing case it is not. I do not even see any data race, so perhaps there is a corner case involving the timestamp of the stored notification file and the start_time used in the subscription? One such corner case was fixed a long time ago. I will try to look for something when I have some spare time.
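A sketch of what such a timestamp corner case could look like, purely as an illustration. It assumes that a notification is replayed only when its timestamp is not earlier than the subscription's start_time, and that the stored timestamp might have coarser granularity than the start_time. Both are assumptions made for this example; `is_replayed` and `truncate_to_sec` are hypothetical helpers, not functions from netopeer2 or sysrepo.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Three-way comparison of two timestamps: <0, 0, >0 like strcmp(). */
static int
ts_cmp(const struct timespec *a, const struct timespec *b)
{
    assert((a->tv_nsec >= 0) && (a->tv_nsec < 1000000000L));
    assert((b->tv_nsec >= 0) && (b->tv_nsec < 1000000000L));

    if (a->tv_sec != b->tv_sec) {
        return (a->tv_sec < b->tv_sec) ? -1 : 1;
    }
    if (a->tv_nsec != b->tv_nsec) {
        return (a->tv_nsec < b->tv_nsec) ? -1 : 1;
    }
    return 0;
}

/* Assumed replay rule: replay the notification if start_time <= its timestamp. */
static bool
is_replayed(const struct timespec *notif_ts, const struct timespec *start_time)
{
    return ts_cmp(notif_ts, start_time) >= 0;
}

/* Hypothetical storage with whole-second granularity: the fraction is dropped. */
static struct timespec
truncate_to_sec(struct timespec ts)
{
    ts.tv_nsec = 0;
    return ts;
}
```

Under these assumptions, a notification sent at 100.7 s is replayed for a start_time of 100.5 s, but if its stored timestamp were truncated to 100.0 s it would fall before start_time and be skipped, so only `replay-completed` would arrive.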

michalvasko commented 2 weeks ago

Well, the subscriptions were seriously buggy, but the bugs did not manifest until I tried (successfully) to make the tests more efficient. I have tried to fix them all, so let me know if you still encounter any issues.