Closed ankit7gup closed 1 year ago
What sysrepo version are you using? Is it a debug build or release? In any case, this should not happen and is a bug. Can you post preceding output as well? It would be ideal if you know how to reproduce it and provide the steps.
Hi Michal, We are using the master latest release with patch for Libyan https://github.com/CESNET/libyang/commit/3ff82dd469432b0d55c348b31ca8575e59b4d5a2.
Please find attached the complete log file for server. syslog-netopeer_logs.txt
This behavior is random. Not observed everytime.
The log has shown the problem quite clearly, try the latest sysrepo devel
, it should be fixed.
@michalvasko Can you please share the commit if we can apply that as a patch if possible. Thanks!
It is seen right after my previous post, I have referenced this issue directly.
It is seen right after my previous post; I have referenced this issue directly. I hope this is the one. https://github.com/sysrepo/sysrepo/commit/664cdb412841b55517d8039aae0dcc78693b4de8
Yes, the same link is literally a few posts above.
Hi Michal,
Now the previous invalid value is not coming, but I am getting below errors.
2023-05-30 06:55:12.004581 PUMA info netopeer2-server[2310]: EV LISTEN: "/ietf-netconf:get" "rpc" ID 10 priority 0 success (remaining 0 subscribers). 2023-05-30 06:55:12.004678 PUMA info netopeer2-server[2310]: EV ORIGIN: "/ietf-netconf:get" "rpc" ID 10 priority 0 succeeded. 2023-05-30 06:55:12.006935 PUMA info netopeer2-server[2310]: Session 1: thread 1 event new RPC. 2023-05-30 06:55:12.519897 PUMA info netopeer2-server[2310]: EV LISTEN: "o-ran-sync" "notif" ID 1 processing. 2023-05-30 06:55:12.520903 PUMA info netopeer2-server[2310]: EV LISTEN: "o-ran-sync" "notif" ID 1 priority 0 success (remaining 0 subscribers). 2023-05-30 06:55:12.533122 PUMA info netopeer2-server[2310]: EV LISTEN: "o-ran-sync" "notif" ID 2 processing. 2023-05-30 06:55:12.534368 PUMA info netopeer2-server[2310]: EV LISTEN: "o-ran-sync" "notif" ID 2 priority 0 success (remaining 0 subscribers). 2023-05-30 06:55:12.535335 PUMA info netopeer2-server[2310]: EV LISTEN: "o-ran-fm" "notif" ID 1 processing. 2023-05-30 06:55:12.536381 PUMA info netopeer2-server[2310]: EV LISTEN: "o-ran-fm" "notif" ID 1 priority 0 success (remaining 0 subscribers). 2023-05-30 06:55:13.050644 PUMA info netopeer2-server[2310]: EV LISTEN: "ietf-hardware" "notif" ID 1 processing. 2023-05-30 06:55:13.051967 PUMA info netopeer2-server[2310]: EV LISTEN: "ietf-hardware" "notif" ID 1 priority 0 success (remaining 0 subscribers). 2023-05-30 06:55:15.750991 PUMA info netopeer2-server[2310]: EV ORIGIN: "/ietf-netconf:get" "rpc" ID 11 priority 0 for 1 subscribers published. 2023-05-30 06:55:15.751408 PUMA info netopeer2-server[2310]: EV LISTEN: "/ietf-netconf:get" "rpc" ID 11 priority 0 processing (remaining 1 subscribers). . . . 2023-05-30 06:55:15.958661 PUMA info netopeer2-server[2310]: EV ORIGIN: "o-ran-processing-element" "done" ID 1 priority 0 succeeded. 2023-05-30 06:55:25.959823 PUMA err netopeer2-server[2310]: Waiting for subscription of "ietf-netconf-notifications" failed, previous event "notif" ID 2 was not processed. 2023-05-30 06:55:25.960015 PUMA err netopeer2-server[2310]: Failed to generate netconf-config-change notification, but changes were applied. 2023-05-30 06:55:25.960415 PUMA info netopeer2-server[2310]: EV LISTEN: "/ietf-netconf:edit-config" "rpc" ID 2 priority 0 fail (remaining 1 subscribers). 2023-05-30 06:55:25.960834 PUMA err netopeer2-server[2310]: Waiting for subscription of "ietf-netconf-notifications" failed, previous event "notif" ID 2 was not processed.
Please suggest.
Thanks!
You can again post the whole output but it may simply be that your callback handling the notification is stuck, it then breaks the notifications for the particular module.
You can again post the whole output but it may simply be that your callback handling the notification is stuck, it then breaks the notifications for the particular module.
Please find the complete o/p file attached. netopeer`.txt
It is not the whole output, I can see the message
2023-05-29 21:43:18.800970 err netopeer2-server[2312]: Waiting for subscription of "ietf-netconf-notifications" failed, previous event "notif" ID 5 was not processed.
on line 249 but there is no message informing about this event being generated.
EDIT: Okay, with the other "notif" event ID 2 the messages are there.
1970-01-01 00:18:38.648131 info netopeer2-server[2310]: EV LISTEN: "/notifications:create-subscription" "rpc" ID 4 priority 0 processing (remaining 1 subscribers).
1970-01-01 00:18:38.649335 info netopeer2-server[2310]: EV ORIGIN: "/ietf-netconf:edit-config" "rpc" ID 1 priority 0 succeeded.
1970-01-01 00:18:38.649996 info netopeer2-server[2310]: Session 4: thread 0 event new RPC.
1970-01-01 00:18:38.704075 info netopeer2-server[2310]: EV LISTEN: "ietf-netconf-notifications" "notif" ID 2 processing.
1970-01-01 00:18:38.704393 info netopeer2-server[2310]: EV LISTEN: "ietf-netconf-notifications" "notif" ID 2 priority 0 success (remaining 4294967295 subscribers).
1970-01-01 00:18:38.869142 info netopeer2-server[2310]: EV ORIGIN: "/ietf-netconf:get" "rpc" ID 7 priority 0 for 1 subscribers published.
1970-01-01 00:18:39.060677 info netopeer2-server[2310]: EV LISTEN: "/notifications:create-subscription" "rpc" ID 4 priority 0 success (remaining 0 subscribers).
1970-01-01 00:18:39.060789 info netopeer2-server[2310]: EV ORIGIN: "/notifications:create-subscription" "rpc" ID 4 priority 0 succeeded.
and it seems exactly the same as before, are you certain you have the patch applied? Actually, the whole output is weird, there is a message printed when event ID 1 is published
1970-01-01 00:18:38.643754 info netopeer2-server[2310]: EV ORIGIN: "ietf-netconf-notifications" "notif" ID 1 priority 0 for 2 subscribers published.
but no message for ID 2? Makes no sense.
Here, Also, I see same error as previosuly even though we have the patch applied.
1970-01-01 00:18:36.781313 PUMA info netopeer2-server[2312]: EV LISTEN: "ietf-netconf-notifications" "notif" ID 5 processing. 1970-01-01 00:18:36.781651 PUMA info netopeer2-server[2312]: EV LISTEN: "ietf-netconf-notifications" "notif" ID 5 priority 0 success (remaining 4294967295 subscribers).
I still do not understand how you could get a log missing some messages but I think there were some problems that could have caused the same problem as before. So try to use commits https://github.com/sysrepo/sysrepo/commit/5e08c3511d1d2c704a7bde8cf58293f00e770426 and https://github.com/sysrepo/sysrepo/commit/d784aa94848af0fa1e0ef80df9488062fe9114d6 as well.
Hi @michalvasko, Now invalid values are not seen, but still RPC fails due to similar reason. Sharing complete log file.
Please check. Thanks! netopeer.txt
Hi Michal, Now invalid values are not seen, but still RPC fails due to similar reason. Sharing complete log file.
Please check. Thanks! netopeer.txt
One more thing I observed from the logs is that before the error starts clock time in logs is mixed up.
2023-05-31 13:16:26.134391 info netopeer2-server[2319]: EV ORIGIN: "ietf-hardware" "change" ID 2 priority 0 succeeded. 2023-05-31 13:16:26.134440 info netopeer2-server[2319]: EV ORIGIN: "ietf-interfaces" "change" ID 1 priority 0 succeeded. 2023-05-31 13:16:26.134467 info netopeer2-server[2319]: EV ORIGIN: "o-ran-performance-management" "change" ID 1 priority 0 succeeded. 2023-05-31 13:16:26.157657 info netopeer2-server[2319]: EV ORIGIN: "ietf-hardware" "done" ID 2 priority 0 for 1 subscribers published. 2023-05-31 13:16:26.158310 info netopeer2-server[2319]: EV ORIGIN: "ietf-interfaces" "done" ID 1 priority 0 for 1 subscribers published. 2023-05-31 13:16:26.158741 info netopeer2-server[2319]: EV ORIGIN: "o-ran-performance-management" "done" ID 1 priority 0 for 1 subscribers published. 2023-05-31 13:15:49.294948 info netopeer2-server[2319]: EV ORIGIN: "ietf-hardware" "done" ID 2 priority 0 succeeded. 2023-05-31 13:15:49.294998 info netopeer2-server[2319]: EV ORIGIN: "ietf-interfaces" "done" ID 1 priority 0 succeeded. 2023-05-31 13:15:49.295025 info netopeer2-server[2319]: EV ORIGIN: "o-ran-performance-management" "done" ID 1 priority 0 succeeded. 2023-05-31 13:15:49.309744 info netopeer2-server[2319]: EV ORIGIN: "ietf-netconf-notifications" "notif" ID 1 priority 0 for 1 subscribers published. 2023-05-31 13:15:49.335614 info netopeer2-server[2319]: EV LISTEN: "/ietf-netconf:edit-config" "rpc" ID 1 priority 0 success (remaining 0 subscribers). 2023-05-31 13:15:49.335989 info netopeer2-server[2319]: EV ORIGIN: "/ietf-netconf:edit-config" "rpc" ID 1 priority 0 succeeded. 2023-05-31 13:15:49.336842 info netopeer2-server[2319]: Session 1: thread 2 event new RPC. 2023-05-31 13:15:49.337792 info netopeer2-server[2319]: EV LISTEN: "ietf-netconf-notifications" "notif" ID 1 processing. 2023-05-31 13:15:49.338187 info netopeer2-server[2319]: EV LISTEN: "ietf-netconf-notifications" "notif" ID 1 ignored, created after the notification. 2023-05-31 13:15:49.338311 info netopeer2-server[2319]: EV LISTEN: "ietf-netconf-notifications" "notif" ID 1 priority 0 success (remaining 1 subscribers).
Hi @michalvasko,
Any update on this issue. Thanks!
Regards, Ankit
Please be patient, I cannot always immediately react to every issue, I have been fixing another one. But you seem to be right about the clocks
2023-05-31 13:16:26.158741 info netopeer2-server[2319]: EV ORIGIN: "o-ran-performance-management" "done" ID 1 priority 0 for 1 subscribers published.
2023-05-31 13:15:49.294948 info netopeer2-server[2319]: EV ORIGIN: "ietf-hardware" "done" ID 2 priority 0 succeeded.
...
2023-05-31 13:15:49.978573 info netopeer2-server[2319]: Session 3: thread 1 event new RPC.
2023-05-31 13:16:27.063334 info netopeer2-server[2319]: Session 2: Received an SSH message "request-channel-open" of subtype "session".
This is most weird and seems broken, for almost a second your system time has been moved back 37s. I have used real time timestamps for avoiding a data race before that resulted in a previous error and this behavior breaks it. I will try to somehow avoid completely breaking sysrepo subscriptions like this but notifications are based on real-time so if you do some wild jumps like this, they are not going to work correctly no matter what.
Please be patient, I cannot always immediately react to every issue, I have been fixing another one. But you seem to be right about the clocks
2023-05-31 13:16:26.158741 info netopeer2-server[2319]: EV ORIGIN: "o-ran-performance-management" "done" ID 1 priority 0 for 1 subscribers published. 2023-05-31 13:15:49.294948 info netopeer2-server[2319]: EV ORIGIN: "ietf-hardware" "done" ID 2 priority 0 succeeded. ... 2023-05-31 13:15:49.978573 info netopeer2-server[2319]: Session 3: thread 1 event new RPC. 2023-05-31 13:16:27.063334 info netopeer2-server[2319]: Session 2: Received an SSH message "request-channel-open" of subtype "session".
This is most weird and seems broken, for almost a second your system time has been moved back 37s. I have used real time timestamps for avoiding a data race before that resulted in a previous error and this behavior breaks it. I will try to somehow avoid completely breaking sysrepo subscriptions like this but notifications are based on real-time so if you do some wild jumps like this, they are not going to work correctly no matter what.
Thanks and sorry for rushing. It is quite critical for us, thus asked.
I have not really come up with any simple solution so some non-trivial mechanism would have to be added which I am not too keen on implementing. So are these time jumps something common and intentional in your setup, is there some valid reason for them? As long as they do not occur, it should work fine.
This is CLOCK_REALTIME
, right? One use case which might still break it is NTP synchronization and a host which does not have RTC clock hardware (think embedded with no RTC battery). On such HW, it's common to see a big jump in time once the NTP client has synced the time, and that can only happen after the network has been configured, and that might need an app which uses sysrepo to look up the network settings -- hence the problem.
(Fortunately, it doesn't affect our boxes because they have the RTC clock and battery, except on a first boot, and because we pregenerate network config for systemd-networkd
.)
There are lots of preconditions for this problem to occur so it is much less common than it may seem. I am aware of NTP time jumps but in this particular case the time has soon jumped back so it really seems like an error. But the core of the matter remains the same, notifications are designed to use real time timestamps as a critical piece of information with some natural assumptions as to their behavior. If your device is liable to break them you should probably not be using them.
@jktjkt In the use-case you mentioned, it should not be too difficult to set it all up so that only after NTP is running notification subscriptions are made, for example. Because they do, in fact, have a strict dependency on correct real time on your system.
I have not really come up with any simple solution so some non-trivial mechanism would have to be added which I am not too keen on implementing. So are these time jumps something common and intentional in your setup, is there some valid reason for them? As long as they do not occur, it should work fine.
These jumps were due to some error/bug with PTP clock. But One time jump still would be a common scenario for us i.e. from the system time changes once while netopeer2-server is still running to sync with master clock. But not again.
So I have implemented a fairly simple but messy solution to this, the check is now based on monotonic clock so it should not be affected by these problems. But what I said earlier is definitely true at least when using notification replay, unlucky realtime jumps would cause chaos.
Thanks! Will try this solution.
Hi @michalvasko,
We get below error while edit-config operation and pending notification processing error is there, while going back, I saw a very random pending subscriber no. Please suggest.
2023-05-23 04:36:35.325712 info netopeer2-server[2048]: EV LISTEN: "ietf-netconf-notifications" "notif" ID 4 processing. 2023-05-23 04:36:35.326016 info netopeer2-server[2048]: EV LISTEN: "ietf-netconf-notifications" "notif" ID 4 priority 0 success (remaining 4294967295 subscribers).
2023-05-23 04:36:48.612180 info netopeer2-server[2048]: EV ORIGIN: "o-ran-processing-element" "done" ID 1 priority 0 for 1 subscribers published. 2023-05-23 04:36:48.629946 info netopeer2-ser 2023-05-23 04:48:33.483751 err netopeer2-server[2048]: Waiting for subscription of "ietf-netconf-notifications" failed, previous event "notif" ID 4 was not processed. 2023-05-23 04:48:33.483940 err netopeer2-server[2048]: Failed to generate netconf-config-change notification, but changes were applied. 2023-05-23 04:48:33.484394 info netopeer2-server[2048]: EV LISTEN: "/ietf-netconf:edit-config" "rpc" ID 2 priority 0 fail (remaining 1 subscribers). 2023-05-23 04:48:33.484501 err netopeer2-server[2048]: Waiting for subscription of "ietf-netconf-notifications" failed, previous event "notif" ID 4 was not processed. 2023-05-23 04:48:33.484532 warning netopeer2-server[2048]: EV ORIGIN: "/ietf-netconf:edit-config" "rpc" ID 2 priority 0 failed (Timeout expired). 2023-05-23 04:48:33.485004 err netopeer2-server[2048]: User callback failed. 2023-05-23 04:48:33.485038 err netopeer2-server[2048]: Failed to send an RPC (User callback failed).
Thanks!