akarneliuk / pygnmi

The pure Python implementation of the gNMI client.
https://training.karneliuk.com
BSD 3-Clause "New" or "Revised" License

gNMI inactivity timeout? #45

Closed andsouth44 closed 2 years ago

andsouth44 commented 2 years ago

Hi Anton, Are you aware of any inactivity timeouts in the code? I have a gNMI subscription that can go for hours or days without producing a message, and it appears to disconnect or time out after 2 hours. See below.

This is a log from the Juniper router. It seems to send its last message at 20:26:22 and then receives a "call cancellation" at 22:26:42 (almost exactly 2 hours) from the gNMI client.

This means that when the next message has to be sent from the Juniper, there is no gNMI subscription in place and the message is lost.

Sep 16 20:26:22 ProcessInPendingCall: Pushing tag for RPC: /gnmi.gNMI/Subscribe with tag id: request_1239761477into input pending call queue
Sep 16 22:26:42 ProcessInPendingCall: Processing pending call for RPC: /gnmi.gNMI/Subscribe with tag id: request_1239761477
Sep 16 22:26:42 ProcessInPendingCall: Received call cancellation for RPC: /gnmi.gNMI/Subscribe
Sep 16 22:26:42 DelCallTag: Tag id request_1239761477 is removed from tag map for client nso
Sep 16 22:26:42 ProcessCall: Removing the cancelled call details for tag: request_261625440
Sep 16 22:26:42 ~GrpcCall:Invoking destructor
Sep 16 22:26:42 DecrRefCount: rpc request is /gnmi.gNMI/Subscribe
Sep 16 22:26:42 DecrRefCount: Reducing Ref count for Package gnmi
Sep 16 22:26:42 ~GrpcToJapi:Invoking destrutor
Sep 16 22:26:42 DelCallTag: Tag id request_261625440 is removed from tag map for client nso
Sep 16 22:26:42 GetPeerConnInfo: getpeername for client fd 15 failed with error: Bad file descriptor
Sep 16 22:26:42 IsClientConnected: getpeername for the client ipv6:::ffff:xxxxxxxxx failed with error: Bad file descriptor
Sep 16 22:26:42 ~GrpcClientManager: Invoking destructor for client nso
Sep 16 22:26:42 ~GrpcClientManager:Termination sent to handler thread
Sep 16 22:26:42 DeleteChannel: Deleting entries in tag map for client ipv6:::ffff:xxxxxx:26647
Sep 16 22:26:42 ~GrpcClientManager: Cleaning out the channel map for client nso
Sep 16 22:26:42 GrpcHandleCall:Grpc handler cq is shutdown
Sep 16 22:26:42 DeleteChannel: Tag map is empty for client ipv6:::ffff:xxxxxx:26647
Sep 16 22:26:42 GrpcHandlerThreadCleanup:Executing grpc handler thread exit handler
Sep 16 22:26:42 DestroyClientInfo: peer ipv6:::ffff:xxxxxxx:26647 deleted from client map
Sep 16 22:26:45 jsd_config_read:Reading configuration...
Sep 16 22:26:45 readGrpcConfig:Inside ConfigManager::readGrpcConfig
Sep 16 22:26:45 readGrpcConfig:grpc ssl server knob is configured
Sep 16 22:26:45 readGrpcConfig: No change in certificate as crc before(32075) and after(32075) match
Sep 16 22:26:45 readGrpcConfig:mutual-authentication knob is configured
Sep 16 22:26:45 readGrpcConfig:request-response grpc is unchanged
Sep 16 22:26:45 readConfig:Notification service is not configured
Sep 16 22:26:45 readConfig:Notification service is not configured
Sep 16 22:26:45 jsd_sensor_config_read:jsd_sensor_config_read Invoked.
Sep 16 22:26:45 jsd_config_read_streaming_servers:No streaming servers available, delete all old servers.
Sep 16 22:26:45 jsd_config_read_export_profiles:No export profiles to read..
Sep 16 22:26:45 jsd_config_read_sensors:No sensors avaialble to read, delete all old sensors.
Sep 16 22:26:45 jsd_config_read_sensors:DAX walk failed for jsd_config_single_sensor.
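
As a quick check of the interval quoted above, the gap between the last pushed tag (20:26:22) and the call cancellation (22:26:42) can be computed directly. This is only a small sketch: the helper name `seconds_between` is made up for illustration, and it assumes both log entries fall on the same day, as they do in this log.

```python
from datetime import datetime


def seconds_between(first: str, second: str) -> float:
    """Return the seconds between two same-day syslog-style timestamps."""
    # Only the HH:MM:SS field (third token) is parsed; both entries are
    # assumed to share the same date.
    fmt = "%H:%M:%S"
    t1 = datetime.strptime(first.split()[2], fmt)
    t2 = datetime.strptime(second.split()[2], fmt)
    return (t2 - t1).total_seconds()


gap = seconds_between("Sep 16 20:26:22", "Sep 16 22:26:42")
print(gap)  # 7220.0, i.e. 2 hours and 20 seconds
```

A gap this close to 7200 seconds is what points towards a 2-hour timer somewhere in the path rather than a random disconnect.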

akarneliuk commented 2 years ago

Hello @andsouth44 ,

Thanks for the heads-up. I must admit, we never had a need to test keeping a gNMI session open for such a long period of time. Could you please describe your use case, sir? I tend to think that there are some timers which we can tweak if needed.

In that regard, there will be some modifications in terms of inactivity timeout which @jbemmel has recently made, and which will be in a new release we will publish shortly.

Best, Anton

andsouth44 commented 2 years ago

Hi Anton, I want to be able to detect and send a gNMI message when interfaces become operationally active on routers. I then use the message to invoke automation workflows. The interfaces may become active at any time and there may be very long quiet periods between events. Therefore I need the gNMI subscription to stay open indefinitely, and I need a notification if the subscription is closed for any reason. Maybe you can point me in the direction of the timers you think need to be tweaked and I can work on an enhancement myself? Also, very interested to see the release - do you have a date?

Thanks Andy
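
The requirement to be notified when the subscription closes for any reason could be sketched roughly as below. This is not pygnmi code: `watch_stream`, `on_message` and `on_closed` are hypothetical names, and the sketch assumes the subscription is exposed as a Python iterator that either ends or raises when the underlying stream is closed.

```python
from typing import Callable, Iterable, Optional


def watch_stream(
    stream: Iterable,
    on_message: Callable[[object], None],
    on_closed: Callable[[Optional[BaseException]], None],
) -> None:
    """Consume a stream and report when it ends, with the error if any."""
    try:
        for message in stream:
            on_message(message)
    except Exception as exc:
        # The stream failed or was cancelled: pass the error along.
        on_closed(exc)
    else:
        # The stream ended without raising.
        on_closed(None)
```

In this sketch `on_closed` is the hook where an alert or a re-subscribe attempt would go; whether a silent timeout surfaces as an exception or as a clean end of stream depends on the client and server involved.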

jbemmel commented 2 years ago

Hi Anton,

The timeout that is modified in my patch only applies to the initial connect(); it does not affect this long idle connection use case.

We probably want to look at https://github.com/grpc/grpc/blob/master/doc/keepalive.md for this (defaults to 2 hours for the server side)

Regards, Jeroen

akarneliuk commented 2 years ago

Ah, gotcha. So there are two different things, then. Here we are talking about an idle timeout for subscribe. Thinking out loud, there should be a heartbeat in your gNMI subscription. Could you please share your message, sir?

andsouth44 commented 2 years ago

Thanks for the responses, guys! I have tried setting the heartbeat_interval in my gNMI subscriptions, but I get a "not supported" error from the server - I confirmed that neither Cisco nor Juniper currently supports the heartbeat_interval feature for gNMI. So, this takes me back to my original question about inactivity timeouts: assuming I can't set the heartbeat_interval, maybe I can extend or disable the timeout? However, from reading the links provided in this thread, it looks like the timeout I am running into is a TCP timeout. Is that correct?
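
For reference, a subscription request carrying a heartbeat_interval might be built along these lines. This is a sketch only: the helper name and the OpenConfig path are assumptions for illustration, the dictionary layout follows the subscribe format shown in the pygnmi README as far as I know it, and the interval is given in nanoseconds as the gNMI specification defines it. As noted above, a server that does not implement the field may simply reject the request.

```python
from typing import Optional

NANOSECONDS_PER_SECOND = 1_000_000_000


def build_subscribe_request(path: str, heartbeat_s: Optional[float] = None) -> dict:
    """Build an ON_CHANGE stream subscription, optionally with a heartbeat."""
    entry = {"path": path, "mode": "on_change"}
    if heartbeat_s is not None:
        # gNMI expresses heartbeat_interval in nanoseconds.
        entry["heartbeat_interval"] = int(heartbeat_s * NANOSECONDS_PER_SECOND)
    return {"subscription": [entry], "mode": "stream", "encoding": "json"}


# Hypothetical path: interface oper-status, re-sent at least once an hour.
request = build_subscribe_request(
    "openconfig-interfaces:interfaces/interface/state/oper-status",
    heartbeat_s=3600,
)
```

Leaving `heartbeat_s` unset produces the same request without the field, which is the fallback when the target returns "not supported".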

akarneliuk commented 2 years ago

Hello @andsouth44 ,

My understanding of the info from the shared link is that the gNMI server (your Juniper router) and the client (in this case pygnmi) must agree on the keepalive settings. By the looks of it, the specification suggests that no keepalive data is sent without active calls.

I think, ultimately, you can implement keepalive logic yourself in your script by occasionally (e.g., once per hour) sending a Get request or a Poll message for the subscription.

Best, Anton
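
A script-level keepalive of the kind suggested above could look roughly like this. It is a minimal sketch using only the standard library: `start_keepalive` and `probe` are hypothetical names, and `probe` stands for whatever cheap request the script already knows how to send (for example a Get on a small path). Whether such a probe actually resets the timer in question depends on where the timeout lives, which the thread has not established.

```python
import threading
from typing import Callable


def start_keepalive(
    probe: Callable[[], object],
    interval_s: float,
    stop_event: threading.Event,
) -> threading.Thread:
    """Call probe() every interval_s seconds until stop_event is set."""

    def _run() -> None:
        # Event.wait() returns False on timeout and True once the event is set,
        # so the loop runs one probe per interval and exits promptly on stop.
        while not stop_event.wait(interval_s):
            try:
                probe()
            except Exception as exc:
                # A failed probe is a strong hint that the session is gone.
                print(f"keepalive probe failed: {exc}")
                stop_event.set()

    thread = threading.Thread(target=_run, daemon=True)
    thread.start()
    return thread


# Usage sketch (gc is assumed to be an already connected client object):
#   stop = threading.Event()
#   start_keepalive(lambda: gc.get(path=["/"]), interval_s=3600, stop_event=stop)
#   ...
#   stop.set()
```

Running the probe in a daemon thread keeps it out of the way of the loop that reads subscription updates, and a failing probe doubles as a signal that the session needs to be rebuilt.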

andsouth44 commented 2 years ago

Hi Anton, Thanks for the message. Yes, I could try to add a keepalive to my script but it feels like a bit of a hack. The heartbeat feature or something similar would still be my first choice.

BTW - My understanding is that the gNMI heartbeat feature will soon be deprecated, which will mean that responsibility for keeping connections alive will be with the client. Have you heard this too? Do you have any plans to enhance pygnmi accordingly?

Thanks Andrew

akarneliuk commented 2 years ago

Hello @andsouth44 ,

I think that is a fundamental question. So far, the idea behind pygnmi is to be a pure Python implementation of the gNMI API. What you mention lies outside the gNMI API; it is rather application logic, which you need to build on top of the gNMI API. As such, we'll take a look at the news you mentioned and how it relates to pygnmi's core purpose.

Best, Anton

andsouth44 commented 2 years ago

Hi Anton, ok, I understand what you are saying about the scope of pygnmi - that makes sense. I will try and add something to my application logic as you suggest.

Thanks Andy

jbemmel commented 2 years ago

Just a note: I found https://github.com/grpc/grpc/issues/15260

It has an example of gRPC settings you can tune to enable keepalives for idle connections. pygnmi does "support" these parameters in the sense that they are forwarded transparently to the underlying grpc stack.

It will still depend on what the server side supports, but there are some options to explore.

Regards, Jeroen
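
To make the kind of settings meant here concrete, the sketch below builds a list of client-side gRPC channel arguments for keepalive pings. The argument names are the ones described in the gRPC keepalive document linked earlier in this thread. The helper name and the default values are assumptions for illustration, and how pygnmi accepts such options is not spelled out in this thread, so check the release you are using.

```python
from typing import List, Tuple


def keepalive_channel_options(
    interval_s: int = 600, timeout_s: int = 20
) -> List[Tuple[str, int]]:
    """Return gRPC channel arguments that enable client-side keepalive pings."""
    return [
        # Send an HTTP/2 ping after this much time without activity.
        ("grpc.keepalive_time_ms", int(interval_s * 1000)),
        # Treat the connection as dead if the ping is not acknowledged in time.
        ("grpc.keepalive_timeout_ms", int(timeout_s * 1000)),
        # Allow pings even when no RPC is currently active.
        ("grpc.keepalive_permit_without_calls", 1),
        # 0 removes the limit on pings sent without intervening data.
        ("grpc.http2.max_pings_without_data", 0),
    ]


# With the grpc package, a list like this is what the `options` argument of
# grpc.insecure_channel() / grpc.secure_channel() expects, for example:
#   channel = grpc.insecure_channel("192.0.2.1:57400",
#                                   options=keepalive_channel_options())
# (address and port above are placeholders)
```

The 10-minute default is deliberately conservative: the keepalive document describes servers closing connections from clients that ping more often than the server allows, so very short intervals can backfire.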

jbemmel commented 2 years ago

Sent a PR: https://github.com/akarneliuk/pygnmi/pull/49 (untested)

Jeroen

andsouth44 commented 2 years ago

Thanks, Jeroen! This is very helpful.

akarneliuk commented 2 years ago

Hello @jbemmel , I've merged that, thank you. Will write some unit tests to see how it goes.

akarneliuk commented 2 years ago

Hey @jbemmel , @andsouth44 , the feature has been added in release 0.6.2. Let me know if anything else is needed.