openconfig / reference

This repository contains reference implementations, specifications and tooling related to OpenConfig-based network management.
Apache License 2.0
155 stars 88 forks source link

Clarify Bundling and importance of Timestamps #165

Closed gcsl closed 2 years ago

gcsl commented 2 years ago

Remove the underspecified reference to out of band negotiation on bundling and clarify when bundling is prohibited because it would destroy accurate timestamp information.

hellt commented 2 years ago

Hi @gcsl since this PR touches timestamps a bit, can someone clarify how the target should behave with regards to timestamps in Notifications, when a client requests a sampled subscription for a leaf that changes rarely.

For example, if a client subscribes to an interfaces description leaf with a stream mode and sample rate of 10s, can the target always report in its Notification message the timestamp at which the description was last changed? Or should it instead provide a timestamp at which the collection of said leaf happened, thus each Notification would contain different TS value? Or both is possible as per the specification?

robshakir commented 2 years ago

Hi @gcsl since this PR touches timestamps a bit, can someone clarify how the target should behave with regards to timestamps in Notifications, when a client requests a sampled subscription for a leaf that changes rarely.

For example, if a client subscribes to an interfaces description leaf with a stream mode and sample rate of 10s, can the target always report in its Notification message the timestamp at which the description was last changed? Or should it instead provide a timestamp at which the collection of said leaf happened, thus each Notification would contain different TS value? Or both is possible as per the specification?

The timestamp should be the time at which the leaf was set/sample was collected NOT the time at which the update was sent on the wire. I think we say this somewhere in the spec, but we should clarify if not.

The reasoning for this is that if you have, say, an interface in a DOWN state, and you consistently update the timestamp, it's not clear whether this was a transition in state, or whether the value has been static for the entire time.

(Sorry, I saw that you'd said something similar in another PR/issue, but hadn't got to replying.)

hellt commented 2 years ago

The timestamp should be the time at which the leaf was set/sample was collected NOT the time at which the update was sent on the wire. I think we say this somewhere in the spec, but we should clarify if not.

The reasoning for this is that if you have, say, an interface in a DOWN state, and you consistently update the timestamp, it's not clear whether this was a transition in state, or whether the value has been static for the entire time.

thanks @robshakir sorry, if the time at which the leaf was set is clear, NOT the time at which the update was sent on the wire is also clear; what is not entirely clear to me is what another valid TS value -sample was collected - specifically means?

For example, lets say a leaf /a was set to "UP" at TS=1 and stays in that state.

If I do a Subscribe RPC with sampling of 10s, can a target report Notificaion messages back to me with timestamps at which target internally reached out to management server of a device to get the state of /a with each sample interval?

What I understood so far, the target can report a static TS=1 as well, as this is a time at which this leaf changed its value from null or prev state.

robshakir commented 2 years ago

thanks @robshakir sorry, if the time at which the leaf was set is clear, NOT the time at which the update was sent on the wire is also clear; what is not entirely clear to me is what another valid TS value -sample was collected - specifically means?

There are two types of data on the system - something that is updated based on an event (e.g., oper-status changed at some point in time), and then data sources that are sampled (e.g., packet counters on an NPU). My intention with this was to say that in the case of the event-based updates, the timestamp is the time of the event. In the case of the 'sampled' there isn't really a 'timestamp' that the event occurred, just the timestamp that we read from the underlying NPU. You could say that this was the time that the collected sample was 'set', but this seemed a bit of a tenuous definition to me.

Does that make sense? Happy to discuss more if not.

For example, lets say a leaf /a was set to "UP" at TS=1 and stays in that state.

If I do a Subscribe RPC with sampling of 10s, can a target report Notificaion messages back to me with timestamps at which target internally reached out to management server of a device to get the state of /a with each sample interval?

No, this wouldn't be legal in my view. Since you're implying here that the value of /a became "UP" is the timestamp of the notification - so it should always be 1.

What I understood so far, the target can report a static TS=1 as well, as this is a time at which this leaf changed its value from null or prev state.

hellt commented 2 years ago

My intention with this was to say that in the case of the event-based updates, the timestamp is the time of the event. In the case of the 'sampled' there isn't really a 'timestamp' that the event occurred, just the timestamp that we read from the underlying NPU. You could say that this was the time that the collected sample was 'set', but this seemed a bit of a tenuous definition to me.

Does that make sense? Happy to discuss more if not.

That is where confusion sourced on my end. I was under the impression that gnmi Subscribe RPCs are agnostic of the underlying "nature" of the data on the system. Meaning that as a gNMI client I do not know if /a is meant to be event-driven or not.

The root for this question is sourced with the way a popular TSDB - Prometheus - works with metrics. As a TSDB, it expects to receive a constant influx of metrics, thus, if we put things into the networking domain, /interface/description should be scraped by Prometheus with a regular intervals, otherwise the metric will be reported as stale.

Another mental rule that I had in my mind was that gNMI Subscribe with sampling was meant to report back to a client a constant stream of updates for a certain Path, regardless if it is event-driven or not.

I still find this topic unregulated. Moreover, my previous experience with various vendors show that lots of them provide changing timestamps for a sampled subscribtion on a leaf with a rarely-changed vaue. I wish this would be clarified, otherwise there is a lot of flux in how targets report notifications for such leaves, which is likely not the designed intent of a spec?

robshakir commented 2 years ago

My intention with this was to say that in the case of the event-based updates, the timestamp is the time of the event. In the case of the 'sampled' there isn't really a 'timestamp' that the event occurred, just the timestamp that we read from the underlying NPU. You could say that this was the time that the collected sample was 'set', but this seemed a bit of a tenuous definition to me. Does that make sense? Happy to discuss more if not.

That is where confusion sourced on my end. I was under the impression that gnmi Subscribe RPCs are agnostic of the underlying "nature" of the data on the system. Meaning that as a gNMI client I do not know if /a is meant to be event-driven or not.

The client doesn't need to know whether it's event driven to implement this AFAICS, the server needs to set the timestamp correctly, and the client just accepts this as the time that this value 'became active' (see discussion earlier).

The root for this question is sourced with the way a popular TSDB - Prometheus - works with metrics. As a TSDB, it expects to receive a constant influx of metrics, thus, if we put things into the networking domain, /interface/description should be scraped by Prometheus with a regular intervals, otherwise the metric will be reported as stale.

So, this is a different question. Here you're saying the downstream TSDB needs me to 'refresh' a metric to keep it current. To do this, the client itself is basically saying "I don't care about the timestamp" -- if this is the case, then you're well within your rights to 'refresh' a value when inserting it into the TSDB with the timestamp of your choosing. The gNMI API is concerned with carrying the right information.

Now, there is the question of "how do I maintain the advantages of source timestamping for e.g., rates being more accurate, whilst also refreshing values into the TSDB". This can't be achieved by having the timestamp that the server sets be set to be equal to the timestamp it was sent, as this loses the advantage of timestamp accuracy, but neither can it be achieved by rewriting all timestamps at the client to the new value at the receiver. Instead, you need someone that is schema-aware to do this. In the OpenConfig models, we have begun labelling values as on-change in the schema, which allows the server to know what the expected mode is, but also would allow the client to be able to do such selective re-writing.

Another mental rule that I had in my mind was that gNMI Subscribe with sampling was meant to report back to a client a constant stream of updates for a certain Path, regardless if it is event-driven or not.

That's correct. The value is reported back, but the timestamp reflects when the event happened, just because we're asking for the value of /interfaces/interface/description doesn't mean that it changed every time.

I still find this topic unregulated. Moreover, my previous experience with various vendors show that lots of them provide changing timestamps for a sampled subscribtion on a leaf with a rarely-changed vaue. I wish this would be clarified, otherwise there is a lot of flux in how targets report notifications for such leaves, which is likely not the designed intent of a spec?

I think that this makes sense, I do not, however, necessarily feel that this depth is something that we should clarify in the base gNMI specification. I think we likely need an implementation note, or spec addendum that describes this in more detail. Would you be interested in starting a PR with something like this in, I'm happy to collaborate (and I expect @gcsl will have some input too).

Thanks, r.

hellt commented 2 years ago

thanks @robshakir this is a very interesting thread and something that was not at all clear to me when I read gnmi spec

Just to cross the t here, as per your comments this behaviour seems to be incorrect?

1sec sampled subscription is made targeting a leaf oper-state that stays up for the duration of a subscribtion the output indicates that TS is changing within the 1s window, whilist the value is not chaning.

❯ gnmic -a leaf1 --skip-verify -u admin -p admin sub --stream-mode sample --sample-interval 1s --path "/interface[name="ethernet-1/1"]/oper-state" 

{
  "source": "leaf1",
  "subscription-name": "default-1652205706",
  "timestamp": 1652205706685299943,
  "time": "2022-05-10T20:01:46.685299943+02:00",
  "updates": [
    {
      "Path": "interface[name=ethernet-1/1]/oper-state",
      "values": {
        "interface/oper-state": "up"
      }
    }
  ]
}
{
  "source": "leaf1",
  "subscription-name": "default-1652205706",
  "timestamp": 1652205707686242519,
  "time": "2022-05-10T20:01:47.686242519+02:00",
  "updates": [
    {
      "Path": "interface[name=ethernet-1/1]/oper-state",
      "values": {
        "interface/oper-state": "up"
      }
    }
  ]
}
robshakir commented 2 years ago

My interpretation is that the behaviour there is incorrect, yes. @gcsl @marcushines - WDYT?

robshakir commented 2 years ago

LGTM to merge, let's see whether Marcus wants to comment here on the change and then go forward.

earies commented 2 years ago

Just catching up on this thread and the PR contents itself.

First on the comments @hellt raised here - this behavior definitely needs to live somewhere otherwise we'll see completely different implementation behaviors as I'm sure can be witnessed across shipping gNMI implementations today. A change in this behavior as well impacts clients as they may now look to the time at which the Notification PDU was received to marshal to a TSDB vs. a possibly never changing Timestamp field in the Notification payload.

Second - I'd be now curious as to the opinion if we need any YANG leaves such as last-change associated to an object (picking on the interface oper-status example) since this is now conveyed via other means. These leaves can be looked at as redundant however if they are removed from the modeling itself, we negate other NBI short of equivalent protocol extensions.

As far as the PR contents - this LGTM per our last discussion. It looks like the revision history needs updating and it may be worth noting that the gNMI specification carries it's own versioning independent of the protobuf IDL which may call for a reference between the two (Most folks refer to adherence to the proto IDL solely but the implementation behaviors defined here and in other surrounding documents is >= of importance)

marcushines commented 2 years ago

thanks @robshakir this is a very interesting thread and something that was not at all clear to me when I read gnmi spec

Just to cross the t here, as per your comments this behaviour seems to be incorrect?

1sec sampled subscription is made targeting a leaf oper-state that stays up for the duration of a subscribtion the output indicates that TS is changing within the 1s window, whilist the value is not chaning.

❯ gnmic -a leaf1 --skip-verify -u admin -p admin sub --stream-mode sample --sample-interval 1s --path "/interface[name="ethernet-1/1"]/oper-state" 

{
  "source": "leaf1",
  "subscription-name": "default-1652205706",
  "timestamp": 1652205706685299943,
  "time": "2022-05-10T20:01:46.685299943+02:00",
  "updates": [
    {
      "Path": "interface[name=ethernet-1/1]/oper-state",
      "values": {
        "interface/oper-state": "up"
      }
    }
  ]
}
{
  "source": "leaf1",
  "subscription-name": "default-1652205706",
  "timestamp": 1652205707686242519,
  "time": "2022-05-10T20:01:47.686242519+02:00",
  "updates": [
    {
      "Path": "interface[name=ethernet-1/1]/oper-state",
      "values": {
        "interface/oper-state": "up"
      }
    }
  ]
}

So this is actually problematic as mostly the behavior around "proper" timestamping on notifications on a sample or poll basis is going to very based on the internal implementation details

so the sample must maintain the original edge timestamps of the event and simply "resend the notifications with old timestamps" and for values which changed they should have "exact" timestamp that the event was observed now depending on implementation it might only be "observable" the frequency of the poll or the sample so. you will lose some resolution

also in the case say oper-status your statement is correct if the previous sample was "UP" and your new sample is still "UP" then you could hold the original timestamp but also your internal implementation detail might hold intervening updates so sending a timestamp of the last observed change would also be correct since the device did observe "intervening" states

hellt commented 2 years ago

thanks @robshakir this is a very interesting thread and something that was not at all clear to me when I read gnmi spec Just to cross the t here, as per your comments this behaviour seems to be incorrect?

1sec sampled subscription is made targeting a leaf oper-state that stays up for the duration of a subscribtion the output indicates that TS is changing within the 1s window, whilist the value is not chaning.

❯ gnmic -a leaf1 --skip-verify -u admin -p admin sub --stream-mode sample --sample-interval 1s --path "/interface[name="ethernet-1/1"]/oper-state" 

{
  "source": "leaf1",
  "subscription-name": "default-1652205706",
  "timestamp": 1652205706685299943,
  "time": "2022-05-10T20:01:46.685299943+02:00",
  "updates": [
    {
      "Path": "interface[name=ethernet-1/1]/oper-state",
      "values": {
        "interface/oper-state": "up"
      }
    }
  ]
}
{
  "source": "leaf1",
  "subscription-name": "default-1652205706",
  "timestamp": 1652205707686242519,
  "time": "2022-05-10T20:01:47.686242519+02:00",
  "updates": [
    {
      "Path": "interface[name=ethernet-1/1]/oper-state",
      "values": {
        "interface/oper-state": "up"
      }
    }
  ]
}

So this is actually problematic as mostly the behavior around "proper" timestamping on notifications on a sample or poll basis is going to very based on the internal implementation details

so the sample must maintain the original edge timestamps of the event and simply "resend the notifications with old timestamps" and for values which changed they should have "exact" timestamp that the event was observed now depending on implementation it might only be "observable" the frequency of the poll or the sample so. you will lose some resolution

also in the case say oper-status your statement is correct if the previous sample was "UP" and your new sample is still "UP" then you could hold the original timestamp but also your internal implementation detail might hold intervening updates so sending a timestamp of the last observed change would also be correct since the device did observe "intervening" states

I don't quite grok the last sentence. It looks like we have to branch this thread to a separate issue to flesh out the way it should/must work.

What I recall from my tests is that only Arista EOS reports back same TS when a sampled subscription is made to a leaf with a constant value. But even then, they stop sending periodic updates with the same TS and just send a few updates and then streaming hangs. Others, and I did some tests on Nokia SR Linux, SR OS, Juniper vMX change the TS for leaves with constant values when the sampled subscribtion is in play

robshakir commented 2 years ago

Yes, let's fork this one please @hellt.

@gcsl - this looks ready to submit.