Closed steffenschumacher closed 3 years ago
Hi @steffenschumacher , thanks for submitting this.
Your configuration and query are OK, so let me analyse the data stored in InfluxDB, assuming you are polling the device every 60 seconds:
...
t1: 1583454480000000000 127277.80955456638
t2: 1583454540000000000 729753.6578762988
t3: 1583454881000000000 432340776743607940 <<<
t4: 1583454904000000000 9849.831821863118
t5: 1583454960000000000 9286.73326600024
...
Time ID | Timestamp | Value | Elapsed from previous (s) |
---|---|---|---|
t1 | 1583454480000000000 | 127277.80955456638 | 60 |
t2 | 1583454540000000000 | 729753.6578762988 | 60 |
t3 | 1583454881000000000 | 432340776743607940 | 341 |
t4 | 1583454904000000000 | 9849.831821863118 | 23 |
t5 | 1583454960000000000 | 9286.73326600024 | 56 |
t6 | 1583455020000000000 | 8042.6583283225955 | 60 |
As you can see in the Elapsed from previous (s) column of the table above, there seems to be a period between t2->t3 during which the metric was not being retrieved.
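The elapsed column can be reproduced from the raw timestamps. The following is a minimal sketch, assuming the timestamps are nanoseconds since the Unix epoch (InfluxDB's default precision); the names `timestamps_ns`, `elapsed_seconds` and `POLL_INTERVAL_S` are illustrative and not part of snmpcollector.

```python
# Timestamps copied from the table above (assumed to be nanoseconds since epoch).
timestamps_ns = {
    "t1": 1583454480000000000,
    "t2": 1583454540000000000,
    "t3": 1583454881000000000,
    "t4": 1583454904000000000,
    "t5": 1583454960000000000,
    "t6": 1583455020000000000,
}

POLL_INTERVAL_S = 60  # polling interval stated in the issue


def elapsed_seconds(samples):
    """Return the gap in whole seconds between each sample and the previous one."""
    names = list(samples)
    return {
        cur: (samples[cur] - samples[prev]) // 1_000_000_000
        for prev, cur in zip(names, names[1:])
    }


gaps = elapsed_seconds(timestamps_ns)
# Keep only the samples that did not arrive exactly one polling interval apart.
irregular = {name: gap for name, gap in gaps.items() if gap != POLL_INTERVAL_S}

print(gaps)
print(irregular)
```

The irregular gaps cluster around t3, which is also the sample carrying the extreme value.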
So, in order to handle counter overflow in case the metric is not retrieved for some interval, we recommend you to:
To review what is happening to the device and why the metrics are not being pulled, we recommend you to:
Set the log level to DEBUG and review the logs (you can do it from the device config or directly at runtime). Also look at the t3-->t4 interval; maybe the device became unresponsive.
Thanks, Regards!
Hmm ok, I guess that's worth a try - the devices we have globally will now and then be unreachable, so for certain shorter durations polling will be disrupted. But if the theory is correct - namely that stretching the polling interval to e.g. 341 seconds will cause counter overflow (64-bit counter) - then, assuming this occurs for the counter incrementing octets at 10 Mbps, the counter should overflow every 2^63 (assuming signed) / (10 mbps*8bit) = 115292150460 seconds, or every 3655 years. So that's why I still don't 100% understand how it could be an overflow issue - unless they really WERE 32-bit counters, in which case they would overflow every 53 seconds and it would make a whole lot of sense. But they must be 64 bits, since the values inserted are > 32 bits.
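As a cross-check of the wrap-time estimate, here is a small sketch under stated assumptions: the counter counts octets, the link runs at a sustained 10 Mbps, and SNMP `Counter64` is unsigned (so the 64-bit case is the relevant one; 63 is included only to mirror the "signed" assumption above). Note that an octet counter advances at the bit rate divided by 8, whereas the figure quoted above multiplies by 8, so the real wrap time is longer still. The function name `seconds_to_wrap` is illustrative.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def seconds_to_wrap(counter_bits, link_bps):
    """Seconds until an octet counter of the given width wraps at a sustained bit rate."""
    octets_per_second = link_bps / 8  # 8 bits per octet
    return 2 ** counter_bits / octets_per_second


for bits in (32, 63, 64):
    s = seconds_to_wrap(bits, 10_000_000)
    print(f"{bits}-bit: {s:,.0f} s (~{s / SECONDS_PER_YEAR:,.2f} years)")
```

Under these assumptions a 32-bit octet counter would take on the order of an hour to wrap at line rate, and a 64-bit one hundreds of thousands of years, so a 341-second gap alone would not be expected to wrap either width on a 10 Mbps link.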
Anyway, I'll try to start logging and set up a separate measurement of this without get rate.
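For context on why a raw (non-rate) measurement helps, here is a generic sketch of how a per-second rate is commonly derived from two counter samples with modular wrap correction. This is an assumption about the general technique, not snmpcollector's actual code, and the sample values are hypothetical. It illustrates that if a counter ever goes backwards (for example after a device or interface reset), wrap correction turns a small negative delta into an enormous positive one.

```python
def rate_per_second(prev_value, cur_value, elapsed_s, counter_bits=64):
    """Per-second rate between two counter samples, treating a decrease as a wrap."""
    modulus = 2 ** counter_bits
    # Python's % returns a non-negative result for a positive modulus,
    # so a backwards step is mapped to (modulus - step).
    delta = (cur_value - prev_value) % modulus
    return delta / elapsed_s


# Hypothetical samples: counter advancing normally over 60 s.
normal = rate_per_second(1_000_000, 1_600_000, 60)

# Hypothetical samples: counter restarted near zero between polls.
after_reset = rate_per_second(1_000_000, 500, 60)

print(normal)
print(after_reset)
```

Storing the raw counter next to the computed rate makes it possible to see whether the counter itself jumped or went backwards around the anomalous sample.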
Hi @steffenschumacher ,
As you have said, the counter shouldn't overflow. In order to review it (even though we don't have any issues related to IfMIB counters), I need to ask you the following:
As you have said, please try to get some logs and see what is being gathered!
Thanks, Regards!
@steffenschumacher I will close this issue due to inactivity; you can reopen it if needed.
Setup: docker.io/hyber/snmpcollector latest(v0.8.0) c4fdd9a88fdf

Target: Cisco C1111-8P router, interface Gi0/0/0, ~300 ms RTD away from collector, with a 10 Mbps WAN link, polled every 60 secs

Issue: extreme rate data - possibly overflow related, however not obvious given the relatively high polling frequency + 64-bit counters:
This oid is configured as:
Note: this is seen on various hardware. Cisco: C1111-8P, C3560CX, C3560V2, C892. Riverbed: Steelhead CXA-00255-B020.
Suggestion: this could be mitigated if a cap value could be provided for each OID, so that any sample exceeding the cap is not inserted. Obviously a proper fix is preferred.
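The cap idea could look roughly like the sketch below. It is purely illustrative: `apply_cap` and `CAP_OCTETS_PER_S` are hypothetical names, not an existing snmpcollector option, and the cap assumes the stored rate is in octets per second on a 10 Mbps link.

```python
from typing import Optional

# Hypothetical cap: 10 Mbps expressed in octets per second.
CAP_OCTETS_PER_S = 10_000_000 / 8


def apply_cap(value: float, cap: Optional[float]) -> Optional[float]:
    """Return the value unchanged, or None if it exceeds the configured cap."""
    if cap is not None and value > cap:
        return None  # caller skips the insert for this sample
    return value


# Rate values taken from the data shown earlier in this thread.
samples = [
    127277.80955456638,
    729753.6578762988,
    432340776743607940,
    9849.831821863118,
]

kept = [v for v in samples if apply_cap(v, CAP_OCTETS_PER_S) is not None]
print(kept)
```

A cap only hides the symptom: the anomalous sample is dropped, but the reason it was produced still has to be found in the logs.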