Uninett / zino

Zino 2.0 - Network state monitor for research networks
Apache License 2.0

Alarm count is not int #212

Closed runborg closed 1 week ago

runborg commented 2 months ago

After running the server for a while, I see quite a number of devices that return a string instead of an int alarm count. This gives the error:

Device XXX returns alarm count not of type int. Yellow alarm count: type <class 'str'>. Red alarm count: type <class 'str'>.

The error does not give any indication of what is inside the retrieved value. Is this an unsupported feature on the platform, or are there other issues?

Looking into this, it looks like the same routers keep reporting this error, and the number of affected routers is stable:

(zino-env) --> $ cat log  | grep ERROR | grep "Yellow alarm count:" | cut -d\  -f 9- | sort | uniq -c | sort -h | wc -l
48

48 of our routers return this, and I'm able to produce a list of affected routers if that would be of interest for troubleshooting.

johannaengland commented 1 month ago

The value causing this error is also logged, but at DEBUG level. This can be seen here: https://github.com/Uninett/zino/blob/master/src/zino/tasks/juniperalarmtask.py#L51-L55
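
For anyone who wants to see those values without turning on DEBUG globally, here is a minimal sketch using only the standard library `logging` module. It assumes the task module creates its logger with `logging.getLogger(__name__)`, so that the logger name follows the module path linked above; the helper name `enable_alarm_debug` is hypothetical and how Zino itself configures logging is not shown here.

```python
import logging

# Assumption: the logger name mirrors the module path src/zino/tasks/juniperalarmtask.py
LOGGER_NAME = "zino.tasks.juniperalarmtask"


def enable_alarm_debug(logger_name: str = LOGGER_NAME) -> logging.Logger:
    """Lower the threshold of one named logger to DEBUG (hypothetical helper)."""
    logger = logging.getLogger(logger_name)
    logger.setLevel(logging.DEBUG)
    return logger


# The root logger stays at INFO; only the alarm task logger emits DEBUG records.
# basicConfig is a no-op if the root logger already has handlers.
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s - %(levelname)s - %(name)s (%(threadName)s) - %(message)s",
)
logger = enable_alarm_debug()
```

Records from that one logger still propagate to the root handler, so they end up in the same log output as everything else.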

johannaengland commented 1 month ago

When changing the log level to DEBUG, we can see which values are recorded:

2024-05-29 15:07:21,953 - DEBUG - zino.tasks.juniperalarmtask (MainThread) - Yellow alarm count: value ''. Red alarm count: value ''.

And when running snmpget -v2c -c community host-name 1.3.6.1.4.1.2636.3.4.2.2.2 (jnxYellowAlarmCount) on such a router we get

iso.3.6.1.4.1.2636.3.4.2.2.2 = No Such Object available on this agent at this OID

So it seems this error happens for Juniper routers that do not have a yellow/red alarm count.
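
One possible way to handle this, sketched under assumptions: treat an empty or non-numeric value as "counter not available" rather than as an error, and include the raw values in any message that is logged. The function names `parse_alarm_count` and `describe_alarm_counts` are hypothetical and are not taken from the Zino code base; the sketch also assumes the SNMP layer hands back an empty string when the agent answers "No Such Object", as in the DEBUG output above.

```python
from typing import Optional


def parse_alarm_count(value: object) -> Optional[int]:
    """Return the alarm count as an int, or None if the value is not usable."""
    if isinstance(value, bool):
        # bool is a subclass of int, but it is never a valid counter value
        return None
    if isinstance(value, int):
        return value
    if isinstance(value, str):
        try:
            return int(value)
        except ValueError:
            # Covers the empty string seen for agents without this object
            return None
    return None


def describe_alarm_counts(device: str, yellow: object, red: object) -> Optional[str]:
    """Build a message naming the raw values, or None if both counts are usable."""
    if parse_alarm_count(yellow) is not None and parse_alarm_count(red) is not None:
        return None
    return (
        f"Device {device} did not return usable alarm counts. "
        f"Yellow alarm count: {yellow!r}. Red alarm count: {red!r}."
    )


print(describe_alarm_counts("XXX", "", ""))
```

Using `repr` in the message makes an empty string visible as `''`, which is the piece of information the current ERROR line leaves out.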

hmpf commented 3 weeks ago

This might be relevant for NAV as well.

lunkwill42 commented 1 week ago

> This might be relevant for NAV as well.

I don't think so. NAV's SNMP code is a bit more mature than Zino 2 :)

stveit commented 1 week ago

Not entirely closed until the changes in the alarmtask are done.