mpounsett closed this 2 years ago
I think any kind of exception being raised during the taking of a measurement means that the measurement couldn't be taken. By definition that can't be a critical or warning, since you don't know the result of the measurement you were trying to take.
If, for example, you want to know "is this port answering" and the answer is "no" then the connect failure exception should be caught and a False or 0 (zero) result returned. If you're trying to measure whether a daemon is returning the correct content and you're getting an exception because of a timeout, that is the very definition of UNKNOWN. I don't think these two examples should be mixed in a single test; mixing them is the sort of thing that produces an unexpected exception and makes you want to return CRITICAL.
For that reason I'm going to mark this wontfix, because I don't think it's a bug that unhandled exceptions result in an UNKNOWN state.
If you've got an argument for why I'm wrong I'm willing to entertain the idea... I just can't think of a use case where I think this is a good idea.
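To make the port example above concrete, here is a minimal sketch using only the standard library. The function name `port_is_answering` and the choice of which exceptions count as "no" are assumptions for illustration, not part of nagiosplugin. A refused or timed-out connect is treated as a valid measurement of 0, while anything else (a DNS failure, for instance) is left unhandled so it would still surface as UNKNOWN.

```python
import socket


def port_is_answering(host, port, timeout=1.0):
    """Return 1 if a TCP connect succeeds, 0 if it is refused or times out.

    Hypothetical helper: only the failures that answer the question
    "is this port answering?" are caught. Other errors, such as
    socket.gaierror from a failed name lookup, propagate to the caller.
    """
    try:
        conn = socket.create_connection((host, port), timeout=timeout)
    except (ConnectionRefusedError, socket.timeout):
        # The measurement was taken and the answer is "no".
        return 0
    conn.close()
    return 1
```

A check built this way could return the 0 or 1 as an ordinary metric value and let a threshold decide whether 0 is CRITICAL, keeping unhandled exceptions reserved for the case where no measurement was possible.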
Original report by Christian Kauhaus (Bitbucket: ckauhaus, GitHub: ckauhaus).
spaans@fox-it.com:
I'd like to share one small bit of code which I found myself reusing again and again, for situations in which you need to signal that something went wrong, but cannot do that by crossing a (possibly unknown) threshold or raising an exception (because an exception might actually mean critical failure instead of unknown). This is the idiom I use:
If you find this a useful scenario as well, go ahead and put it into the nagiosplugin distribution.