markormesher / iperf-prometheus-collector

Expose iperf errors as prometheus metrics #31

Closed by gberche-orange 1 day ago

gberche-orange commented 3 days ago

Expected behavior

As a user, in order to visualize iperf errors in Prometheus/Grafana, I need iperf errors (server unreachable, server busy) to be exposed as metrics and/or labels.

Here is a first thought, with a new counter metric:

iperf_errors{ msg="the server is busy running a test. try again later"} 1
iperf_errors{ msg="unable to send control message: Bad file descriptor"} 2
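
A minimal sketch of how such a counter could be tracked and rendered in the Prometheus text exposition format. The helper names (`recordError`, `renderErrorCounters`, `escapeLabelValue`) are hypothetical and not taken from the collector's source; only the metric name `iperf_errors` and the `msg` label come from the proposal above. Note that free-form messages used as label values can produce a large number of distinct series.

```typescript
// Hypothetical helpers for illustration only; none of these names exist in the collector.

// Label values in the Prometheus text format must escape backslash, double quote and newline.
function escapeLabelValue(value: string): string {
  return value.replace(/\\/g, "\\\\").replace(/"/g, '\\"').replace(/\n/g, "\\n");
}

// Increment the in-memory count for one error message.
function recordError(counts: Map<string, number>, msg: string): void {
  counts.set(msg, (counts.get(msg) || 0) + 1);
}

// Render one sample line per distinct message, preceded by the TYPE hint.
function renderErrorCounters(counts: Map<string, number>): string {
  const lines: string[] = ["# TYPE iperf_errors counter"];
  counts.forEach((count, msg) => {
    lines.push(`iperf_errors{msg="${escapeLabelValue(msg)}"} ${count}`);
  });
  return lines.join("\n") + "\n";
}
```

Calling `recordError` once per failed run and serving the result of `renderErrorCounters` alongside the existing metrics would be one way to wire this in, assuming the collector keeps its state in memory between refreshes.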

Current behavior

iperf does not seem to include errors in the JSON output, and does not seem to support an option to do so; see https://github.com/esnet/iperf/blob/master/docs/invoking.rst

As a result, errors are currently only visible on the container stdout, as shown below:

[2024-10-16T11:32:35.085Z] Refreshing metrics...
[2024-10-16T11:32:35.122Z] Could not get iperf metrics for iperf-private-listener-r1-z2-2.domain.org Error: Command failed: iperf3 -c iperf-private-listener-r1-z2-2.domain.org --udp --json -p 443 -4 -Z -t 10 --bitrate 10m

    at ChildProcess.exithandler (node:child_process:402:12)
    at ChildProcess.emit (node:events:513:28)
    at maybeClose (node:internal/child_process:1100:16)
    at Socket.<anonymous> (node:internal/child_process:458:11)
    at Socket.emit (node:events:513:28)
    at Pipe.<anonymous> (node:net:301:12) {
  code: 1,
  killed: false,
  signal: null,
  cmd: 'iperf3 -c iperf-private-listener-r1-z2-2.domain.org --udp --json -p 443 -4 -Z -t 10 --bitrate 10m',
  stdout: '{\n' +
    '\t"start":\t{\n' +
    '\t\t"connected":\t[],\n' +
    '\t\t"version":\t"iperf 3.14",\n' +
    '\t\t"system_info":\t"Linux iperf-exporter-from-r1-z2-private-to-r1-z2-private-protocow8xzg 5.15.0-119-generic #129-Ubuntu SMP Fri Aug 2 19:25:20 UTC 2024 x86_64"\n' +
    '\t},\n' +
    '\t"intervals":\t[],\n' +
    '\t"end":\t{\n' +
    '\t},\n' +
    '\t"error":\t"the server is busy running a test. try again later"\n' +
    '}\n',
  stderr: ''
}
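
The stdout captured in the log above does carry a top-level `"error"` key, even though the command exited with code 1. One possible way to recover that message from the failure object is sketched below. The `ExecFailure` interface and `extractIperfError` function are hypothetical names introduced for illustration; the sketch assumes only the `code` and `stdout` fields visible in the log.

```typescript
// Hypothetical shape: only the fields visible on the error object in the log above.
interface ExecFailure {
  code?: number | null;
  stdout?: string;
}

// Returns the top-level "error" string from iperf3's JSON stdout, or undefined
// when stdout is empty, is not valid JSON, or has no string "error" field.
function extractIperfError(failure: ExecFailure): string | undefined {
  if (!failure.stdout) {
    return undefined;
  }
  try {
    const parsed: unknown = JSON.parse(failure.stdout);
    if (typeof parsed === "object" && parsed !== null && "error" in parsed) {
      const message = (parsed as { error: unknown }).error;
      return typeof message === "string" ? message : undefined;
    }
  } catch (_err) {
    // stdout was not valid JSON; treat it as "no message available".
  }
  return undefined;
}
```

Returning `undefined` rather than throwing keeps the metrics refresh loop running when the output cannot be parsed, which seems the safer default for an exporter.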
markormesher commented 1 day ago

This is partly addressed in #35 by exposing the number of failed tests. This tool isn't designed to be a complete interface into iperf, just a way to expose metrics, so I don't think it's appropriate to use it to consume arbitrary error messages.