Open ghost opened 6 years ago
I can confirm it is hanging at get_table on line 531:
$resultat_param
= (version->parse(Net::SNMP->VERSION) < 4)
? $session->get_table($run_param_table)
: $session->get_table(Baseoid => $run_param_table);
The snmpd service logs the following when check_snmp_process.pl is run with the -A flag:
Jul 31 15:04:47 localhost snmpd[89732]: send response: Too long (plaintext scopedPDU header type 00: s/b 30)
Jul 31 15:04:47 localhost snmpd[89732]: -- HOST-RESOURCES-MIB::hrSWRunParameters.1025
Jul 31 15:04:47 localhost snmpd[89732]: -- HOST-RESOURCES-MIB::hrSWRunParameters.1036
Jul 31 15:04:47 localhost snmpd[89732]: -- HOST-RESOURCES-MIB::hrSWRunParameters.1037
Jul 31 15:04:47 localhost snmpd[89732]: -- HOST-RESOURCES-MIB::hrSWRunParameters.1038
Jul 31 15:04:47 localhost snmpd[89732]: -- HOST-RESOURCES-MIB::hrSWRunParameters.1041
Jul 31 15:04:47 localhost snmpd[89732]: -- HOST-RESOURCES-MIB::hrSWRunParameters.1179
Jul 31 15:04:47 localhost snmpd[89732]: -- HOST-RESOURCES-MIB::hrSWRunParameters.1180
Jul 31 15:04:47 localhost snmpd[89732]: -- HOST-RESOURCES-MIB::hrSWRunParameters.1181
Jul 31 15:04:47 localhost snmpd[89732]: -- HOST-RESOURCES-MIB::hrSWRunParameters.1182
Jul 31 15:04:47 localhost snmpd[89732]: -- HOST-RESOURCES-MIB::hrSWRunParameters.1185
Jul 31 15:04:47 localhost snmpd[89732]: -- HOST-RESOURCES-MIB::hrSWRunParameters.1420
Jul 31 15:04:47 localhost snmpd[89732]: -- HOST-RESOURCES-MIB::hrSWRunParameters.1422
Jul 31 15:04:47 localhost snmpd[89732]: -- HOST-RESOURCES-MIB::hrSWRunParameters.1425
Jul 31 15:04:47 localhost snmpd[89732]: -- HOST-RESOURCES-MIB::hrSWRunParameters.1433
Jul 31 15:04:47 localhost snmpd[89732]: -- HOST-RESOURCES-MIB::hrSWRunParameters.1434
Jul 31 15:04:47 localhost snmpd[89732]: -- HOST-RESOURCES-MIB::hrSWRunParameters.1438
Jul 31 15:04:47 localhost snmpd[89732]: -- HOST-RESOURCES-MIB::hrSWRunParameters.1444
Jul 31 15:04:47 localhost snmpd[89732]: -- HOST-RESOURCES-MIB::hrSWRunParameters.1445
Jul 31 15:04:47 localhost snmpd[89732]: -- HOST-RESOURCES-MIB::hrSWRunParameters.1446
Jul 31 15:04:47 localhost snmpd[89732]: -- HOST-RESOURCES-MIB::hrSWRunParameters.1449
Jul 31 15:04:47 localhost snmpd[89732]: -- HOST-RESOURCES-MIB::hrSWRunParameters.1454
Jul 31 15:04:47 localhost snmpd[89732]: -- HOST-RESOURCES-MIB::hrSWRunParameters.1462
Jul 31 15:04:47 localhost snmpd[89732]: -- HOST-RESOURCES-MIB::hrSWRunParameters.1464
Jul 31 15:04:47 localhost snmpd[89732]: -- HOST-RESOURCES-MIB::hrSWRunParameters.1467
Jul 31 15:04:47 localhost snmpd[89732]: -- HOST-RESOURCES-MIB::hrSWRunParameters.1469
Coincidentally, I also have a Java process with a long list of arguments. In my case the longest path + argument string is 784 characters long.
Increasing the octet length from the default 1472 to 4096 with the parameter "-o 4096" removes the error message from the snmpd logs, but check_snmp_process.pl still fails with no answer from host.
OK, I have done some more testing. It works if you change the code posted above to add maxrepetitions to the get_table request. Example:
$resultat_param
= (version->parse(Net::SNMP->VERSION) < 4)
? $session->get_table($run_param_table)
: $session->get_table(Baseoid => $run_param_table, maxrepetitions => 10);
According to the Net::SNMP documentation, maxrepetitions is calculated automatically if it is not given. If I understand correctly, it is how many rows [Net::SNMP](https://metacpan.org/pod/Net::SNMP#get_table()-retrieve-a-table-from-the-remote-agent) fetches per request. I tested how long the check took with different maxrepetitions values; the times below are the real execution time according to time. I used a 60-second timeout on check_snmp_process.pl.
maxrepetitions | Time |
---|---|
0 | 17.351s |
1 | 17.386s |
2 | 9.216s |
5 | 4.641s |
10 | 3.214s |
20 | 2.338s |
22 | 2.286s |
25 | failed |
30 | failed |
40 | failed |
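
Since the largest value that works probably depends on the host (here anything above 22 failed), one option is to try a few values in descending order and stop at the first that returns a table. This is only a rough sketch, not code from check_snmp_process.pl: the helper name `get_table_with_fallback` and the default candidate list are my own assumptions. It only relies on `get_table` (with `-baseoid` and `-maxrepetitions`) and `error`, which Net::SNMP session objects provide. As far as I know maxrepetitions only matters for SNMPv2c/v3.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of check_snmp_process.pl): call get_table with
# progressively smaller maxrepetitions values until one request succeeds.
# $session must behave like a Net::SNMP session: get_table() returns a hash
# ref on success or undef on failure, and error() returns the last error text.
sub get_table_with_fallback {
    my ($session, $baseoid, @candidates) = @_;

    # Assumed defaults, picked from the values that worked in the table above.
    @candidates = (20, 10, 5, 2, 1) unless @candidates;

    my @errors;
    for my $maxrep (@candidates) {
        my $table = $session->get_table(
            -baseoid        => $baseoid,
            -maxrepetitions => $maxrep,
        );

        # Return the table together with the value that worked.
        return ($table, $maxrep) if defined $table;

        push @errors, "maxrepetitions=$maxrep: " . $session->error();
    }

    die "get_table failed for $baseoid:\n  " . join("\n  ", @errors) . "\n";
}

# Possible use inside the plugin (variable names as in the snippet above):
#   my ($resultat_param, $used)
#       = get_table_with_fallback($session, $run_param_table);
```

The obvious downside is that every failed attempt has to run into the SNMP timeout first, so starting with a value that is known to work on most hosts is probably better than starting high.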
I have not yet identified what is causing this, but on one specific server the -A flag causes the check to hang and not match any processes. The same check against other servers works fine. We have some Java processes on this server with a very long list of arguments, and I wonder if that could be to blame.
I'll continue to try to debug further.