christianuhlmann closed this issue 6 years ago
Adding the original output from the storcli commands.
Can you also post the output of the plugin with "-vvv" enabled?
Can you also check whether the storcli commands return exit code 0 and whether the controller number is correct? I have a test script in which my storcli is just a wrapper that prints the output on the command line and lets the plugin parse it:
./check_lsi_raid -p ./storcli-testing -C 0 -vvv
Use of uninitialized value $controllerToCheck{"ROC temperature"} in concatenation (.) or string at ./check_lsi_raid line 1126.
Critical (LD Warn, PD Warn, BBU Warn, CV Crit) [CV_Replacement_required = Critical (Yes)][c0/v1_State = Critical (Dgrd)][c0/v2_State = Critical (BLa)][c0/e252/s1_State = Critical (Dgrd)][c0/e252/s2_State = Critical (Offln)][BBU_State = Warning (Failed)][BBU_Voltage = Warning (Low)][c0/v0_Consist = Warning (No)][c0/v1_Consist = Warning (No)][c0/v1_Init = Warning (25)][c0/e252/s0_BBM_error_count = Warning (15)][c0/e252/s1_Predictive_failure_count = Warning (10)][c0/e252/s1_SMART_flag = Warning][c0/e252/s2_Shield_counter = Warning (5)][c0/e252/s2_Media_error_count = Warning (3)][c0/e252/s2_Init = Warning (33)][c0/e252/s1_Rebuild = Warning (25)][c0/e252/s3_Rebuild = Warning (60)]|BBU_Temperature=28;50;60 CV_Temperature=23;70;85 c0/e252/s0_Drive_Temperature=31;40;45 c0/e252/s1_Drive_Temperature=30;40;45
Used storcli commands:
- /usr/bin/sudo ./storcli-testing /c0 /bbu show status
- /usr/bin/sudo ./storcli-testing /c0 /cv show status
- /usr/bin/sudo ./storcli-testing adpallinfo a0 -NoLog
- /usr/bin/sudo ./storcli-testing /c0/vall show all
- /usr/bin/sudo ./storcli-testing /c0/vall show init
- /usr/bin/sudo ./storcli-testing /c0/eall/sall show all
- /usr/bin/sudo ./storcli-testing /c0/eall/sall show initialization
- /usr/bin/sudo ./storcli-testing /c0/eall/sall show rebuild
Critical sensors:
- CV_Replacement_required (Yes)
- c0/v1_State (Dgrd)
- c0/v2_State (BLa)
- c0/e252/s1_State (Dgrd)
- c0/e252/s2_State (Offln)
Warning sensors:
- BBU_State (Failed)
- BBU_Voltage (Low)
- c0/v0_Consist (No)
- c0/v1_Consist (No)
- c0/v1_Init (25)
- c0/e252/s0_BBM_error_count (15)
- c0/e252/s1_Predictive_failure_count (10)
- c0/e252/s1_SMART_flag
- c0/e252/s2_Shield_counter (5)
- c0/e252/s2_Media_error_count (3)
- c0/e252/s2_Init (33)
- c0/e252/s1_Rebuild (25)
- c0/e252/s3_Rebuild (60)
CTR information:
- LSI MegaRAID SAS 9260-8i:
- Serial No=SV13800819
- FW Package Build=12.12.0-0111
- Mfg. Date=09/10/11
- Revision No=79B
- BIOS Version=3.24.00_4.12.05.00_0x05160000
- FW Version=2.130.353-1663
- ROC temperature=
I have inserted your output from above, and apart from the ROC temperature the output seems fine and the plugin can parse it. The output from "show time" also seems valid.
Therefore I assume we have some trouble with the setup...
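The exit-code check suggested above could be scripted roughly as follows. This is only a sketch, not part of the plugin: the argument lists are copied from the "Used storcli commands" section of the -vvv output, while the function names (`run_command`, `check_exit_codes`, `failed_commands`) and the convention of reporting 127 for a missing binary are assumptions made for illustration.

```python
import subprocess
from typing import Callable, Dict, List

# Argument lists copied from the "Used storcli commands" section of the
# plugin's -vvv output; the storcli path and sudo prefix are supplied by the caller.
STORCLI_ARGS: List[List[str]] = [
    ["/c0", "/bbu", "show", "status"],
    ["/c0", "/cv", "show", "status"],
    ["adpallinfo", "a0", "-NoLog"],
    ["/c0/vall", "show", "all"],
    ["/c0/vall", "show", "init"],
    ["/c0/eall/sall", "show", "all"],
    ["/c0/eall/sall", "show", "initialization"],
    ["/c0/eall/sall", "show", "rebuild"],
]


def run_command(argv: List[str]) -> int:
    """Run one command, discard its output, and return its exit status."""
    try:
        completed = subprocess.run(
            argv, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
    except FileNotFoundError:
        # Assumption for this sketch: report a missing binary as 127,
        # the value a shell would use for "command not found".
        return 127
    return completed.returncode


def check_exit_codes(
    prefix: List[str], runner: Callable[[List[str]], int] = run_command
) -> Dict[str, int]:
    """Run every storcli command with the given prefix and map it to its exit code."""
    results: Dict[str, int] = {}
    for args in STORCLI_ARGS:
        argv = prefix + args
        results[" ".join(argv)] = runner(argv)
    return results


def failed_commands(results: Dict[str, int]) -> List[str]:
    """Return the commands whose exit code was not 0, in execution order."""
    return [cmd for cmd, code in results.items() if code != 0]
```

Calling `failed_commands(check_exit_codes(["/usr/bin/sudo", "/usr/sbin/storcli"]))` as the monitoring user (nagios or icinga) would list any command that does not return 0 in that user's environment. The `runner` parameter exists only so the logic can be exercised without a real controller.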
Hi, thanks for your analysis. Unfortunately I do not know what is wrong with the setup. Do you have a hint for me?
Here is the output with -vvv:
'/usr/lib64/nagios/plugins/check_lsi_raid' '-C' '0' '-p' '/usr/sbin/storcli' -vvv
Use of uninitialized value $controllerToCheck{"ROC temperature"} in concatenation (.) or string at /usr/lib64/nagios/plugins/check_lsi_raid line 1127.
Critical (LD Warn, BBU Crit) [BBU_Firmware_temperature = Critical (High)][c0/v0_Consist = Warning (No)][c0/v1_Consist = Warning (No)]|BBU_Firmware_temperature=High BBU_Temperature=46;50;60 c0/e252/s4_Drive_Temperature=34;40;45 c0/e252/s5_Drive_Temperature=36;40;45 c0/e252/s6_Drive_Temperature=34;40;45
Used storcli commands:
- /usr/sbin/storcli /c0 /bbu show status
- /usr/sbin/storcli adpallinfo a0 -NoLog
- /usr/sbin/storcli /c0/vall show all
- /usr/sbin/storcli /c0/vall show init
- /usr/sbin/storcli /c0/eall/sall show all
- /usr/sbin/storcli /c0/eall/sall show initialization
- /usr/sbin/storcli /c0/eall/sall show rebuild
Critical sensors:
- BBU_Firmware_temperature (High)
Warning sensors:
- c0/v0_Consist (No)
- c0/v1_Consist (No)
CTR information:
- LSI MegaRAID SAS 9260-8i:
- Serial No=SV13800819
- FW Package Build=12.12.0-0111
- Mfg. Date=09/10/11
- Revision No=79B
- BIOS Version=3.24.00_4.12.05.00_0x05160000
- FW Version=2.130.353-1663
- ROC temperature=
LD information:
- c0/v0:
- Access=RW
- Cac=-
- Cache=RWTD
- Consist=No
- DG/VD=0/0
- Size=232.375
- State=Optl
- TYPE=RAID1
- ld=c0/v0
- sCC=ON
- c0/v1:
- Access=RW
- Cac=-
- Cache=RWTD
- Consist=No
- DG/VD=1/1
- Size=5.457
- State=Optl
- TYPE=RAID5
- ld=c0/v1
- sCC=ON
PD information:
- c0/e252/s0:
- DG=0
- DID=14
- Drive Temperature=N/A
- EID:Slt=252:0
- Intf=SATA
- Med=SSD
- Media Error Count=0
- Model=Samsung SSD
- Other Error Count=0
- PI=N
- Predictive Failure Count=0
- S.M.A.R.T alert flagged by drive=No
- SED=Y
- SeSz=512B
- Shield Counter=0
- Size=232.375GB
- Sp=850
- State=Onln
- pd=c0/e252/s0
- c0/e252/s1:
- DG=0
- DID=15
- Drive Temperature=N/A
- EID:Slt=252:1
- Intf=SATA
- Med=SSD
- Media Error Count=0
- Model=Samsung SSD
- Other Error Count=0
- PI=N
- Predictive Failure Count=0
- S.M.A.R.T alert flagged by drive=No
- SED=Y
- SeSz=512B
- Shield Counter=0
- Size=232.375GB
- Sp=850
- State=Onln
- pd=c0/e252/s1
- c0/e252/s4:
- DG=1
- DID=16
- Drive Temperature=34C (93.20 F)
- EID:Slt=252:4
- Intf=SATA
- Med=HDD
- Media Error Count=0
- Model=WDC WD30EFRX-68EUZN0
- Other Error Count=0
- PI=N
- Predictive Failure Count=0
- S.M.A.R.T alert flagged by drive=No
- SED=N
- SeSz=512B
- Shield Counter=0
- Size=2.728TB
- Sp=U
- State=Onln
- pd=c0/e252/s4
- c0/e252/s5:
- DG=1
- DID=17
- Drive Temperature=36C (96.80 F)
- EID:Slt=252:5
- Intf=SATA
- Med=HDD
- Media Error Count=0
- Model=WDC WD30EFRX-68EUZN0
- Other Error Count=0
- PI=N
- Predictive Failure Count=0
- S.M.A.R.T alert flagged by drive=No
- SED=N
- SeSz=512B
- Shield Counter=0
- Size=2.728TB
- Sp=U
- State=Onln
- pd=c0/e252/s5
- c0/e252/s6:
- DG=1
- DID=18
- Drive Temperature=34C (93.20 F)
- EID:Slt=252:6
- Intf=SATA
- Med=HDD
- Media Error Count=0
- Model=WDC WD30EFRX-68EUZN0
- Other Error Count=0
- PI=N
- Predictive Failure Count=0
- S.M.A.R.T alert flagged by drive=No
- SED=N
- SeSz=512B
- Shield Counter=0
- Size=2.728TB
- Sp=U
- State=Onln
- pd=c0/e252/s6
BBU information:
- BBU_Firmware_temperature=High
- BBU_Status=Critical
- BBU_Temperature=46
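To compare two plugin runs like the ones above, it can help to split the status line into its bracketed sensor entries and the performance data after the `|`. The following is a minimal sketch that assumes the line layout seen in this thread (`[name = Level (value)]` entries, then `label=value;warn;crit` items); `parse_plugin_line` and `SENSOR_RE` are hypothetical names and not part of check_lsi_raid.

```python
import re
from typing import Dict, List, Optional, Tuple

# Matches entries such as "[c0/v0_Consist = Warning (No)]" or
# "[c0/e252/s1_SMART_flag = Warning]" (the parenthesised value is optional).
SENSOR_RE = re.compile(
    r"\[(?P<name>[^\]=]+?)\s*=\s*(?P<level>\w+)(?:\s*\((?P<value>[^)]*)\))?\]"
)

Sensor = Tuple[str, str, Optional[str]]
Perf = Dict[str, Dict[str, Optional[str]]]


def parse_plugin_line(line: str) -> Tuple[List[Sensor], Perf]:
    """Split one plugin status line into sensor entries and performance data."""
    text, _, perf = line.partition("|")
    sensors = [
        (m["name"], m["level"], m["value"]) for m in SENSOR_RE.finditer(text)
    ]
    perfdata: Perf = {}
    for item in perf.split():
        label, _, rest = item.partition("=")
        fields = rest.split(";")
        # Values stay strings: one value in the output above ("High") is not numeric.
        perfdata[label] = {
            "value": fields[0],
            "warn": fields[1] if len(fields) > 1 else None,
            "crit": fields[2] if len(fields) > 2 else None,
        }
    return sensors, perfdata
```

Fed with the status line from the -vvv run, this yields `BBU_Firmware_temperature` as the only Critical sensor and `BBU_Temperature` with value 46 against thresholds 50 and 60, which matches the "Critical sensors" and "BBU information" sections of the verbose output.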
Hi again,
Very strange: today I checked again and now it works without any changes. It seems to have been a temporary problem.
Hi,
I use the following storcli version here:
This is on XenServer 7.4 or XCP-ng 7.4.1; the base is CentOS 7.
storcli runs flawlessly as root as well as under the users nagios and icinga; both have been granted the necessary rights via sudoers.
I do not use NRPE, but the check script keeps producing new errors:
Error 1:
Solution: replace the following lines
with
Output from the command line:
Error 2:
Code Lines:
Solution: no idea
Output from the command line:
At this point I stopped analyzing, but something seems to be wrong.
Can someone give me a hint?
Thanks and greetings,
Christian