Closed ostasevych closed 9 months ago
Are you sure your Smart Array is still using the cciss driver?
Well, at least smartctl responds when given the cciss device type (you can see the temperature dynamics in the chart). However, there's no /dev/cciss directory, so I suppose it uses the hpsa driver. How can I check that?
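One rough way to check the driver, assuming a Linux host where lsmod is available, is to look for the hpsa or cciss module in the list of loaded modules. The helper name smart_array_driver below is made up for illustration, and the sketch only filters module names; it does not prove which driver a particular disk is bound to.

```shell
# Hypothetical helper: reads lsmod-style text on stdin and prints the
# Smart Array driver module(s) found in the first column.
smart_array_driver() {
    awk '$1 == "hpsa" || $1 == "cciss" { print $1 }'
}

# Intended usage on the affected host (not executed here):
#   lsmod | smart_array_driver
```

Running lspci -k and reading the "Kernel driver in use" line for the controller is another option. As far as I know, the smartctl device type -d cciss,N is also used with controllers driven by hpsa, so a working -d cciss,0 does not by itself mean the cciss kernel driver is loaded.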
One more question: how do I make the right-hand charts representative, so that the reallocated sector count and current pending sectors are shown as well?
Can you please paste here the output of smartctl -A /dev/sda and smartctl -A /dev/sda -d cciss,0?
Here it is:
# smartctl -A /dev/sda
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-6.5.0-17-generic] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org
/dev/sda: requires option '-d cciss,N'
Please specify device type with the -d option.
Use smartctl -h to get a usage summary
# smartctl -A /dev/sda -d cciss,0
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-6.5.0-17-generic] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 1
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x0032 100 100 050 Old_age Always - 0
5 Reallocated_Sector_Ct 0x0032 100 100 050 Old_age Always - 0
9 Power_On_Hours 0x0032 100 100 050 Old_age Always - 284
12 Power_Cycle_Count 0x0032 100 100 050 Old_age Always - 26
160 Unknown_Attribute 0x0032 100 100 050 Old_age Always - 0
161 Unknown_Attribute 0x0032 100 100 050 Old_age Always - 100
163 Unknown_Attribute 0x0032 100 100 050 Old_age Always - 0
164 Unknown_Attribute 0x0032 100 100 050 Old_age Always - 0
165 Unknown_Attribute 0x0032 100 100 050 Old_age Always - 0
166 Unknown_Attribute 0x0032 100 100 050 Old_age Always - 0
167 Unknown_Attribute 0x0032 100 100 050 Old_age Always - 0
168 Unknown_Attribute 0x0032 100 100 050 Old_age Always - 0
169 Unknown_Attribute 0x0032 100 100 050 Old_age Always - 100
175 Program_Fail_Count_Chip 0x0032 100 100 050 Old_age Always - 0
176 Erase_Fail_Count_Chip 0x0032 100 100 050 Old_age Always - 0
177 Wear_Leveling_Count 0x0032 100 100 050 Old_age Always - 0
178 Used_Rsvd_Blk_Cnt_Chip 0x0032 100 100 050 Old_age Always - 0
192 Power-Off_Retract_Count 0x0032 100 100 050 Old_age Always - 5
194 Temperature_Celsius 0x0032 100 100 050 Old_age Always - 22
232 Available_Reservd_Space 0x0032 100 100 050 Old_age Always - 100
241 Total_LBAs_Written 0x0032 100 100 050 Old_age Always - 1
242 Total_LBAs_Read 0x0032 100 100 050 Old_age Always - 223947
245 Unknown_Attribute 0x0032 100 100 050 Old_age Always - 1
The problem with the mapping is that Monitorix doesn't know how to tell the key from the value, since the key contains spaces. That's why Monitorix has the -s command-line option.
Try adding -s equalsign to the Monitorix command line.
You don't need to touch the systemd unit file; just modify /etc/sysconfig/monitorix, where you can add extra command-line arguments.
Then restart Monitorix, and you should see your map strings appear in the graph.
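To illustrate why the split policy matters for map keys that contain spaces, here is a rough shell sketch. It is not Monitorix or Config::General code: the two helper functions are hypothetical and only mimic splitting a line at the first whitespace versus at the first equal sign, assuming the default policy may split at whitespace.

```shell
# Hypothetical helpers that mimic two ways of finding the key in a
# "key = value" line; they are not part of Monitorix.
key_by_whitespace() {
    # Everything before the first space is taken as the key.
    printf '%s\n' "${1%% *}"
}

key_by_equalsign() {
    # Everything before the first "=" is the key; trailing blanks are trimmed.
    printf '%s\n' "${1%%=*}" | sed 's/[[:space:]]*$//'
}

line='/dev/sdb -d cciss,0 = "data sdd 1 array 500GB"'
key_by_whitespace "$line"   # prints: /dev/sdb
key_by_equalsign "$line"    # prints: /dev/sdb -d cciss,0
```

With the whitespace split the key loses its "-d cciss,0" part, so the map entry can no longer match the device string used in the list section.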
One more question: how do I make the right-hand charts representative, so that the reallocated sector count and current pending sectors are shown as well?
As long as the output of your smartctl command shows the information, Monitorix will show the values of these attributes.
-s equalsign
thanks, it works!
Perfect!
Hi, I would like to reopen this ticket.
So, I have tried to use /dev/disk/by-path instead of the physical /dev/sdX names and found that it still doesn't support spaces:
/etc/monitorix/monitorix.conf
# cat /etc/monitorix/monitorix.conf | grep -5 disk
...
<disk>
<list>
0 = /dev/disk/by-path/pci-0000:00:14.1-ata-1.1
# 1 = /dev/disk/by-path/pci-0000:02:00.0-scsi-0:1:0:0
# 1 = "/dev/disk/by-path/pci-0000:02:00.0-scsi-0:1:0:0 -d cciss,0", "/dev/disk/by-path/pci-0000:02:00.0-scsi-0:1:0:0 -d cciss,1"
1 = "/dev/sdb -d cciss,0", "/dev/sdb -d cciss,1"
2 = "/dev/disk/by-path/pci-0000:02:00.0-scsi-0:1:0:2 -d cciss,2", "/dev/disk/by-path/pci-0000:02:00.0-scsi-0:1:0:2 -d cciss,3"
# 2 = "/dev/sdc -d cciss,2", "/dev/sdc -d cciss,3"
3 = "/dev/disk/by-path/pci-0000:00:13.2-usb-0:1:1.0-scsi-0:0:0:0"
</list>
<desc>
0 = individual drive system disk 256GB
1 = RAID1 /dev/sdb data sdd array 500GB
2 = RAID1 /dev/sdc backup hdd array 1TB
3 = individual USB drive boot disk 2GB
</desc>
<map>
# pci-0000:02:00.0-scsi-0:1:0:0 = "data sdd array 500GB"
/dev/sdb -d cciss,0 = "data sdd 1 array 500GB"
/dev/sdb -d cciss,1 = "data sdd 2 array 500GB"
# pci-0000:02:00.0-scsi-0:1:0:2 = "backup hdd array 2TB"
/dev/sdc -d cciss,2 = "backup hdd 1 array 2TB"
/dev/sdc -d cciss,3 = "backup hdd 2 array 2TB"
pci-0000:00:14.1-ata-1.1 = "system disk 256GB"
pci-0000:00:13.2-usb-0:1:1.0-scsi-0:0:0:0 = "boot disk 2GB"
</map>
</alerts>
</disk>
...
Everything is fine if I use the physical drive names:
# 2 = "/dev/sdc -d cciss,2", "/dev/sdc -d cciss,3"
# smartctl -x /dev/disk/by-path/pci-0000:02:00.0-scsi-0:1:0:2 -d cciss,2 | grep Celsius
194 Temperature_Celsius -O---K 112 109 000 - 35
Current Temperature: 35 Celsius
Power Cycle Min/Max Temperature: 29/36 Celsius
Lifetime Min/Max Temperature: 6/38 Celsius
So, the problem is in this line:
2 = "/dev/disk/by-path/pci-0000:02:00.0-scsi-0:1:0:2 -d cciss,2", "/dev/disk/by-path/pci-0000:02:00.0-scsi-0:1:0:2 -d cciss,3"
I've added -s equalsign, as you suggested, to the default options used at startup:
# cat /etc/default/monitorix
OPTIONS="-s equalsign"
UPD: The problem seems to be the colon character (:) combined with spaces: by-path works fine without extra options, and by-uuid and by-partuuid, whose names contain no colon, work with further options.
So, for me this configuration works:
<list>
0 = /dev/disk/by-uuid/887a2b16-5c24-47a8-8650-90fd7e6fc19d -d sat
1 = /dev/disk/by-path/pci-0000:00:14.1-ata-1.1
2 = "/dev/disk/by-uuid/9fc5ac7e-6660-4e7f-a069-c171e5ce7675 -d cciss,0", "/dev/disk/by-uuid/9fc5ac7e-6660-4e7f-a069-c171e5ce7675 -d cciss,1"
3 = "/dev/disk/by-uuid/6f5a533b-ac42-4151-b467-d55d1cdd8075 -d cciss,2", "/dev/disk/by-uuid/6f5a533b-ac42-4151-b467-d55d1cdd8075 -d cciss,3"
</list>
Can you check that?
The problem here is that the colon character has a special meaning (it is a separator) when creating the graph with RRDtool.
Since I'm not sure I can fix this from within Monitorix, I'd recommend avoiding it as much as possible.
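For context, rrdtool graph arguments use the colon to separate their fields, and a literal colon inside a legend has to be written as a backslash followed by a colon. The sketch below only shows what that escaping looks like for a by-path name; the function name escape_colons is made up, and whether Monitorix could apply such escaping everywhere it passes device names to RRDtool is exactly the open question here.

```shell
# Hypothetical helper: escape every colon with a backslash, the form
# RRDtool expects for literal colons inside legend text.
escape_colons() {
    printf '%s\n' "$1" | sed 's/:/\\:/g'
}

escape_colons 'pci-0000:02:00.0-scsi-0:1:0:2'
# prints: pci-0000\:02\:00.0-scsi-0\:1\:0\:2
```

A by-uuid name contains no colons, so it passes through unchanged, which is consistent with the by-uuid configuration working.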
Can that be fixed by just extending the list of separators?
The thing is that it is much better to use one approach to indicate drives, in your case by-path.
Can that be fixed by just extending the list of separators?
Monitorix uses the Config::General Perl module, which accepts the -SplitPolicy option, but Monitorix does not use the -SplitDelimiter option, which only applies when -SplitPolicy is set to custom.
You might want to modify your /usr/bin/monitorix and monitorix.cgi files, at lines 519 and 591, and 258 and 271 respectively, by adding the -SplitDelimiter option with a regular expression that fits your case.
Let me know if that works for you and, if so, I'll add the custom value to Monitorix's -s option.
Hi! I have an HP P410i RAID controller with 2 RAID1 arrays: SSD (/dev/sda 2500GB disks) and HDD (/dev/sdb 22TB disks). I tried to put them into the config file to monitor the temperature of each drive:
So, this is not working, as Monitorix doesn't recognise the devices /dev/disk/by-path/pci-0000:02:00.0-scsi-0:1:0:0 -d cciss,N
This is working, except mapping:
So, is it possible to fix both issues?
Additionally, is it possible to add support for HP RAID controllers by using the HP-specific utilities ssacli or hpacucli to monitor the state?