Closed tomekNZ closed 5 years ago
Could you add "set -x" as the second line in the rrd.sh script, and then run it manually (copy and paste the command and arguments from the cron job) and post the output?
Closing this as no response.
I had this same issue. I hit it with a hammer though, since I didn't know what else to do :) I changed:
DevTemp=$(/usr/local/sbin/smartctl -a /dev/$i | awk '/194 Temperature_Celsius/{print $0}' | awk '{print $10}');
to:
if [[ $i == ada* ]]; then
  DevTemp=$(/usr/local/sbin/smartctl -a /dev/$i | awk '/194 Temperature_Celsius/{print $0}' | awk '{print $10}');
fi
if [[ $i == da* ]]; then
  DevTemp=$(/usr/local/sbin/smartctl -a /dev/$i | awk '/Current Drive Temperature/{print $0}' | awk '{print $4}');
fi
It seems to work fine. I have no idea if that's the right thing to do, but temperatures are reported differently on SAS drives, so this is my workaround.
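For anyone reading along, the same idea can be written as one small function that picks the awk pattern from the device name. This is only a sketch, not what rrd.sh actually does: the function name temp_from_smartctl is made up for illustration, and it assumes FreeBSD-style names where SATA disks are ada* and SAS disks are da*. It reads smartctl output on stdin so it can be tried without real hardware.

```shell
#!/bin/sh
# Hypothetical helper (not part of rrd.sh): print the temperature found in
# `smartctl -a` output read from stdin, choosing the parser by device name.
temp_from_smartctl() {
    case "$1" in
        ada*)
            # SATA: attribute 194 row, raw value is the 10th field
            awk '/194 Temperature_Celsius/ {print $10}'
            ;;
        da*)
            # SAS/SCSI: "Current Drive Temperature:     48 C", value is the 4th field
            awk '/Current Drive Temperature/ {print $4}'
            ;;
        *)
            # Unknown device type: discard input and report failure
            cat >/dev/null
            return 1
            ;;
    esac
}
```

In the script's loop it would be called roughly as: DevTemp=$(/usr/local/sbin/smartctl -a "/dev/$i" | temp_from_smartctl "$i")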
Thanks for the report. I'll see about adding this to the script.
I think this should be working with the latest code. Please re-open or file a new issue if that's not the case.
I just tried it out. It doesn't seem to work, because SAS drives don't return SMART values in the same way as SATA drives do.
Output from smartctl -a /dev/da0:
=== START OF INFORMATION SECTION ===
Vendor:               IBM-ESXS
Product:              ST3000NM0023
Revision:             BC5B
Compliance:           SPC-4
User Capacity:        3,000,592,982,016 bytes [3.00 TB]
Logical block size:   512 bytes
LU is fully provisioned
Rotation Rate:        7200 rpm
Form Factor:          3.5 inches
Logical Unit id:      0x5000c50062d34d13
Serial number:        Z1Z6DEYY0000C5186Z4N
Device type:          disk
Transport protocol:   SAS (SPL-3)
Local Time is:        Wed Mar 28 03:49:15 2018 PDT
SMART support is:     Available - device has SMART capability.
SMART support is:     Enabled
Temperature Warning:  Enabled

=== START OF READ SMART DATA SECTION ===
SMART Health Status: OK

Current Drive Temperature:     48 C
Drive Trip Temperature:        65 C

Elements in grown defect list: 0

Vendor (Seagate) cache information
  Blocks sent to initiator = 0

Vendor (Seagate/Hitachi) factory information
  number of hours powered up = 1903.63
  number of minutes until next internal SMART test = 46

Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:   2018029810        0         0  2018029810          0      12590.850           0
write:           0        0         0           0          0      47332.691           0
verify:  210728676        0         0   210728676          0         61.686           0

Non-medium error count:       24

SMART Self-test log
Num  Test              Status                 segment  LifeTime  LBA_first_err [SK ASC ASQ]
     Description                              number   (hours)

Long (extended) Self Test duration: 26000 seconds [433.3 minutes]
Thanks for the follow-up. At the moment I don't have time to do much work on this tool and the string parsing, but am happy to review and merge a pull-request if you want to try.
I won't pretend to be a programmer, so that would probably end badly :) If you don't have any SAS drives, I'll be happy to test for you when you get the time.
I have created a workaround for my SAS disks; you might want to take a look at it.
Inserted on line 36:
else
DevTemp=$(/usr/local/sbin/smartctl -a /dev/"${i}" | grep "Current Drive Temperature:" |awk '{print $4}')
if [ -n "$DevTemp" ]; then
drivedevs="${drivedevs} ${i}"
fi
I also added it to get_tempuratures ():
else
DevTemp=$(/usr/local/sbin/smartctl -a /dev/"${i}" | grep "Current Drive Temperature:" |awk '{print $4}')
if [ -n "$DevTemp" ]; then
data="${data}${sep}${DevTemp}"
fi
The "Current Drive Temperature:" line seems to be common to a lot of SAS disks. I'm not a programmer but I know my way around scripts, so I hope this is good enough for you to incorporate into the script.
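The fallback described above (try the SATA attribute first, otherwise look for the SAS line) can also be done in a single pass that doesn't care about the device name. This is a rough sketch, not the code from the pull request: parse_drive_temp is a hypothetical name, and it only knows the two output layouts shown in this thread.

```shell
#!/bin/sh
# Hypothetical helper (not the PR code): read `smartctl -a` output on stdin
# and print the drive temperature from whichever known layout is present.
parse_drive_temp() {
    awk '
        /194 Temperature_Celsius/    { sata = $10 }  # SATA attribute table row
        /Current Drive Temperature:/ { sas = $4 }    # SAS/SCSI health section
        END {
            if (sata != "")      print sata
            else if (sas != "")  print sas
            else                 exit 1              # no temperature line found
        }
    '
}
```

A drive with neither line prints nothing and returns a non-zero status, so the caller can skip it the same way the [ -n "$DevTemp" ] check does.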
Created a pull request for it
Fixed by https://github.com/seren/freenas-temperature-graphing/pull/20. Thanks for the PR, COW-Koetje.
Hi there. I've installed the tool and it is working great, with one exception: I have three 12Gb/s SAS drives connected to an LSI SAS controller which are not showing on the temp graph. Can they be added?