desbma / hddfancontrol

Regulate fan speed according to hard drive temperature
GNU General Public License v3.0

Spindown not working since update to Ubuntu 20.10 #28

Closed fightforlife closed 3 years ago

fightforlife commented 3 years ago

Hi desbma,

Since I updated my host OS to Ubuntu 20.10, hddfancontrol no longer spins down my hard disks. (Dockerfile: https://github.com/fightforlife/docker_hddfancontrol/blob/313b51ce636978750240eb0e002f43edf50d2c4b/Dockerfile)

The log does not show any errors. Manual spindown with hdparm -y works just fine, and the drives stay spun down (both in the host OS and inside Docker).

Is there some way to get more output?

Parameters: -d /dev/sdc /dev/sdd -p /sys/class/hwmon/hwmon2/pwm1 /sys/class/hwmon/hwmon2/pwm2 --pwm-start-value 70 80 --pwm-stop-value 20 30 --min-fan-speed-prct 0 -i 60 --spin-down-time 180 --smartctl -v debug

Debug Log:

2020-11-08T10:39:56.731151989Z INFO [Main] Process real time scheduler set to 2, priority 49
2020-11-08T10:39:56.848071716Z WARNING [sdc ST33000651AS] Drive does not support HGST temp query
2020-11-08T10:39:56.940258317Z WARNING [sdd WDC WD15EADS-00S2B0] Drive does not support HGST temp query
2020-11-08T10:39:56.995973068Z DEBUG [sdc ST33000651AS] Drive state: ACTIVE_IDLE
2020-11-08T10:39:56.999278401Z DEBUG [sdd WDC WD15EADS-00S2B0] Drive state: ACTIVE_IDLE
2020-11-08T10:39:57.000891149Z DEBUG [DriveSpinDownThread-sdc ST33000651AS] Sleeping for 60 seconds
2020-11-08T10:39:57.001012863Z DEBUG [DriveSpinDownThread-sdd WDC WD15EADS-00S2B0] Sleeping for 60 seconds
2020-11-08T10:39:57.050839349Z DEBUG [sdc ST33000651AS] Drive state: ACTIVE_IDLE
2020-11-08T10:39:57.096365638Z DEBUG [sdc ST33000651AS] Drive temperature: 37 °C
2020-11-08T10:39:57.100590998Z DEBUG [sdd WDC WD15EADS-00S2B0] Drive state: ACTIVE_IDLE
2020-11-08T10:39:57.146842706Z DEBUG [sdd WDC WD15EADS-00S2B0] Drive temperature: 36 °C
2020-11-08T10:39:57.146893428Z INFO [Main] Maximum device temperature: 37 °C
2020-11-08T10:39:57.150316318Z INFO [Fan #1] Setting fan speed to 35%
2020-11-08T10:39:57.150368511Z DEBUG [Fan #1] Setting PWM value to 102
2020-11-08T10:39:57.150378507Z INFO [Fan #2] Setting fan speed to 35%
2020-11-08T10:39:57.150388706Z DEBUG [Fan #2] Setting PWM value to 108
2020-11-08T10:39:57.150397758Z DEBUG [Main] Sleeping for 20 seconds
2020-11-08T10:40:17.166299454Z DEBUG [sdc ST33000651AS] Drive state: ACTIVE_IDLE
2020-11-08T10:40:17.209172546Z DEBUG [sdc ST33000651AS] Drive temperature: 37 °C
2020-11-08T10:40:17.213295188Z DEBUG [sdd WDC WD15EADS-00S2B0] Drive state: ACTIVE_IDLE
2020-11-08T10:40:17.262557576Z DEBUG [sdd WDC WD15EADS-00S2B0] Drive temperature: 36 °C
2020-11-08T10:40:17.262610002Z INFO [Main] Maximum device temperature: 37 °C
2020-11-08T10:40:17.262622207Z DEBUG [Main] Sleeping for 20 seconds
2020-11-08T10:40:37.278619532Z DEBUG [sdc ST33000651AS] Drive state: ACTIVE_IDLE
2020-11-08T10:40:37.321303580Z DEBUG [sdc ST33000651AS] Drive temperature: 37 °C
2020-11-08T10:40:37.324306156Z DEBUG [sdd WDC WD15EADS-00S2B0] Drive state: ACTIVE_IDLE
2020-11-08T10:40:37.370055518Z DEBUG [sdd WDC WD15EADS-00S2B0] Drive temperature: 36 °C
2020-11-08T10:40:37.370216202Z INFO [Main] Maximum device temperature: 37 °C
2020-11-08T10:40:37.370267444Z DEBUG [Main] Sleeping for 20 seconds
2020-11-08T10:40:57.002550986Z DEBUG [DriveSpinDownThread-sdc ST33000651AS] Drive is active
2020-11-08T10:40:57.010515385Z DEBUG [DriveSpinDownThread-sdd WDC WD15EADS-00S2B0] Drive is active
2020-11-08T10:40:57.011671810Z DEBUG [sdc ST33000651AS] Drive state: ACTIVE_IDLE
2020-11-08T10:40:57.012796707Z DEBUG [DriveSpinDownThread-sdc ST33000651AS] Sleeping for 60 seconds
2020-11-08T10:40:57.013114231Z DEBUG [sdd WDC WD15EADS-00S2B0] Drive state: ACTIVE_IDLE
2020-11-08T10:40:57.013552008Z DEBUG [DriveSpinDownThread-sdd WDC WD15EADS-00S2B0] Sleeping for 60 seconds
2020-11-08T10:40:57.386533805Z DEBUG [sdc ST33000651AS] Drive state: ACTIVE_IDLE
2020-11-08T10:40:57.429378139Z DEBUG [sdc ST33000651AS] Drive temperature: 37 °C
2020-11-08T10:40:57.433537964Z DEBUG [sdd WDC WD15EADS-00S2B0] Drive state: ACTIVE_IDLE
2020-11-08T10:40:57.479923172Z DEBUG [sdd WDC WD15EADS-00S2B0] Drive temperature: 36 °C
2020-11-08T10:40:57.479985949Z INFO [Main] Maximum device temperature: 37 °C
2020-11-08T10:40:57.479996614Z DEBUG [Main] Sleeping for 20 seconds
2020-11-08T10:41:17.494335707Z DEBUG [sdc ST33000651AS] Drive state: ACTIVE_IDLE
2020-11-08T10:41:17.540009525Z DEBUG [sdc ST33000651AS] Drive temperature: 37 °C
2020-11-08T10:41:17.542856317Z DEBUG [sdd WDC WD15EADS-00S2B0] Drive state: ACTIVE_IDLE
2020-11-08T10:41:17.588344939Z DEBUG [sdd WDC WD15EADS-00S2B0] Drive temperature: 36 °C
2020-11-08T10:41:17.588397845Z INFO [Main] Maximum device temperature: 37 °C
2020-11-08T10:41:17.588408453Z DEBUG [Main] Sleeping for 20 seconds
2020-11-08T10:41:37.604338328Z DEBUG [sdc ST33000651AS] Drive state: ACTIVE_IDLE
2020-11-08T10:41:37.645913512Z DEBUG [sdc ST33000651AS] Drive temperature: 37 °C
2020-11-08T10:41:37.650122791Z DEBUG [sdd WDC WD15EADS-00S2B0] Drive state: ACTIVE_IDLE
2020-11-08T10:41:37.695545958Z DEBUG [sdd WDC WD15EADS-00S2B0] Drive temperature: 36 °C
2020-11-08T10:41:37.695618693Z INFO [Main] Maximum device temperature: 37 °C
2020-11-08T10:41:37.695630065Z DEBUG [Main] Sleeping for 20 seconds
2020-11-08T10:41:57.017528667Z DEBUG [DriveSpinDownThread-sdc ST33000651AS] Drive is active
2020-11-08T10:41:57.022557259Z DEBUG [DriveSpinDownThread-sdd WDC WD15EADS-00S2B0] Drive is active
2020-11-08T10:41:57.024487588Z DEBUG [sdc ST33000651AS] Drive state: ACTIVE_IDLE
2020-11-08T10:41:57.025177994Z DEBUG [DriveSpinDownThread-sdc ST33000651AS] Sleeping for 60 seconds
2020-11-08T10:41:57.025529048Z DEBUG [sdd WDC WD15EADS-00S2B0] Drive state: ACTIVE_IDLE
2020-11-08T10:41:57.025918993Z DEBUG [DriveSpinDownThread-sdd WDC WD15EADS-00S2B0] Sleeping for 60 seconds
2020-11-08T10:41:57.711081198Z DEBUG [sdc ST33000651AS] Drive state: ACTIVE_IDLE
2020-11-08T10:41:57.755596706Z DEBUG [sdc ST33000651AS] Drive temperature: 37 °C
2020-11-08T10:41:57.760228818Z DEBUG [sdd WDC WD15EADS-00S2B0] Drive state: ACTIVE_IDLE
2020-11-08T10:41:57.812040079Z DEBUG [sdd WDC WD15EADS-00S2B0] Drive temperature: 36 °C
2020-11-08T10:41:57.812093218Z INFO [Main] Maximum device temperature: 37 °C
2020-11-08T10:41:57.812128192Z DEBUG [Main] Sleeping for 20 seconds
2020-11-08T10:42:17.827433946Z DEBUG [sdc ST33000651AS] Drive state: ACTIVE_IDLE
2020-11-08T10:42:17.871233521Z DEBUG [sdc ST33000651AS] Drive temperature: 37 °C
2020-11-08T10:42:17.875410947Z DEBUG [sdd WDC WD15EADS-00S2B0] Drive state: ACTIVE_IDLE
2020-11-08T10:42:17.921954770Z DEBUG [sdd WDC WD15EADS-00S2B0] Drive temperature: 36 °C
2020-11-08T10:42:17.922012979Z INFO [Main] Maximum device temperature: 37 °C
2020-11-08T10:42:17.922030171Z DEBUG [Main] Sleeping for 20 seconds
2020-11-08T10:42:37.937348955Z DEBUG [sdc ST33000651AS] Drive state: ACTIVE_IDLE
2020-11-08T10:42:37.982335401Z DEBUG [sdc ST33000651AS] Drive temperature: 37 °C
2020-11-08T10:42:37.986362314Z DEBUG [sdd WDC WD15EADS-00S2B0] Drive state: ACTIVE_IDLE
2020-11-08T10:42:38.031648208Z DEBUG [sdd WDC WD15EADS-00S2B0] Drive temperature: 36 °C
2020-11-08T10:42:38.031702726Z INFO [Main] Maximum device temperature: 37 °C
2020-11-08T10:42:38.031732274Z DEBUG [Main] Sleeping for 20 seconds
2020-11-08T10:42:57.125461762Z DEBUG [DriveSpinDownThread-sdc ST33000651AS] Drive is active
2020-11-08T10:42:57.126775099Z DEBUG [DriveSpinDownThread-sdd WDC WD15EADS-00S2B0] Drive is active
2020-11-08T10:42:57.158600710Z DEBUG [sdd WDC WD15EADS-00S2B0] Drive state: ACTIVE_IDLE
2020-11-08T10:42:57.159159541Z DEBUG [sdc ST33000651AS] Drive state: ACTIVE_IDLE
2020-11-08T10:42:57.159506961Z DEBUG [DriveSpinDownThread-sdd WDC WD15EADS-00S2B0] Sleeping for 60 seconds
2020-11-08T10:42:57.160001432Z DEBUG [DriveSpinDownThread-sdc ST33000651AS] Sleeping for 60 seconds
2020-11-08T10:42:58.044300222Z DEBUG [sdc ST33000651AS] Drive state: ACTIVE_IDLE
2020-11-08T10:42:58.219541783Z DEBUG [sdc ST33000651AS] Drive temperature: 37 °C
2020-11-08T10:42:58.223678002Z DEBUG [sdd WDC WD15EADS-00S2B0] Drive state: ACTIVE_IDLE
2020-11-08T10:42:58.270037850Z DEBUG [sdd WDC WD15EADS-00S2B0] Drive temperature: 36 °C
2020-11-08T10:42:58.270091506Z INFO [Main] Maximum device temperature: 37 °C
2020-11-08T10:42:58.270102250Z DEBUG [Main] Sleeping for 20 seconds
2020-11-08T10:43:18.286115931Z DEBUG [sdc ST33000651AS] Drive state: ACTIVE_IDLE
2020-11-08T10:43:18.328918165Z DEBUG [sdc ST33000651AS] Drive temperature: 37 °C
2020-11-08T10:43:18.333178947Z DEBUG [sdd WDC WD15EADS-00S2B0] Drive state: ACTIVE_IDLE
2020-11-08T10:43:18.383999320Z DEBUG [sdd WDC WD15EADS-00S2B0] Drive temperature: 36 °C
2020-11-08T10:43:18.384053644Z INFO [Main] Maximum device temperature: 37 °C
2020-11-08T10:43:18.384095347Z DEBUG [Main] Sleeping for 20 seconds
2020-11-08T10:43:38.398341754Z DEBUG [sdc ST33000651AS] Drive state: ACTIVE_IDLE
2020-11-08T10:43:38.439907461Z DEBUG [sdc ST33000651AS] Drive temperature: 37 °C
2020-11-08T10:43:38.444119871Z DEBUG [sdd WDC WD15EADS-00S2B0] Drive state: ACTIVE_IDLE
2020-11-08T10:43:38.489937755Z DEBUG [sdd WDC WD15EADS-00S2B0] Drive temperature: 36 °C
2020-11-08T10:43:38.489989819Z INFO [Main] Maximum device temperature: 37 °C
2020-11-08T10:43:38.490000508Z DEBUG [Main] Sleeping for 20 seconds
2020-11-08T10:43:57.162268468Z DEBUG [DriveSpinDownThread-sdd WDC WD15EADS-00S2B0] Drive is active
2020-11-08T10:43:57.165824395Z DEBUG [DriveSpinDownThread-sdc ST33000651AS] Drive is active
2020-11-08T10:43:57.169624687Z DEBUG [sdd WDC WD15EADS-00S2B0] Drive state: ACTIVE_IDLE
2020-11-08T10:43:57.170313269Z DEBUG [DriveSpinDownThread-sdd WDC WD15EADS-00S2B0] Sleeping for 60 seconds
2020-11-08T10:43:57.173826270Z DEBUG [sdc ST33000651AS] Drive state: ACTIVE_IDLE
2020-11-08T10:43:57.174268871Z DEBUG [DriveSpinDownThread-sdc ST33000651AS] Sleeping for 60 seconds
2020-11-08T10:43:58.505701236Z DEBUG [sdc ST33000651AS] Drive state: ACTIVE_IDLE
2020-11-08T10:43:58.546551434Z DEBUG [sdc ST33000651AS] Drive temperature: 37 °C
2020-11-08T10:43:58.550497684Z DEBUG [sdd WDC WD15EADS-00S2B0] Drive state: ACTIVE_IDLE
2020-11-08T10:43:58.596268992Z DEBUG [sdd WDC WD15EADS-00S2B0] Drive temperature: 36 °C
2020-11-08T10:43:58.596321248Z INFO [Main] Maximum device temperature: 37 °C
2020-11-08T10:43:58.596399255Z DEBUG [Main] Sleeping for 20 seconds
2020-11-08T10:44:18.611862898Z DEBUG [sdc ST33000651AS] Drive state: ACTIVE_IDLE
2020-11-08T10:44:18.654560644Z DEBUG [sdc ST33000651AS] Drive temperature: 37 °C
2020-11-08T10:44:18.658752464Z DEBUG [sdd WDC WD15EADS-00S2B0] Drive state: ACTIVE_IDLE
2020-11-08T10:44:18.704310244Z DEBUG [sdd WDC WD15EADS-00S2B0] Drive temperature: 36 °C
2020-11-08T10:44:18.704393669Z INFO [Main] Maximum device temperature: 37 °C
2020-11-08T10:44:18.704410420Z DEBUG [Main] Sleeping for 20 seconds
2020-11-08T10:44:38.719626899Z DEBUG [sdc ST33000651AS] Drive state: ACTIVE_IDLE
2020-11-08T10:44:38.761283282Z DEBUG [sdc ST33000651AS] Drive temperature: 37 °C
2020-11-08T10:44:38.765449625Z DEBUG [sdd WDC WD15EADS-00S2B0] Drive state: ACTIVE_IDLE
2020-11-08T10:44:38.815031694Z DEBUG [sdd WDC WD15EADS-00S2B0] Drive temperature: 36 °C
2020-11-08T10:44:38.815085218Z INFO [Main] Maximum device temperature: 37 °C
2020-11-08T10:44:38.815095838Z DEBUG [Main] Sleeping for 20 seconds
2020-11-08T10:44:57.171245950Z DEBUG [DriveSpinDownThread-sdd WDC WD15EADS-00S2B0] Drive is active
2020-11-08T10:44:57.175732628Z DEBUG [DriveSpinDownThread-sdc ST33000651AS] Drive is active
2020-11-08T10:44:57.176429613Z DEBUG [sdd WDC WD15EADS-00S2B0] Drive state: ACTIVE_IDLE
2020-11-08T10:44:57.178522467Z DEBUG [DriveSpinDownThread-sdd WDC WD15EADS-00S2B0] Sleeping for 60 seconds
2020-11-08T10:44:57.182420082Z DEBUG [sdc ST33000651AS] Drive state: ACTIVE_IDLE
2020-11-08T10:44:57.182740874Z DEBUG [DriveSpinDownThread-sdc ST33000651AS] Sleeping for 60 seconds
2020-11-08T10:44:58.830845596Z DEBUG [sdc ST33000651AS] Drive state: ACTIVE_IDLE
2020-11-08T10:44:58.871228396Z DEBUG [sdc ST33000651AS] Drive temperature: 37 °C
2020-11-08T10:44:58.875362353Z DEBUG [sdd WDC WD15EADS-00S2B0] Drive state: ACTIVE_IDLE
2020-11-08T10:44:58.921146947Z DEBUG [sdd WDC WD15EADS-00S2B0] Drive temperature: 36 °C
2020-11-08T10:44:58.921201597Z INFO [Main] Maximum device temperature: 37 °C
2020-11-08T10:44:58.921241254Z DEBUG [Main] Sleeping for 60 seconds
2020-11-08T10:45:57.179920830Z DEBUG [DriveSpinDownThread-sdd WDC WD15EADS-00S2B0] Drive is active
2020-11-08T10:45:57.184412646Z DEBUG [DriveSpinDownThread-sdc ST33000651AS] Drive is active
2020-11-08T10:45:57.188353536Z DEBUG [sdd WDC WD15EADS-00S2B0] Drive state: ACTIVE_IDLE
2020-11-08T10:45:57.190924683Z DEBUG [DriveSpinDownThread-sdd WDC WD15EADS-00S2B0] Sleeping for 60 seconds
2020-11-08T10:45:57.191682027Z DEBUG [sdc ST33000651AS] Drive state: ACTIVE_IDLE
2020-11-08T10:45:57.191729845Z DEBUG [DriveSpinDownThread-sdc ST33000651AS] Sleeping for 60 seconds
2020-11-08T10:45:58.938542690Z DEBUG [sdc ST33000651AS] Drive state: ACTIVE_IDLE
2020-11-08T10:45:58.982111776Z DEBUG [sdc ST33000651AS] Drive temperature: 37 °C
2020-11-08T10:45:58.987065481Z DEBUG [sdd WDC WD15EADS-00S2B0] Drive state: ACTIVE_IDLE
2020-11-08T10:45:59.032751924Z DEBUG [sdd WDC WD15EADS-00S2B0] Drive temperature: 36 °C
2020-11-08T10:45:59.032805432Z INFO [Main] Maximum device temperature: 37 °C
2020-11-08T10:45:59.032821817Z DEBUG [Main] Sleeping for 60 seconds
2020-11-08T10:46:57.193877547Z DEBUG [DriveSpinDownThread-sdd WDC WD15EADS-00S2B0] Drive is active
2020-11-08T10:46:57.195583793Z DEBUG [DriveSpinDownThread-sdc ST33000651AS] Drive is active
2020-11-08T10:46:57.201502729Z DEBUG [sdd WDC WD15EADS-00S2B0] Drive state: ACTIVE_IDLE
2020-11-08T10:46:57.202126796Z DEBUG [sdc ST33000651AS] Drive state: ACTIVE_IDLE
2020-11-08T10:46:57.202329774Z DEBUG [DriveSpinDownThread-sdd WDC WD15EADS-00S2B0] Sleeping for 60 seconds
2020-11-08T10:46:57.202843739Z DEBUG [DriveSpinDownThread-sdc ST33000651AS] Sleeping for 60 seconds
2020-11-08T10:46:59.048316234Z DEBUG [sdc ST33000651AS] Drive state: ACTIVE_IDLE
2020-11-08T10:46:59.091168228Z DEBUG [sdc ST33000651AS] Drive temperature: 37 °C
2020-11-08T10:46:59.095283986Z DEBUG [sdd WDC WD15EADS-00S2B0] Drive state: ACTIVE_IDLE
2020-11-08T10:46:59.141204663Z DEBUG [sdd WDC WD15EADS-00S2B0] Drive temperature: 36 °C
2020-11-08T10:46:59.141258007Z INFO [Main] Maximum device temperature: 37 °C
2020-11-08T10:46:59.141282086Z DEBUG [Main] Sleeping for 60 seconds
2020-11-08T10:47:57.204133803Z DEBUG [DriveSpinDownThread-sdd WDC WD15EADS-00S2B0] Drive is active
2020-11-08T10:47:57.208861440Z DEBUG [DriveSpinDownThread-sdc ST33000651AS] Drive is active
2020-11-08T10:47:57.214029864Z DEBUG [sdd WDC WD15EADS-00S2B0] Drive state: ACTIVE_IDLE
2020-11-08T10:47:57.214562726Z DEBUG [DriveSpinDownThread-sdd WDC WD15EADS-00S2B0] Sleeping for 60 seconds
2020-11-08T10:47:57.217182541Z DEBUG [sdc ST33000651AS] Drive state: ACTIVE_IDLE
2020-11-08T10:47:57.217236404Z DEBUG [DriveSpinDownThread-sdc ST33000651AS] Sleeping for 60 seconds
2020-11-08T10:47:59.157235202Z DEBUG [sdc ST33000651AS] Drive state: ACTIVE_IDLE
2020-11-08T10:47:59.200105011Z DEBUG [sdc ST33000651AS] Drive temperature: 37 °C
2020-11-08T10:47:59.204054689Z DEBUG [sdd WDC WD15EADS-00S2B0] Drive state: ACTIVE_IDLE
2020-11-08T10:47:59.249254987Z DEBUG [sdd WDC WD15EADS-00S2B0] Drive temperature: 36 °C
2020-11-08T10:47:59.249319626Z INFO [Main] Maximum device temperature: 37 °C
2020-11-08T10:47:59.249332433Z DEBUG [Main] Sleeping for 60 seconds

fightforlife commented 3 years ago

I checked with dstat whether the drives are being accessed somehow. The only activity I can see is the temperature query from hddfancontrol every 60 seconds, which results in a read of about 1536 B.


desbma commented 3 years ago

Temperature query should not generate a block read. Are you sure nothing else is active on the drive?

If you stop hddfancontrol and run for example: cat /sys/block/sdX/stat && sleep 60s && cat /sys/block/sdX/stat (replace sdX with your drive) do the numbers change?

fightforlife commented 3 years ago

Here is some logging from my system. I am not entirely sure whether this might be correct behaviour, since I did not test this before updating Ubuntu (and the kernel). I always see this in the hddfancontrol logs, which suggests that something is accessing the disk every 60 seconds: DEBUG [DriveSpinDownThread-sdd WDC WD15EADS-00S2B0] Sleeping for 60 seconds

If I change the interval of hddfancontrol to -i 30, the message stays the same.

without hddfancontrol running:

root@server:~# cat /sys/block/sdd/stat && sleep 120s && cat /sys/block/sdd/stat
    1449      440    17462     3706        9        0        8       26        0     6524     3757        0        0        0        0        9       25
    1449      440    17462     3706        9        0        8       26        0     6524     3757        0        0        0        0        9       25


with hddfancontrol running:

root@server:~# cat /sys/block/sdd/stat && sleep 120s && cat /sys/block/sdd/stat
    1461      440    17469     3760        9        0        8       26        0     6612     3811        0        0        0        0        9       25
    1493      440    17487     3790        9        0        8       26        0     6724     3842        0        0        0        0        9       25
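The two snapshots above can be compared field by field rather than by eye. The following is a minimal Python sketch, not part of hddfancontrol: the helper names (parse_stat, stat_delta, read_stat) are made up for illustration, and the field names assume the kernel's documented layout of /sys/block/<dev>/stat (the first 11 of its 17 columns).

```python
from pathlib import Path

# Names for the first 11 columns of /sys/block/<dev>/stat
# (read side, write side, then queue/timing counters).
STAT_FIELDS = (
    "read_ios", "read_merges", "read_sectors", "read_ticks",
    "write_ios", "write_merges", "write_sectors", "write_ticks",
    "in_flight", "io_ticks", "time_in_queue",
)


def parse_stat(text):
    """Map the leading counters of one stat line to their field names."""
    values = [int(v) for v in text.split()]
    return dict(zip(STAT_FIELDS, values))


def stat_delta(before, after):
    """Return only the counters that changed between two stat lines."""
    a, b = parse_stat(before), parse_stat(after)
    return {k: b[k] - a[k] for k in STAT_FIELDS if b[k] != a[k]}


def read_stat(device):
    """Read the raw stat line for a device name such as 'sdd' (Linux sysfs only)."""
    return Path(f"/sys/block/{device}/stat").read_text()


# The two lines captured above while hddfancontrol was running.
BEFORE = "1461 440 17469 3760 9 0 8 26 0 6612 3811 0 0 0 0 9 25"
AFTER = "1493 440 17487 3790 9 0 8 26 0 6724 3842 0 0 0 0 9 25"

if __name__ == "__main__":
    for name, diff in stat_delta(BEFORE, AFTER).items():
        print(f"{name}: +{diff}")
```

For those two lines, only the read and timing counters move (32 read I/Os and 18 sectors over the 120 seconds); the write counters stay unchanged.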


root@server:~# btrace /dev/sdd
  8,48   0        1     0.000000000     0  C   N [0]
  8,48   1        2     0.028244561 169627  D   R 36 [smartctl]
  8,48   0        2     0.030686854     0  C   R [0]
  8,48   0        3     0.033403673     0  C   R [0]
  8,48   0        4     0.047476360     0  C   R [0]
  8,48   1        3     0.028405730    18  C   R [0]
  8,48   1        4     0.028449196 169627  D   R 512 [smartctl]
  8,48   1        5     0.030722189 169627  D   R 512 [smartctl]
  8,48   1        6     0.047315318 169627  D   R 512 [smartctl]
  8,48   2        1    19.571470432 169874  D   N 0 [hdparm]
  8,48   2        2    19.571543611     0  C   N [0]
  8,48   1        7    20.167686649 169890  D   N 0 [hdparm]
  8,48   1        8    20.194338328 169891  D   R 36 [smartctl]
  8,48   1        9    20.194394050    18  C   R [0]
  8,48   1       10    20.194446125 169891  D   R 512 [smartctl]
  8,48   0        5    20.167774548     0  C   N [0]
  8,48   0        6    20.196634341     0  C   R [0]
  8,48   0        7    20.199385451     0  C   R [0]
  8,48   1       11    20.196681188 169891  D   R 512 [smartctl]
  8,48   1       12    20.213916071 169891  D   R 512 [smartctl]
  8,48   0        8    20.214151593     0  C   R [0]
  8,48   0        9    40.330721033     0  C   N [0]
  8,48   1       13    40.330630653 170148  D   N 0 [hdparm]
  8,48   1       14    40.356608336 170149  D   R 36 [smartctl]
  8,48   1       15    40.356653048    18  C   R [0]
  8,48   1       16    40.356725513 170149  D   R 512 [smartctl]
  8,48   1       17    40.358984155 170149  D   R 512 [smartctl]
  8,48   1       18    40.375311307 170149  D   R 512 [smartctl]
  8,48   0       10    40.358926490     0  C   R [0]
  8,48   0       11    40.361370133     0  C   R [0]
  8,48   0       12    40.375447699 169355  C   R [0]
  8,48   0       13    60.500808114  8038  C   N [0]
  8,48   0       14    60.529797504     0  C   R [0]
  8,48   1       19    60.500747197 170415  D   N 0 [hdparm]
  8,48   1       20    60.527504746 170416  D   R 36 [smartctl]
  8,48   1       21    60.527569436    18  C   R [0]
  8,48   1       22    60.527629047 170416  D   R 512 [smartctl]
  8,48   1       23    60.529836574 170416  D   R 512 [smartctl]
  8,48   0       15    60.532001304     0  C   R [0]
  8,48   0       16    60.545938362     0  C   R [0]
  8,48   1       24    60.545784720 170416  D   R 512 [smartctl]
  8,48   2        3    79.579052953 170650  D   N 0 [hdparm]
  8,48   2        4    79.579134188  3912  C   N [0]
  8,48   0       17    80.671866943     0  C   N [0]
  8,48   0       18    80.701097117     0  C   R [0]
  8,48   0       19    80.703486162     0  C   R [0]
  8,48   1       25    80.671777456 170675  D   N 0 [hdparm]
  8,48   1       26    80.698668687 170676  D   R 36 [smartctl]
  8,48   1       27    80.698748570    18  C   R [0]
  8,48   1       28    80.698808516 170676  D   R 512 [smartctl]
  8,48   1       29    80.701295648 170676  D   R 512 [smartctl]
  8,48   1       30    80.717469180 170676  D   R 512 [smartctl]
  8,48   0       20    80.717605918 169355  C   R [0]
  8,48   0       21   100.832863461     0  C   N [0]
  8,48   1       31   100.832774925 170935  D   N 0 [hdparm]
  8,48   1       32   100.858927528 170936  D   R 36 [smartctl]
  8,48   1       33   100.859013905    18  C   R [0]
  8,48   1       34   100.859052413 170936  D   R 512 [smartctl]
  8,48   1       35   100.861279256 170936  D   R 512 [smartctl]
  8,48   1       36   100.877603613 170936  D   R 512 [smartctl]
  8,48   0       22   100.861234274     0  C   R [0]
  8,48   0       23   100.863422897     0  C   R [0]
  8,48   0       24   100.877727331 169356  C   R [0]
  8,48   0       25   121.007660035  2554  C   N [0]
  8,48   1       37   121.007601521 171203  D   N 0 [hdparm]
  8,48   1       38   121.034472111 171204  D   R 36 [smartctl]
  8,48   1       39   121.034503352    18  C   R [0]
  8,48   1       40   121.034558205 171204  D   R 512 [smartctl]
  8,48   1       41   121.036786299 171204  D   R 512 [smartctl]
  8,48   0       26   121.036741414     0  C   R [0]
  8,48   0       27   121.038983742     0  C   R [0]
  8,48   0       28   121.052909237 169356  C   R [0]
  8,48   1       42   121.052781713 171204  D   R 512 [smartctl]
fightforlife commented 3 years ago

Another note: If I put the drives manually to standby it is correctly reflected in the hddfancontrol logs and the fans are turned off. Also the drives are not woken by hddfancontrol.

hdparm -y /dev/sdd
DEBUG [DriveSpinDownThread-sdb ST4000VN008-2DR166] Drive is already sleeping

desbma commented 3 years ago

What about: sudo bash -c 'cat /sys/block/sdd/stat && hddtemp /dev/sdd && cat /sys/block/sdd/stat' ?

fightforlife commented 3 years ago

Screenshots from several runs of sudo bash -c 'cat /sys/block/sdd/stat && hddtemp /dev/sdd && cat /sys/block/sdd/stat': [image] [image]

Running hddfancontrol without the --smartctl option:

root@server:~# btrace /dev/sdd
  8,48   0        1     0.000000000     0  C   N [0]
  8,48   1        2     0.003990779 3145379  D   R 36 [hddtemp]
  8,48   0        2     0.006420585     0  C   R [0]
  8,48   0        3     0.009143825     0  C   R [0]
  8,48   0        4     0.010301443  1428  C   N [0]
  8,48   1        3     0.004075500    18  C   R [0]
  8,48   1        4     0.004104563 3145379  D   R 512 [hddtemp]
  8,48   1        5     0.006587399 3145379  D   R 512 [hddtemp]
  8,48   1        6     0.010245333 3145379  D   N 0 [hddtemp]
  8,48   1        7     0.010328115 3145379  D   N 0 [hddtemp]
  8,48   0        5     0.077530156     0  C   N [0]
  8,48   0        6     0.080543971     0  C   R [0]
  8,48   1        8     0.077689622 3145379  D   R 512 [hddtemp]
  8,48   0        7    20.207323736     0  C   N [0]
  8,48   0        8    20.213504910     0  C   R [0]
  8,48   0        9    20.216026674 19555  C   R [0]
  8,48   0       10    20.216978815 19555  C   N [0]
  8,48   1        9    20.207230009 3145620  D   N 0 [hdparm]
  8,48   1       10    20.211256334 3145621  D   R 36 [hddtemp]
  8,48   1       11    20.211293309    18  C   R [0]
  8,48   1       12    20.211318473 3145621  D   R 512 [hddtemp]
  8,48   1       13    20.213526871 3145621  D   R 512 [hddtemp]
  8,48   1       14    20.216927578 3145621  D   N 0 [hddtemp]
  8,48   1       15    20.217001193 3145621  D   N 0 [hddtemp]
  8,48   0       11    20.288205074     0  C   N [0]
  8,48   1       16    20.288342937 3145621  D   R 512 [hddtemp]
  8,48   0       12    20.291186482     0  C   R [0]
  8,48   0       13    39.659237469 3145814  D   N 0 [hdparm]
  8,48   0       14    39.659343031 3144966  C   N [0]
  8,48   0       15    40.414804672     0  C   N [0]
  8,48   0       16    40.420914075     0  C   R [0]
  8,48   1       17    40.414711634 3145827  D   N 0 [hdparm]
  8,48   1       18    40.418557569 3145828  D   R 36 [hddtemp]
  8,48   1       19    40.418594349    18  C   R [0]
  8,48   1       20    40.418619302 3145828  D   R 512 [hddtemp]
  8,48   1       21    40.421068537 3145828  D   R 512 [hddtemp]
  8,48   1       22    40.424376384 3145828  D   N 0 [hddtemp]
  8,48   1       23    40.424495107 3145828  D   N 0 [hddtemp]
  8,48   0       17    40.423254745     0  C   R [0]
  8,48   0       18    40.424465017     0  C   N [0]
  8,48   0       19    40.498814242     0  C   N [0]
  8,48   0       20    40.501761538     0  C   R [0]
  8,48   1       24    40.498960431 3145828  D   R 512 [hddtemp]
  8,48   0       21    60.622425465     0  C   N [0]
  8,48   0       22    60.628529334 2307245  C   R [0]
  8,48   0       23    60.631081816 2306458  C   R [0]
  8,48   0       24    60.632297066     0  C   N [0]
  8,48   1       25    60.622331134 3146010  D   N 0 [hdparm]
  8,48   1       26    60.626305016 3146011  D   R 36 [hddtemp]
  8,48   1       27    60.626342886    18  C   R [0]
  8,48   1       28    60.626368543 3146011  D   R 512 [hddtemp]
  8,48   1       29    60.628552626 3146011  D   R 512 [hddtemp]
  8,48   1       30    60.632208437 3146011  D   N 0 [hddtemp]
  8,48   1       31    60.632328925 3146011  D   N 0 [hddtemp]
  8,48   0       25    60.698264729     0  C   N [0]
  8,48   0       26    60.701114564     0  C   R [0]
  8,48   1       32    60.698374321 3146011  D   R 512 [hddtemp]
  8,48   0       27    80.821881661 3131762  C   N [0]
  8,48   0       28    80.828157622     0  C   R [0]
  8,48   0       29    80.831130928     0  C   R [0]
  8,48   0       30    80.832047528 3131750  C   N [0]
  8,48   1       33    80.821819985 3146248  D   N 0 [hdparm]
  8,48   1       34    80.825849798 3146249  D   R 36 [hddtemp]
  8,48   1       35    80.825894612    18  C   R [0]
  8,48   1       36    80.825921007 3146249  D   R 512 [hddtemp]
  8,48   1       37    80.828270805 3146249  D   R 512 [hddtemp]
  8,48   1       38    80.831992228 3146249  D   N 0 [hddtemp]
  8,48   1       39    80.832073449 3146249  D   N 0 [hddtemp]
  8,48   0       31    80.897761485     0  C   N [0]
  8,48   0       32    80.900751409     0  C   R [0]
  8,48   1       40    80.897895132 3146249  D   R 512 [hddtemp]
desbma commented 3 years ago

If you did not have hddfancontrol running while executing sudo bash -c 'cat /sys/block/sdd/stat && hddtemp /dev/sdd && cat /sys/block/sdd/stat' and had that output, it means something is generating periodic disk reads which would prevent the drive from spinning down.

That something can either be:

fightforlife commented 3 years ago

In the comment above, I entered the commands roughly every 5 seconds; that is why the activity looks periodic. I can produce such a spike with every hddtemp or smartctl request. The last btrace output (with hddfancontrol running) shows access by hddtemp; the btrace output before that shows access by smartctl.

So my guess is not that something changed in smartctl or hddtemp, but in the way /sys/block/*/stat works.

I will try to investigate this further.
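To see which process issues which requests, the btrace output can be tallied per process instead of read line by line. The following Python sketch is only an illustration: the function name count_dispatches is hypothetical, and it assumes the default column order seen in the traces in this thread (device, CPU, sequence, timestamp, PID, action, RWBS flags, then payload and process name). The sample is one probe cycle copied from the hddtemp trace.

```python
from collections import Counter

# One probe cycle copied from the hddtemp trace posted in this thread.
SAMPLE = """\
  8,48   1        9    20.207230009 3145620  D   N 0 [hdparm]
  8,48   1       10    20.211256334 3145621  D   R 36 [hddtemp]
  8,48   1       11    20.211293309    18  C   R [0]
  8,48   1       12    20.211318473 3145621  D   R 512 [hddtemp]
  8,48   1       13    20.213526871 3145621  D   R 512 [hddtemp]
  8,48   1       14    20.216927578 3145621  D   N 0 [hddtemp]
  8,48   1       15    20.217001193 3145621  D   N 0 [hddtemp]
  8,48   1       16    20.288342937 3145621  D   R 512 [hddtemp]
"""


def count_dispatches(trace_text):
    """Count dispatched ('D') requests per (process, RWBS flags) pair."""
    counts = Counter()
    for line in trace_text.splitlines():
        parts = line.split()
        # Assumed columns: device, cpu, sequence, time, pid, action, rwbs, ...
        if len(parts) < 8 or parts[5] != "D":
            continue  # skip completions ('C') and anything malformed
        process = parts[-1].strip("[]")
        counts[(process, parts[6])] += 1
    return counts


if __name__ == "__main__":
    for (process, rwbs), n in sorted(count_dispatches(SAMPLE).items()):
        print(f"{process} {rwbs}: {n}")
```

Completion lines are skipped because they carry no process name, so only the dispatch side is attributed to hdparm, hddtemp, or smartctl.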

fightforlife commented 3 years ago

I am currently using https://github.com/docker-hotio/docker-hd-idle for spindown instead of hddfancontrol and it is working fine. After multiple tests I am a hundred percent sure that the following commands are causing the small read spikes.

hddtemp /dev/sd*
smartctl -a /dev/sd* | grep Temp | cut -d" " -f 2,37

There were some changes in the block layer between kernel 5.4 and 5.8, but I cannot currently say whether this could be a cause.

mindrunner commented 3 years ago

I have the same issue after upgrading my Arch system a few days ago. I am using the LTS kernel, which got upgraded to 5.10 (I guess it was 5.4 before).

mindrunner commented 3 years ago

Can confirm that calling smartctl prevents the idle timer from increasing :(

desbma commented 3 years ago

Indeed my main NAS is also on Arch, and the 5.10 LTS kernel update broke the auto spin down.

If anyone has an idea on how to probe for disk activity while avoiding the hddtemp/smartctl interference, I'm open to new approaches. Until then, there is not much I can do to fix this.

desbma commented 3 years ago

Well actually I have an idea that might work with little changes.

desbma commented 3 years ago

@fightforlife @mindrunner Can you please post the exact output of sudo bash -c 'cat /sys/block/sdX/stat && hddtemp /dev/sdX && cat /sys/block/sdX/stat' (replace sdX with a hard disk drive)?

EDIT: The command should be run when nothing is generating activity on the drive.

fightforlife commented 3 years ago

root@server:~# bash -c 'cat /sys/block/sdc/stat && hddtemp /dev/sdc && cat /sys/block/sdc/stat'
 3368700    14189 1667115340 13385382   115556    13373 138957136  4019257        0  8704136 17488349        0        0        0        0     3009    83709
/dev/sdc: ST33000651AS: 31°C
 3368705    14189 1667115343 13385427   115556    13373 138957136  4019257        0  8704188 17488394        0        0        0        0     3009    83709
root@server:~# bash -c 'cat /sys/block/sdc/stat && hddtemp /dev/sdc && cat /sys/block/sdc/stat'
 3368705    14189 1667115343 13385427   115556    13373 138957136  4019257        0  8704188 17488394        0        0        0        0     3009    83709
/dev/sdc: ST33000651AS: 31°C
 3368710    14189 1667115346 13385473   115556    13373 138957136  4019257        0  8704240 17488440        0        0        0        0     3009    83709
root@server:~# bash -c 'cat /sys/block/sdc/stat && hddtemp /dev/sdc && cat /sys/block/sdc/stat'
 3368710    14189 1667115346 13385473   115556    13373 138957136  4019257        0  8704240 17488440        0        0        0        0     3009    83709
/dev/sdc: ST33000651AS: 31°C
 3368715    14189 1667115349 13385531   115556    13373 138957136  4019257        0  8704304 17488498        0        0        0        0     3009    83709
root@server:~# bash -c 'cat /sys/block/sdc/stat && hddtemp /dev/sdc && cat /sys/block/sdc/stat'
 3368715    14189 1667115349 13385531   115556    13373 138957136  4019257        0  8704304 17488498        0        0        0        0     3009    83709
/dev/sdc: ST33000651AS: 31°C
 3368720    14189 1667115352 13385576   115556    13373 138957136  4019257        0  8704352 17488543        0        0        0        0     3009    83709
root@server:~# bash -c 'cat /sys/block/sdc/stat && hddtemp /dev/sdc && cat /sys/block/sdc/stat'
 3368720    14189 1667115352 13385576   115556    13373 138957136  4019257        0  8704352 17488543        0        0        0        0     3009    83709
/dev/sdc: ST33000651AS: 31°C
 3368725    14189 1667115355 13385622   115556    13373 138957136  4019257        0  8704404 17488589        0        0        0        0     3009    83709
root@server:~# bash -c 'cat /sys/block/sdc/stat && hddtemp /dev/sdc && cat /sys/block/sdc/stat'
 3368725    14189 1667115355 13385622   115556    13373 138957136  4019257        0  8704404 17488589        0        0        0        0     3009    83709
/dev/sdc: ST33000651AS: 31°C
 3368730    14189 1667115358 13385665   115556    13373 138957136  4019257        0  8704452 17488632        0        0        0        0     3009    83709
root@server:~#

desbma commented 3 years ago

I am currently testing a fix on the master branch, and it now seems to detect inactivity properly.

desbma commented 3 years ago

Fix released in version 1.4.0.

Inactive drives should be properly detected, unless smartctl is used for temperature probing (not the default) and SCT temp query is not supported.

mindrunner commented 3 years ago

Does anyone know if this issue is being handled upstream somewhere? I have not tested the workaround yet. Unfortunately, I monitor the system temps with smartctl, which also prevents drive spindown with more recent kernels. Downgraded to 5.4 for now :(

desbma commented 3 years ago

@mindrunner
It should work fine if hddfancontrol is probing temperature with any of the following methods:

In those cases, probing generates a fixed and predictable number of read operations (in my experience, with an admittedly limited number of drive models), so hddfancontrol can count the temperature probes, compute the expected counter difference, and so detect drive inactivity reliably. However, if you explicitly choose to use smartctl for probing AND your drive does not support SCT temp probing, then all SMART attributes are read to try to extract the temperature. That generates a highly variable number of read operations depending on the exact drive model, so hddfancontrol cannot reliably detect drive activity/inactivity and trigger spin down accordingly.
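
As a rough illustration of the counting approach described above (a sketch, not hddfancontrol's actual code): the first and fifth fields of /sys/block/<dev>/stat are the completed read and write I/O counts, and a drive can be treated as idle when the only change between two samples is the reads expected from the probes. The function names and the `reads_per_probe` parameter are hypothetical. The value 5 matches the hddtemp transcript earlier in this thread for one drive, and will likely differ for other drives or probing tools.

```python
from typing import NamedTuple


class IoCounters(NamedTuple):
    """Completed read and write I/O counts taken from a stat line."""
    reads: int
    writes: int


def parse_stat_line(line: str) -> IoCounters:
    """Parse a /sys/block/<dev>/stat line: field 1 is reads, field 5 is writes."""
    fields = line.split()
    return IoCounters(reads=int(fields[0]), writes=int(fields[4]))


def is_idle(before: IoCounters, after: IoCounters,
            probe_count: int, reads_per_probe: int) -> bool:
    """Return True if the counter change is fully explained by temperature probes."""
    expected_reads = probe_count * reads_per_probe  # assumed fixed cost per probe
    extra_reads = (after.reads - before.reads) - expected_reads
    extra_writes = after.writes - before.writes
    return extra_reads <= 0 and extra_writes == 0


# Leading fields of two stat lines from the transcript above (remaining fields omitted).
before = parse_stat_line("3368715 14189 1667115349 13385531 115556")
after = parse_stat_line("3368720 14189 1667115352 13385576 115556")
print(is_idle(before, after, probe_count=1, reads_per_probe=5))  # True
```

When the number of reads per probe is not constant (the smartctl without SCT case), `expected_reads` cannot be computed, which is why this kind of check stops being reliable there.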

mindrunner commented 3 years ago

Thanks for the detailed information. On top of hddfancontrol, I regularly run some scripts that probe drive temperature, idle state and SMART data for monitoring purposes.

Before the kernel update, this was working fine in combination with hddfancontrol. (drives spinning down). Will running those scripts prevent hddfancontrol spinning down the drives now?

desbma commented 3 years ago

Will running those scripts prevent hddfancontrol spinning down the drives now?

Most likely, yes.

mindrunner commented 3 years ago

:(

I know this is somewhat off topic for hddfancontrol, but I really consider this a bug/regression, since it worked perfectly fine before. Does anyone know which part of the kernel is responsible for this? Maybe we can continue the discussion over there?

desbma commented 3 years ago

Does anyone know which part of the kernel is responsible for this? Maybe we can continue the discussion over there?

It should be related to the block layer. You can try to contact the maintainers of that part of the kernel, and/or try to track the commit that caused this change. I personally don't have enough time to do this, but I'd be curious to learn the rationale for the change.

mindrunner commented 3 years ago

I was browsing the kernel bugzilla yesterday, but could not find anything helpful. I guess I would need to bisect the kernel sources to figure out. In https://github.com/adelolmo/hd-idle/issues/38 someone suggests using partitions instead of device names to query activity, to get around this issue. Would it be possible to run hddfancontrol with HDDFANCONTROL_ARGS="-d /dev/sda1 /dev/sdb1" instead of HDDFANCONTROL_ARGS="-d /dev/sda /dev/sdb"? Or does that not make any sense?

desbma commented 3 years ago

Stat counters are per block device in the kernel, not per partition, and probing the partition device will probe the same device in effect, so I don't think that could allow you to work around this issue.

desbma commented 3 years ago

I guess I would need to bisect the kernel sources to figure out.

That would be the way to go, if you can automate the test to check if a commit is affected or not. You are also more likely to get attention from a kernel developer if you have precise information about the commit that caused the regression.
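
One possible shape for such an automated check, as a sketch only: it assumes the regression shows up as a single temperature probe changing the stat counters of an otherwise idle drive, as in the hddtemp transcript earlier in this thread. The device name `sdc`, the helper names, and the use of hddtemp are assumptions. The exit codes are the ones `git bisect run` understands (0 for good, 1 for bad, 125 for "cannot test this commit").

```python
import subprocess
from typing import Callable

GOOD, BAD, SKIP = 0, 1, 125  # exit codes understood by `git bisect run`


def read_stat(device: str = "sdc") -> str:
    """Return the raw counters line for a block device (device name is an assumption)."""
    with open(f"/sys/block/{device}/stat") as f:
        return f.read().strip()


def run_hddtemp(device: str = "sdc") -> None:
    """Probe the temperature once, discarding the output."""
    subprocess.run(["hddtemp", f"/dev/{device}"], check=True,
                   stdout=subprocess.DEVNULL)


def bisect_verdict(read: Callable[[], str], probe: Callable[[], None]) -> int:
    """Return BAD if a probe alone changes the counters, GOOD if it does not."""
    try:
        before = read()
        probe()
        after = read()
    except (OSError, subprocess.CalledProcessError):
        return SKIP  # device or tool unavailable on this build
    return BAD if before != after else GOOD
```

On a booted candidate kernel, a wrapper script would end with `sys.exit(bisect_verdict(read_stat, run_hddtemp))`. The drive must have no other I/O during the check, otherwise unrelated activity is reported as BAD.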

mindrunner commented 3 years ago

Stat counters are per block device in the kernel, not per partition, and probing the partition device will probe the same device in effect, so I don't think that could allow you to work around this issue.

Looking at https://github.com/adelolmo, I cannot confirm your statement. The most recent version is able to detect my drives being idle on the most recent kernel versions, even though hddfancontrol and other periodic smartctl checks are enabled. Unfortunately, hddfancontrol is not able to spin down the drives on my system. So for now I need both hddfancontrol for controlling the HDD fans and hd-idle for disk spin-down when idle.

desbma commented 3 years ago

You are right and I was wrong, there are indeed per partition counters, and temperature probing affects only the device counters.

However passing partitions can be confusing because it can lead to spinning down a drive that is currently active on another partition. I guess supporting partitions, but printing a warning would be a good compromise.
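
To make the device/partition distinction concrete, here is a minimal sketch (not the code from the commit below) of how a tool could tell a partition from a whole disk through sysfs and pick the counters file to watch. It relies on two sysfs conventions: a partition's directory contains a `partition` file, and /sys/class/block/<name> resolves to a directory nested under its parent disk. The function names and the `sysfs_root` parameter are hypothetical.

```python
import os

SYSFS_BLOCK = "/sys/class/block"


def is_partition(dev_path: str, sysfs_root: str = SYSFS_BLOCK) -> bool:
    """A partition's sysfs directory has a 'partition' file; a whole disk's does not."""
    name = os.path.basename(dev_path)
    return os.path.isfile(os.path.join(sysfs_root, name, "partition"))


def parent_device(dev_path: str, sysfs_root: str = SYSFS_BLOCK) -> str:
    """Return the whole-disk device for a partition, or dev_path itself otherwise."""
    if not is_partition(dev_path, sysfs_root):
        return dev_path
    # e.g. /sys/class/block/sda1 resolves to .../block/sda/sda1
    real_dir = os.path.realpath(os.path.join(sysfs_root, os.path.basename(dev_path)))
    disk_name = os.path.basename(os.path.dirname(real_dir))
    return os.path.join(os.path.dirname(dev_path), disk_name)


def activity_stat_path(dev_path: str, sysfs_root: str = SYSFS_BLOCK) -> str:
    """Counters file for exactly the device or partition that was passed in."""
    return os.path.join(sysfs_root, os.path.basename(dev_path), "stat")
```

With `/dev/sda1`, activity would be read from the partition's own counters, while temperature probing and spin down would go to `parent_device("/dev/sda1")`. The caveat above still applies: activity on another partition of the same disk is invisible in `sda1`'s counters.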

desbma commented 3 years ago

@mindrunner Please test 9f4a0513528d5bb4a625c9792299e76a49d109ce

mindrunner commented 3 years ago

I have never run this from source, and I do not see an hddfancontrol-git package on Arch. Can you give me brief instructions on how to test? Can I just do a pip install and then run it from the src-dir?

desbma commented 3 years ago

The safest way to test without affecting the rest of your system would be:

python3 -m venv ~/venv-hddfancontrol
. ~/venv-hddfancontrol/bin/activate
git clone https://github.com/desbma/hddfancontrol/ ~/hddfancontrol
cd ~/hddfancontrol
pip3 install -r requirements.txt
python3 setup.py install
sudo ~/venv-hddfancontrol/bin/hddfancontrol ...

Once you are done, you can remove ~/venv-hddfancontrol and ~/hddfancontrol, nothing else has been changed on your system.

mindrunner commented 3 years ago

Awesome, will give it a go.

Probably worth adding that to the readme ;)

mindrunner commented 3 years ago

confirming that:

sudo ~/venv-hddfancontrol/bin/hddfancontrol -d /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1 /dev/sdf1 -p /sys/class/hwmon/hwmon3/device/pwm2 /sys/class/hwmon/hwmon3/device/pwm3 --pwm-start-value 89 89 --pwm-stop-value 84 84 --min-fan-speed-prct 0 -i 60 --min-temp 35 --max-temp 70 --spin-down-time 1800

indeed spins down the drives after 20 minutes, whereas

sudo ~/venv-hddfancontrol/bin/hddfancontrol -d /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf -p /sys/class/hwmon/hwmon3/device/pwm2 /sys/class/hwmon/hwmon3/device/pwm3 --pwm-start-value 89 89 --pwm-stop-value 84 84 --min-fan-speed-prct 0 -i 60 --min-temp 35 --max-temp 70 --spin-down-time 1800

does not. So it definitely makes a difference and would solve the issue.

Also, seeing those warnings:

2021-04-07 15:14:24,399 WARNING [sda WDC WD60EFRX-xxx] Drive does not support HGST temp query
2021-04-07 15:14:24,468 WARNING [sda WDC WD60EFRX-xxx] '/dev/sda1' is a partition, parent device '/dev/sda' will be used except for activity stats
2021-04-07 15:14:24,569 WARNING [sdb WDC WD60EFRX-xxx] Drive does not support HGST temp query
2021-04-07 15:14:24,624 WARNING [sdb WDC WD60EFRX-xxx] '/dev/sdb1' is a partition, parent device '/dev/sdb' will be used except for activity stats
2021-04-07 15:14:24,719 WARNING [sdc WDC WD60EFRX-xxx] Drive does not support HGST temp query
2021-04-07 15:14:24,779 WARNING [sdc WDC WD60EFRX-xxx] '/dev/sdc1' is a partition, parent device '/dev/sdc' will be used except for activity stats
2021-04-07 15:14:24,879 WARNING [sdd WDC WD60EFRX-xxx] Drive does not support HGST temp query
2021-04-07 15:14:24,948 WARNING [sdd WDC WD60EFRX-xxx] '/dev/sdd1' is a partition, parent device '/dev/sdd' will be used except for activity stats
2021-04-07 15:14:25,059 WARNING [sde WDC WD60EFRX-xxx] Drive does not support HGST temp query
2021-04-07 15:14:25,115 WARNING [sde WDC WD60EFRX-xxx] '/dev/sde1' is a partition, parent device '/dev/sde' will be used except for activity stats
2021-04-07 15:14:25,179 WARNING [sdf WDC WD60EFRX-xxx] Drive does not support HGST temp query
2021-04-07 15:14:25,285 WARNING [sdf WDC WD60EFRX-xxx] '/dev/sdf1' is a partition, parent device '/dev/sdf' will be used except for activity stats

desbma commented 3 years ago

Probably worth adding that to the readme ;)

I could add that to the README... just as I could explain how to install Python, partition a hard drive or buy a computer. These instructions are not specific to this project.

Anyway thanks for confirming the fix, I will make a new release soon.

mindrunner commented 3 years ago

Over many years I have spent a lot of time figuring out how to build and run projects I was supposed to work on without proper documentation. IMHO, documentation is as important as clean and readable code, and a how-to section that helps a newbie developer get started is a very nice thing to have, even if it is as straightforward as it is in this case.

Just my 2 cents.

Thanks for fixing.

desbma commented 3 years ago

@mindrunner I agree with you that documentation is important, which is why there is a section in the README on how to install from source.

The snippet I gave you is how to install hddfancontrol as non root, alongside an existing installation. There is no way I can document every single possible installation use case, especially if this is nothing specific to this project, which by the way targets technical users.

Technology works by stacking solutions and hiding complexity behind abstractions. This project relies on Python, Linux, x86, and probably thousands of things I don't even know of. It is not feasible to document everything.

mindrunner commented 3 years ago

ha, I was pretty sure I checked and didn't see any instructions. Thus, my question about how to install from source... My bad! Sorry for the misunderstanding :)