Meliox / PVE-mods

Proxmox modifications

Help Identifying temps pulled... #19

Closed by bearhntr 6 months ago

bearhntr commented 9 months ago

Thanks for the script. Your README mentions the drivetemp kernel module but gives no real instructions for making sure it is loaded; I had installed lm-sensors but was not seeing anything that looked like HDD temps. My PVE box has one NVMe drive on a card in a PCIe slot and an SSD attached to a SATA port on the motherboard.

After doing the steps here:
image

I am seeing this - but how do I know which is which for HDDs? Also - what are these other temps?

image

Meliox commented 9 months ago

To read HDD temperatures you need to use hwmon, which can be accessed using sensors when the drivetemp module is loaded.
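One way to confirm the module is actually loaded is to look for it in /proc/modules. The sketch below is only an illustration: the function name module_loaded and its optional file argument are made up for this example, and a driver compiled directly into the kernel would not appear in that file at all.

```shell
#!/bin/sh
# module_loaded NAME [FILE]
# Succeeds if NAME appears as the first field of a line in FILE
# (default: /proc/modules). Function name and FILE argument are hypothetical.
module_loaded() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }' "${2:-/proc/modules}"
}

if module_loaded drivetemp; then
    echo "drivetemp is loaded"
else
    echo "drivetemp is not loaded (try: modprobe drivetemp)"
fi
```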

sudo apt install lm-sensors
sudo modprobe drivetemp
sensors

To make drivetemp load automatically every boot, add it to /etc/modules as follows:

echo drivetemp | sudo tee -a /etc/modules
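Running that command more than once appends a duplicate line each time. A minimal sketch of a variant that only appends when the entry is missing is shown here; add_module_once and its optional file argument are illustrative names, and writing to the real /etc/modules needs root.

```shell
#!/bin/sh
# add_module_once MODULE [FILE]
# Appends MODULE to FILE (default: /etc/modules) only if no line already
# matches it exactly. Function name and FILE argument are hypothetical.
add_module_once() {
    module="$1"
    file="${2:-/etc/modules}"
    # -q: quiet, -x: match the whole line, -F: fixed string (no regex)
    if ! grep -qxF "$module" "$file" 2>/dev/null; then
        printf '%s\n' "$module" >> "$file"
    fi
}
```

As root, `add_module_once drivetemp` would then be safe to repeat.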

To identify drives, you could run sudo smartctl -A /dev/sdX (X being your drive letter) and grep for the temperature.

The ISA adapter entry is your CPU. acpitz is the temperature sensor near or on your CPU socket; this sensor can be unreliable. drivetemp-scsi is an HDD or SSD.

I do not see NVMe mentioned, but maybe the PCIe card simply makes it look like an SSD/HDD.
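The smartctl suggestion above can be sketched as a small filter. This assumes the classic ten-column ATA attribute table that smartctl -A prints for SATA drives; attribute names vary by vendor, NVMe drives use a different layout, and the function name smart_temps is made up for the example.

```shell
#!/bin/sh
# smart_temps: read `smartctl -A` output on stdin and print
# "<attribute name> <raw value>" for every temperature attribute.
# Assumes the ATA attribute table layout (raw value in column 10).
smart_temps() {
    awk '$1 ~ /^[0-9]+$/ && $2 ~ /Temperature/ { print $2, $10 }'
}
```

Usage would look like `sudo smartctl -A /dev/sda | smart_temps`, repeated per drive and compared against the values that sensors reports.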

bearhntr commented 9 months ago

Thanks for the response - see my replies below.

To read HDD temperatures you need to use hwmon, which can be accessed using sensors when the drivetemp module is loaded.

This does not install. Proxmox is basically Debian at its core. I tried apt install hwmon and apt-get install hwmon; both fail.

sudo apt install lm-sensors
sudo modprobe drivetemp
sensors

To make drivetemp load automatically every boot, add it to /etc/modules as follows:

I tried this and got the following (it does not appear that I need 'sudo'): image

echo drivetemp | sudo tee -a /etc/modules

To identify drives, you could run sudo smartctl -A /dev/sdX (X being your drive letter) and grep for the temperature.

The ISA adapter entry is your CPU. acpitz is the temperature sensor near or on your CPU socket; this sensor can be unreliable. drivetemp-scsi is an HDD or SSD.

I do not see NVMe mentioned, but maybe the PCIe card simply makes it look like an SSD/HDD.

sensors is telling me 2 HDDs

drivetemp-scsi-2-0
Adapter: SCSI adapter
temp1:        +22.0°C

drivetemp-scsi-2-0
Adapter: SCSI adapter
temp1:        +22.0°C

Now to figure out how to get those to show properly in the Proxmox dashboard.
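Because sensors labels drives by SCSI address rather than by /dev name, it can help to walk sysfs and pair each drivetemp sensor with its disk. The following is a sketch under the assumption that each drivetemp hwmon directory has a device/block/ subdirectory naming the disk and a temp1_input file in millidegrees Celsius; list_drivetemp and its optional root argument are illustrative.

```shell
#!/bin/sh
# list_drivetemp [HWMON_ROOT]
# Prints "<hwmon dir> <block device> <temperature in whole degrees C>"
# for each drivetemp sensor found under HWMON_ROOT (default: /sys/class/hwmon).
list_drivetemp() {
    root="${1:-/sys/class/hwmon}"
    for dir in "$root"/hwmon*; do
        [ -r "$dir/name" ] || continue
        [ "$(cat "$dir/name")" = "drivetemp" ] || continue
        [ -r "$dir/temp1_input" ] || continue
        # Assumption: the sensor's parent device exposes block/<disk>.
        disk="$(ls "$dir/device/block" 2>/dev/null | head -n 1)"
        # temp1_input is in millidegrees Celsius.
        milli="$(cat "$dir/temp1_input")"
        echo "$(basename "$dir") ${disk:-unknown} $((milli / 1000))"
    done
}

list_drivetemp
```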

bearhntr commented 9 months ago

I also just tried to run the script to see what would happen. Apparently, all I get is a flashing cursor until I press CTRL+C to get out:

image

I am also noticing that when I finally cancel the .sh, a file is created in /root called '{thermalstate}"' (I am guessing it is a file; I still cannot figure out why files and folders are not different colors). When I open the file, there is nothing in it. Also, the /root/backup folder does not appear to get created.

Meliox commented 9 months ago

Since sensors already show the HDD, I don't believe you need to install more.

You only need to run wget https://raw.githubusercontent.com/Meliox/PVE-mods/main/pve-mod-gui-temp.sh

Notice that your command contains quite a bit more than that.

bearhntr commented 9 months ago

wget https://raw.githubusercontent.com/Meliox/PVE-mods/main/pve-mod-gui-temp.sh

I get this:

image

...it looks to have saved a file ( ) in the root folder.

The command that I was using - I have used numerous times to pull down a .sh file and RUN it with one command, does that not work with your .sh files? How would I then run the .sh file which looks to be here now (I am sorry - if I seem 'daff', but I am not a Linux guru, and trying to learn as I go):

image
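For context on the question above: plain wget only saves the file, so nothing runs until the saved file is handed to a shell. The sketch below uses a throwaway stand-in script rather than the real download; the file contents and the argument are hypothetical, and the arguments the real script expects should be taken from its README.

```shell
#!/bin/sh
# Stand-in for a script saved by wget; name and contents are hypothetical.
script="$(mktemp)"
cat > "$script" <<'EOF'
#!/bin/sh
echo "script ran with argument: ${1:-none}"
EOF

# Hand the saved file to an interpreter explicitly. No execute bit is needed,
# and anything after the file name is passed to the script as arguments.
sh "$script" first
```

The same pattern applies to a downloaded file: name the interpreter, then the file, then any arguments.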

bearhntr commented 9 months ago

OK - I think .... MAYBE ... I figured it out.

image

I had already used the steps here to add a CPU temp. Should I 'un-do' these to use your script?
[https://www.reddit.com/r/homelab/comments/rhq56e/displaying_cpu_temperature_in_proxmox_summery_in/]

Meliox commented 9 months ago

Should be working now. Open the Proxmox URL and force a refresh (Ctrl+F5) or clear the cache.

bearhntr commented 9 months ago

Should be working now. Open the Proxmox URL and force a refresh (Ctrl+F5) or clear the cache.

I undid all the changes I had made manually and rebooted. The dashboard was back to its out-of-the-box state. I then executed your .sh file and, lo and behold, I got temps showing.

image

❓ Is it possible to make the window where all the data is displayed - into the full width of the window, and the graphs below it? Kinda like this (this is a created image using photoshop - to show desired view) -- Maybe even add stuff or stretch things out?

image

Meliox commented 9 months ago

Well anything is possible and I do agree your mockup does look better

bearhntr commented 9 months ago

Well anything is possible and I do agree your mockup does look better

I wish I understood more of the programming language used (whatever it is). I can kind of read the code and figure out what most of it is (see image). I believe this is the start of the information box that displays all the information. Given that the file I am editing ends in .js, I would think it is JavaScript, but I am not 100% sure.

Near the top it mentions 'minHeight:' - but, I see nothing which gives a width. No idea what 'flex:' and 'padding:' are or do either.

From my old days writing HTML, seems like there is a line needed to set the 'box' to be 100% (so that it adjusts as the window adjusts - if it is full-screen or in a window you want it to be full-width) - problem is, no idea how to call that. 😥

image

Meliox commented 9 months ago

If you want to play with it, you can get help from ChatGPT :)

eremem commented 8 months ago

@bearhntr Feel free to check the latest version from https://github.com/Meliox/PVE-mods/blob/integration-eremem/pve-mod-gui-temp.sh. It's not quite what you suggested but I hope you find it more readable especially when the summary is shown with 2/3-column layout.

@Meliox what do you think of this new layout?

image

bearhntr commented 8 months ago

@eremem

@bearhntr Feel free to check the latest version from https://github.com/Meliox/PVE-mods/blob/integration-eremem/pve-mod-gui-temp.sh. It's not quite what you suggested but I hope you find it more readable especially when the summary is shown with 2/3-column layout.

This is amazing work, thanks. I just installed it on one of my PVE systems, which has a Core i5-3475S (old Dell Optiplex 7010). I took out the DVD drive, put a cover there, and it has 2 Kingston SSDs (250GB each). I am thinking that 'sensors' is seeing the 2 ACPI entries as those (I cannot think of anything else it would be seeing), and the PCI card is my M.2 WiFi/BT card in a PCIe converter card. Guessing that is why the SSD temps are not shown?

image

*Notice there are no DRIVES listed:

image

Meliox commented 8 months ago

@bearhntr Feel free to check the latest version from https://github.com/Meliox/PVE-mods/blob/integration-eremem/pve-mod-gui-temp.sh. It's not quite what you suggested but I hope you find it more readable especially when the summary is shown with 2/3-column layout.

@Meliox what do you think of this new layout?

image

Looks great. I would put nvme and hdd on a new row, aligned with the cpu temp for consistency. So also have more than 4 drives :o

I was considering adding fan speeds as well, but for some reason sensors is not reporting mine.

bearhntr commented 8 months ago

You guys are amazing. I wish I could read the code better and understand it all. I still think that the 'info' box (my second image) should be 100% width and the graphs in a 2x2 underneath it.

Meliox commented 8 months ago

@eremem

@bearhntr Feel free to check the latest version from https://github.com/Meliox/PVE-mods/blob/integration-eremem/pve-mod-gui-temp.sh. It's not quite what you suggested but I hope you find it more readable especially when the summary is shown with 2/3-column layout.

This is amazing work, thanks. I just installed it on one of my PVE systems, which has a Core i5-3475S (old Dell Optiplex 7010). I took out the DVD drive, put a cover there, and it has 2 Kingston SSDs (250GB each). I am thinking that 'sensors' is seeing the 2 ACPI entries as those (I cannot think of anything else it would be seeing), and the PCI card is my M.2 WiFi/BT card in a PCIe converter card. Guessing that is why the SSD temps are not shown?

image

*Notice there are no DRIVES listed:

image

Have you tried sensors-detect? Just hit 'enter' for all the questions?

bearhntr commented 8 months ago

Have you tried sensors-detect? Just hit 'enter' for all the questions?

NVM-- I typed that at the Terminal command, and chose all defaults.

It added to the /etc/modules file, and a reboot now gives (below):

Chip drivers

coretemp

no change image

Meliox commented 8 months ago

The "acpitz-acpi-0" sensor could represent a temperature reading from a specific component on the motherboard, such as the chipset, voltage regulators, or another critical component. However, the exact meaning of this sensor can vary depending on the hardware configuration and the interpretation provided by the motherboard's BIOS or firmware.

So, I guess you're right about that. A scenario that we had not thought about.

https://github.com/Meliox/PVE-mods/blob/integration-eremem/pve-mod-gui-temp.sh#L88C3-L88C63

If you modify that line from

if (echo "$sensorOutput" | grep -q "drivetemp-scsi-" ); then

to

if (echo "$sensorOutput" | grep -q "acpitz-acpi-" ); then

same for https://github.com/Meliox/PVE-mods/blob/f2bc76d1b3d06708de738c207bc99aa1e5966e4f/pve-mod-gui-temp.sh#L89

and https://github.com/Meliox/PVE-mods/blob/f2bc76d1b3d06708de738c207bc99aa1e5966e4f/pve-mod-gui-temp.sh#L271

Then you should have them visualised....

@eremem: Should we make this configurable using a variable? I am hesitant to add it permanently given the ambiguous usage of acpitz-acpi-0.
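A rough sketch of what such a variable could look like follows. DRIVE_SENSOR_PATTERN and has_drive_sensor are hypothetical names, not ones the script defines; the default mirrors the hard-coded "drivetemp-scsi-" string quoted above.

```shell
#!/bin/sh
# Hypothetical configuration variable; defaults to the string the script greps for today.
DRIVE_SENSOR_PATTERN="${DRIVE_SENSOR_PATTERN:-drivetemp-scsi-}"

# has_drive_sensor: succeeds when the sensors output on stdin contains
# the configured pattern. The variable is read at call time.
has_drive_sensor() {
    grep -q -- "$DRIVE_SENSOR_PATTERN"
}
```

A caller could then write `if echo "$sensorOutput" | has_drive_sensor; then ...` and switch to acpitz-acpi- by setting the variable, without editing three separate lines.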

bearhntr commented 8 months ago

I will use a BOOTABLE USB tool - which will give me a Windows way to see if I can figure out what they are.

bearhntr commented 8 months ago

Well the tool was of no real help - but when I looked at the temps for the 2x SSDs - they were 21°C and 23°C respectively. Thinking these must have been the drives...but still gonna do more investigating. As this tool also said that one drive had 2% life left, and the other 4%. LOL - these drives are about 2 years old...and one of them has been sitting in a box on my desk for about 9 months where I took it out of an old laptop that I donated.

eremem commented 8 months ago

You guys are amazing. I wish I could read the code better and understand it all. I still think that the 'info' box (my second image) should be 100% width and the graphs in a 2x2 underneath it.

@bearhntr This would be quite a challenge, since we would have to move this single box into a new component and inject it before the existing one. We would also have to check that it behaves well with multiple columns and with a forced column count set in the user settings. I've just had a look at the code and given your idea a try. I managed to modify the code so that the summary looks like your mock-up. I'll try to adjust the installation script in the next few days.

eremem commented 8 months ago

Looks great. I would put nvme and hdd on a new row, aligned with the cpu temp for consistency. So also have more than 4 drives :o

@Meliox Splitting a line would be a mess for more drives and multiple columns. This could be a good solution if we tried to implement the layout @bearhntr suggested. But as long as the box has to share the horizontal space with other boxes, the content could quickly get squeezed quite badly.

I was considering adding fan speeds as well, but for some reason sensors is not reporting mine.

I can't see them either. No idea if this is a problem with sensors or just a general hardware one.

eremem commented 8 months ago

The "acpitz-acpi-0" sensor could represent a temperature reading from a specific component on the motherboard, such as the chipset, voltage regulators, or another critical component. However, the exact meaning of this sensor can vary depending on the hardware configuration and the interpretation provided by the motherboard's BIOS or firmware.

So, I guess you're right about that. A scenario that we had not thought about.

@eremem: Should we make this configurable using a variable? I am hesitant to add it permanently given the ambiguous usage of acpitz-acpi-0.

@Meliox I wouldn't use it for storage/drive temperature display, but maybe as a separate new section, e.g. Misc. hardware components (or something shorter ;)). Then we could enable them (or even group some) via a configuration variable.
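As a sketch of the grouping idea only: chip names from sensors could be mapped to display groups by prefix, with a variable listing which groups are shown. The group names, the prefix list, ENABLED_GROUPS, and both function names are assumptions for illustration, not part of the script.

```shell
#!/bin/sh
# Hypothetical setting: space-separated list of groups to display.
ENABLED_GROUPS="${ENABLED_GROUPS:-cpu drive nvme}"

# sensor_group CHIP_NAME: print the display group for a sensors chip name.
sensor_group() {
    case "$1" in
        coretemp-*|k10temp-*) echo "cpu" ;;
        drivetemp-*)          echo "drive" ;;
        nvme-*)               echo "nvme" ;;
        *)                    echo "misc" ;;
    esac
}

# group_enabled GROUP: succeeds if GROUP is listed in ENABLED_GROUPS.
group_enabled() {
    case " $ENABLED_GROUPS " in
        *" $1 "*) return 0 ;;
        *)        return 1 ;;
    esac
}
```

With these defaults, acpitz-acpi-0 falls into "misc" and stays hidden until that group is added to the list.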

eremem commented 8 months ago

@bearhntr Please check the new version (https://github.com/Meliox/PVE-mods/blob/integration-eremem/pve-mod-gui-temp.sh).

@Meliox I checked your suggestion about putting the drives' temps on a single line, but I don't like the result: image. Additionally, as anticipated, the lines break badly on narrower screens with multiple temperatures.

Meliox commented 8 months ago

@eremem No you're right. Does not look good. Have both drive types on a new line and align temps to the right? Then we can make another "section" for others below as you suggested.

eremem commented 8 months ago

@eremem No you're right. Does not look good. Have both drive types on a new line and align temps to the right? Then we can make another "section" for others below as you suggested.

@Meliox Yes, this is the current state: image

This layout would indeed allow us to add further groups down below.