Draggon / hassio-hdd-tools

Home Assistant "sensor.hdd_temp" has badly formatted value. #3

Closed oester closed 3 years ago

oester commented 3 years ago

The sensor.hdd_temp value is currently formatted like "23/35)", which is confusing. The two numbers are the min/max temperatures detected, and the trailing ")" is irrelevant. Change the output capture so that it takes the third-from-last element of the line instead of the last:

pi@pi4-2:~ $ sudo /usr/sbin/smartctl -a /dev/sda | grep -i temp
194 Temperature_Celsius     0x0022   028   030   000    Old_age   Always       -       28 (Min/Max 28/30)
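
A minimal sketch of the suggested change, written in Python rather than in whatever language the add-on itself uses. The helper name raw_temperature is made up for illustration. It assumes the ten-column attribute row that smartctl prints (ID, name, flag, value, worst, thresh, type, updated, when_failed, raw value), so the raw value starts at the tenth whitespace-separated field; for the line above that is also the third field from the end.

```python
def raw_temperature(smart_line: str) -> int:
    """Return the first token of the RAW_VALUE column of a smartctl attribute row."""
    fields = smart_line.split()
    # Fields 0..8 are the nine fixed columns; field 9 is where RAW_VALUE begins.
    return int(fields[9])


line = ("194 Temperature_Celsius     0x0022   028   030   000    "
        "Old_age   Always       -       28 (Min/Max 28/30)")

print(line.split()[-1])       # the last element, which caused the odd value: 28/30)
print(raw_temperature(line))  # 28
```

Counting from the front rather than from the end should also cope with drives that print no "(Min/Max ...)" suffix after the temperature.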

carsten-h commented 3 years ago

Hello!

I also have this strange value:

[Tue Sep 29 09:51:00 CEST 2020][Info] Init run
[Tue Sep 29 09:51:01 CEST 2020][Info] Sensor value: 0)°
[Tue Sep 29 09:51:01 CEST 2020][Info] Sensor update response code: 201

boesing commented 3 years ago

I've created a PR, which has already been merged by @Draggon. You might check the README as you can now configure the property to read the attributes from. The temperature should be parsed properly even without changing the configuration.

carsten-h commented 3 years ago

The temperature should be parsed properly even without changing the configuration.

Yes, it is working now! Thank you very much.

You might check the README as you can now configure the property to read the attributes from.

I read it but I don't understand it. I don't see any properties besides the temperature in the log file. Maybe later...

boesing commented 3 years ago

Check the file which is configured as output_file (don't forget to activate debug). That file contains a JSON object with all the smartctl information. There is supposed to be one property which contains some attributes.

You can change the configuration in hass.io on the add-on page.

For me, for example, it was a property called nvme_..._informations.
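
A small sketch for spotting that property, assuming the debug output file holds a single JSON object. describe_top_level is a hypothetical helper, not part of the add-on, and the sample below is a shortened stand-in rather than a real report.

```python
import json


def describe_top_level(report: dict) -> dict:
    """Map each top-level key of a parsed report to the kind of value it holds."""
    kinds = {dict: "object", list: "list"}
    return {key: kinds.get(type(value), "scalar") for key, value in report.items()}


# Shortened stand-in for the contents of the output file.
sample = json.loads(
    '{"temperature": {"current": 36},'
    ' "power_cycle_count": 35,'
    ' "ata_smart_attributes": {"revision": 1, "table": []}}'
)

for key, kind in describe_top_level(sample).items():
    print(f"{key}: {kind}")
```

To run it on a real report, load the file that output_file points to with json.load and pass the result to the helper; the keys listed as "object" are the candidates to look at.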

carsten-h commented 3 years ago

check the file which is configured as output_file

That was the part I didn't read correctly. I don't have to look in the log of the integration; I have to look in this file, which is in /share/hdd_tools. Thank you, now it's clear!

carsten-h commented 3 years ago

It was only nearly clear.

My attributes are under "ata_smart_attributes".

There is one property I want to keep an eye on:

      {
        "id": 202,
        "name": "Percent_Lifetime_Remain",
        "value": 100,
        "worst": 100,
        "thresh": 1,
        "when_failed": "",
        "flags": {
          "value": 48,
          "string": "----CK ",
          "prefailure": false,
          "updated_online": false,
          "performance": false,
          "error_rate": false,
          "event_count": true,
          "auto_keep": true
        },
        "raw": {
          "value": 100,
          "string": "100"
        }
      },

What do I have to write into the configuration? Entering only "Percent_Lifetime_Remain" does not work.

boesing commented 3 years ago

Ah, I see. Could you post your file content here (feel free to strip out anything you don't want to expose) along with the list of attributes you want to have in your sensor?

The property you configure has to be a JSON object, as it is merged with the attributes of your sensor.

But if I understand your message correctly, your property ata_smart_attributes is a list of objects, right?

carsten-h commented 3 years ago

The complete file is this:

{
  "json_format_version": [
    1,
    0
  ],
  "smartctl": {
    "version": [
      7,
      1
    ],
    "svn_revision": "5022",
    "platform_info": "aarch64-linux-5.4.74-v8",
    "build_info": "(local build)",
    "argv": [
      "smartctl",
      "-a",
      "--json",
      "/dev/sda"
    ],
    "exit_status": 32
  },
  "device": {
    "name": "/dev/sda",
    "info_name": "/dev/sda [SAT]",
    "type": "sat",
    "protocol": "ATA"
  },
  "model_family": "Crucial/Micron BX/MX1/2/3/500, M5/600, 1100 SSDs",
  "model_name": "CT120BX500SSD1",
  "serial_number": "2xxxxxxx6",
  "wwn": {
    "naa": 0,
    "oui": 0,
    "id": 0
  },
  "firmware_version": "M6CR013",
  "user_capacity": {
    "blocks": 234441648,
    "bytes": 120034123776
  },
  "logical_block_size": 512,
  "physical_block_size": 512,
  "rotation_rate": 0,
  "form_factor": {
    "ata_value": 3,
    "name": "2.5 inches"
  },
  "in_smartctl_database": true,
  "ata_version": {
    "string": "ACS-2 T13/2015-D revision 3",
    "major_value": 1008,
    "minor_value": 272
  },
  "sata_version": {
    "string": "SATA 3.2",
    "value": 255
  },
  "interface_speed": {
    "max": {
      "sata_value": 14,
      "string": "6.0 Gb/s",
      "units_per_second": 60,
      "bits_per_unit": 100000000
    },
    "current": {
      "sata_value": 3,
      "string": "6.0 Gb/s",
      "units_per_second": 60,
      "bits_per_unit": 100000000
    }
  },
  "local_time": {
    "time_t": 1606639501,
    "asctime": "Sun Nov 29 09:45:01 2020 CET"
  },
  "smart_status": {
    "passed": true
  },
  "ata_smart_data": {
    "offline_data_collection": {
      "status": {
        "value": 0,
        "string": "was never started"
      },
      "completion_seconds": 120
    },
    "self_test": {
      "status": {
        "value": 0,
        "string": "completed without error",
        "passed": true
      },
      "polling_minutes": {
        "short": 2,
        "extended": 10
      }
    },
    "capabilities": {
      "values": [
        17,
        2
      ],
      "exec_offline_immediate_supported": true,
      "offline_is_aborted_upon_new_cmd": false,
      "offline_surface_scan_supported": false,
      "self_tests_supported": true,
      "conveyance_self_test_supported": false,
      "selective_self_test_supported": false,
      "attribute_autosave_enabled": false,
      "error_logging_supported": true,
      "gp_logging_supported": true
    }
  },
  "ata_smart_attributes": {
    "revision": 1,
    "table": [
      {
        "id": 1,
        "name": "Raw_Read_Error_Rate",
        "value": 100,
        "worst": 100,
        "thresh": 50,
        "when_failed": "",
        "flags": {
          "value": 47,
          "string": "POSR-K ",
          "prefailure": true,
          "updated_online": true,
          "performance": true,
          "error_rate": true,
          "event_count": false,
          "auto_keep": true
        },
        "raw": {
          "value": 0,
          "string": "0"
        }
      },
      {
        "id": 5,
        "name": "Reallocate_NAND_Blk_Cnt",
        "value": 100,
        "worst": 100,
        "thresh": 10,
        "when_failed": "",
        "flags": {
          "value": 50,
          "string": "-O--CK ",
          "prefailure": false,
          "updated_online": true,
          "performance": false,
          "error_rate": false,
          "event_count": true,
          "auto_keep": true
        },
        "raw": {
          "value": 0,
          "string": "0"
        }
      },
      {
        "id": 9,
        "name": "Power_On_Hours",
        "value": 100,
        "worst": 100,
        "thresh": 50,
        "when_failed": "",
        "flags": {
          "value": 50,
          "string": "-O--CK ",
          "prefailure": false,
          "updated_online": true,
          "performance": false,
          "error_rate": false,
          "event_count": true,
          "auto_keep": true
        },
        "raw": {
          "value": 744,
          "string": "744"
        }
      },
      {
        "id": 12,
        "name": "Power_Cycle_Count",
        "value": 100,
        "worst": 100,
        "thresh": 50,
        "when_failed": "",
        "flags": {
          "value": 50,
          "string": "-O--CK ",
          "prefailure": false,
          "updated_online": true,
          "performance": false,
          "error_rate": false,
          "event_count": true,
          "auto_keep": true
        },
        "raw": {
          "value": 35,
          "string": "35"
        }
      },
      {
        "id": 171,
        "name": "Program_Fail_Count",
        "value": 100,
        "worst": 100,
        "thresh": 50,
        "when_failed": "",
        "flags": {
          "value": 50,
          "string": "-O--CK ",
          "prefailure": false,
          "updated_online": true,
          "performance": false,
          "error_rate": false,
          "event_count": true,
          "auto_keep": true
        },
        "raw": {
          "value": 0,
          "string": "0"
        }
      },
      {
        "id": 172,
        "name": "Erase_Fail_Count",
        "value": 100,
        "worst": 100,
        "thresh": 50,
        "when_failed": "",
        "flags": {
          "value": 50,
          "string": "-O--CK ",
          "prefailure": false,
          "updated_online": true,
          "performance": false,
          "error_rate": false,
          "event_count": true,
          "auto_keep": true
        },
        "raw": {
          "value": 0,
          "string": "0"
        }
      },
      {
        "id": 173,
        "name": "Ave_Block-Erase_Count",
        "value": 100,
        "worst": 100,
        "thresh": 50,
        "when_failed": "",
        "flags": {
          "value": 50,
          "string": "-O--CK ",
          "prefailure": false,
          "updated_online": true,
          "performance": false,
          "error_rate": false,
          "event_count": true,
          "auto_keep": true
        },
        "raw": {
          "value": 9,
          "string": "9"
        }
      },
      {
        "id": 174,
        "name": "Unexpect_Power_Loss_Ct",
        "value": 100,
        "worst": 100,
        "thresh": 50,
        "when_failed": "",
        "flags": {
          "value": 50,
          "string": "-O--CK ",
          "prefailure": false,
          "updated_online": true,
          "performance": false,
          "error_rate": false,
          "event_count": true,
          "auto_keep": true
        },
        "raw": {
          "value": 24,
          "string": "24"
        }
      },
      {
        "id": 180,
        "name": "Unused_Reserve_NAND_Blk",
        "value": 100,
        "worst": 100,
        "thresh": 50,
        "when_failed": "",
        "flags": {
          "value": 50,
          "string": "-O--CK ",
          "prefailure": false,
          "updated_online": true,
          "performance": false,
          "error_rate": false,
          "event_count": true,
          "auto_keep": true
        },
        "raw": {
          "value": 100,
          "string": "100"
        }
      },
      {
        "id": 183,
        "name": "SATA_Interfac_Downshift",
        "value": 100,
        "worst": 100,
        "thresh": 50,
        "when_failed": "",
        "flags": {
          "value": 50,
          "string": "-O--CK ",
          "prefailure": false,
          "updated_online": true,
          "performance": false,
          "error_rate": false,
          "event_count": true,
          "auto_keep": true
        },
        "raw": {
          "value": 0,
          "string": "0"
        }
      },
      {
        "id": 184,
        "name": "Error_Correction_Count",
        "value": 100,
        "worst": 100,
        "thresh": 50,
        "when_failed": "",
        "flags": {
          "value": 50,
          "string": "-O--CK ",
          "prefailure": false,
          "updated_online": true,
          "performance": false,
          "error_rate": false,
          "event_count": true,
          "auto_keep": true
        },
        "raw": {
          "value": 0,
          "string": "0"
        }
      },
      {
        "id": 187,
        "name": "Reported_Uncorrect",
        "value": 100,
        "worst": 100,
        "thresh": 50,
        "when_failed": "",
        "flags": {
          "value": 50,
          "string": "-O--CK ",
          "prefailure": false,
          "updated_online": true,
          "performance": false,
          "error_rate": false,
          "event_count": true,
          "auto_keep": true
        },
        "raw": {
          "value": 0,
          "string": "0"
        }
      },
      {
        "id": 194,
        "name": "Temperature_Celsius",
        "value": 64,
        "worst": 48,
        "thresh": 50,
        "when_failed": "past",
        "flags": {
          "value": 34,
          "string": "-O---K ",
          "prefailure": false,
          "updated_online": true,
          "performance": false,
          "error_rate": false,
          "event_count": false,
          "auto_keep": true
        },
        "raw": {
          "value": 223340199972,
          "string": "36 (Min/Max 29/52)"
        }
      },
      {
        "id": 196,
        "name": "Reallocated_Event_Count",
        "value": 100,
        "worst": 100,
        "thresh": 50,
        "when_failed": "",
        "flags": {
          "value": 50,
          "string": "-O--CK ",
          "prefailure": false,
          "updated_online": true,
          "performance": false,
          "error_rate": false,
          "event_count": true,
          "auto_keep": true
        },
        "raw": {
          "value": 0,
          "string": "0"
        }
      },
      {
        "id": 197,
        "name": "Current_Pending_Sector",
        "value": 100,
        "worst": 100,
        "thresh": 50,
        "when_failed": "",
        "flags": {
          "value": 50,
          "string": "-O--CK ",
          "prefailure": false,
          "updated_online": true,
          "performance": false,
          "error_rate": false,
          "event_count": true,
          "auto_keep": true
        },
        "raw": {
          "value": 0,
          "string": "0"
        }
      },
      {
        "id": 198,
        "name": "Offline_Uncorrectable",
        "value": 100,
        "worst": 100,
        "thresh": 50,
        "when_failed": "",
        "flags": {
          "value": 48,
          "string": "----CK ",
          "prefailure": false,
          "updated_online": false,
          "performance": false,
          "error_rate": false,
          "event_count": true,
          "auto_keep": true
        },
        "raw": {
          "value": 0,
          "string": "0"
        }
      },
      {
        "id": 199,
        "name": "UDMA_CRC_Error_Count",
        "value": 100,
        "worst": 100,
        "thresh": 50,
        "when_failed": "",
        "flags": {
          "value": 50,
          "string": "-O--CK ",
          "prefailure": false,
          "updated_online": true,
          "performance": false,
          "error_rate": false,
          "event_count": true,
          "auto_keep": true
        },
        "raw": {
          "value": 29,
          "string": "29"
        }
      },
      {
        "id": 202,
        "name": "Percent_Lifetime_Remain",
        "value": 100,
        "worst": 100,
        "thresh": 1,
        "when_failed": "",
        "flags": {
          "value": 48,
          "string": "----CK ",
          "prefailure": false,
          "updated_online": false,
          "performance": false,
          "error_rate": false,
          "event_count": true,
          "auto_keep": true
        },
        "raw": {
          "value": 100,
          "string": "100"
        }
      },
      {
        "id": 206,
        "name": "Write_Error_Rate",
        "value": 100,
        "worst": 100,
        "thresh": 50,
        "when_failed": "",
        "flags": {
          "value": 46,
          "string": "-OSR-K ",
          "prefailure": false,
          "updated_online": true,
          "performance": true,
          "error_rate": true,
          "event_count": false,
          "auto_keep": true
        },
        "raw": {
          "value": 0,
          "string": "0"
        }
      },
      {
        "id": 210,
        "name": "Success_RAIN_Recov_Cnt",
        "value": 100,
        "worst": 100,
        "thresh": 50,
        "when_failed": "",
        "flags": {
          "value": 50,
          "string": "-O--CK ",
          "prefailure": false,
          "updated_online": true,
          "performance": false,
          "error_rate": false,
          "event_count": true,
          "auto_keep": true
        },
        "raw": {
          "value": 0,
          "string": "0"
        }
      },
      {
        "id": 246,
        "name": "Total_LBAs_Written",
        "value": 100,
        "worst": 100,
        "thresh": 50,
        "when_failed": "",
        "flags": {
          "value": 50,
          "string": "-O--CK ",
          "prefailure": false,
          "updated_online": true,
          "performance": false,
          "error_rate": false,
          "event_count": true,
          "auto_keep": true
        },
        "raw": {
          "value": 619033175,
          "string": "619033175"
        }
      },
      {
        "id": 247,
        "name": "Host_Program_Page_Count",
        "value": 100,
        "worst": 100,
        "thresh": 50,
        "when_failed": "",
        "flags": {
          "value": 50,
          "string": "-O--CK ",
          "prefailure": false,
          "updated_online": true,
          "performance": false,
          "error_rate": false,
          "event_count": true,
          "auto_keep": true
        },
        "raw": {
          "value": 19344786,
          "string": "19344786"
        }
      },
      {
        "id": 248,
        "name": "FTL_Program_Page_Count",
        "value": 100,
        "worst": 100,
        "thresh": 50,
        "when_failed": "",
        "flags": {
          "value": 50,
          "string": "-O--CK ",
          "prefailure": false,
          "updated_online": true,
          "performance": false,
          "error_rate": false,
          "event_count": true,
          "auto_keep": true
        },
        "raw": {
          "value": 16994048,
          "string": "16994048"
        }
      }
    ]
  },
  "power_on_time": {
    "hours": 744
  },
  "power_cycle_count": 35,
  "temperature": {
    "current": 36
  },
  "ata_smart_error_log": {
    "summary": {
      "revision": 1,
      "count": 0
    }
  },
  "ata_smart_self_test_log": {
    "standard": {
      "revision": 1,
      "count": 0
    }
  }
}

I hope it's not too long. I just don't know how to format Percent_Lifetime_Remain as a JSON object.

boesing commented 3 years ago

I see. I'll try to create a parsing method for this. 👍🏻

carsten-h commented 3 years ago

Thank you! Is your "nvme_..._informations" property structured differently?

boesing commented 3 years ago

Yeah, that's mine:

{
"nvme_smart_health_information_log": {
    "critical_warning": 0,
    "temperature": 36,
    "available_spare": 100,
    "available_spare_threshold": 5,
    "percentage_used": 0,
    "data_units_read": 519679,
    "data_units_written": 326973,
    "host_reads": 8780844,
    "host_writes": 7257199,
    "controller_busy_time": 472,
    "power_cycles": 33,
    "power_on_hours": 153,
    "unsafe_shutdowns": 13,
    "media_errors": 0,
    "num_err_log_entries": 0,
    "warning_temp_time": 0,
    "critical_comp_time": 0
  }
}

There seem to be quite a few formats out there.

boesing commented 3 years ago

@carsten-h Could you also provide the RAW output of smartctl? Do you have access to the container so you can execute smartctl in there?

(By raw I mean the non-JSON version of the output.)

I'd like to know if that raw value of Temperature_Celsius is some JSON related "problem" or if smartctl is not able to pass the proper value to the JSON.

You might also want to check if there is a firmware update available for your MX500. https://www.crucial.com/support/ssd-support/mx500-support

Please also read the "warning" there so that you back up your data before running any firmware update.

If an update is available in the "Tool" you find on that page, you might want to install it. Please note that I don't take any responsibility for this. I updated my Crucial P2 250GB a few days ago as well (but had no data on it).

Maybe there is something wrong with the reporting of the temperature value.

I will try to extract the reported value from the raw.string value of the temperature attribute, but having the proper 36 degrees reported in raw.value would be the better solution.

carsten-h commented 3 years ago

Do you have access to the container so you can execute smartctl in there?

It seems that I don't have access to the correct container. When I start the Home Assistant terminal, /usr/sbin/smartctl is not found. I have a HassOS installation on a Pi 4.

You might also want to check if there is a firmware update available for your MX500.

It's a BX500. :-) I will look for it. I hope I can start it on my Mac when the drive is attached via USB, because that is the only possibility I have. I think the reported temperature now is correct. It is 35°C.

boesing commented 3 years ago

It's a BX500. :-) I will look for it. I hope I can start it on my Mac when the drive is attached via USB, because that is the only possibility I have.

You can probably use VirtualBox with Windows to install the update (that's how I did it).

I think the reported temperature now is correct. It is 35°C.

Yeah, the temperature is being provided by the temperature.current property which is used by this addon now (and which seems to be identical in every setup). However, in case you want to have access to the other attributes, we need a proper way to parse these attributes.

I've found a way to convert your list of attributes to an attributes object. Took some time as I was searching for the reason why the Temperature_Celsius in your output contains that high integer value 223340199972 as raw.value. I now just grab the integer value within the raw.string and ignore the rest of it.
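
An observation on that large number, not something confirmed in this thread: 223340199972 is 0x0034001D0024, and its three 16-bit words are 36, 29 and 52, the same numbers that appear in the raw.string "36 (Min/Max 29/52)". A short Python check follows. split_words is an illustrative helper, and the packing may well be specific to this drive, so it should not be assumed for other models.

```python
def split_words(raw_value: int, count: int = 3) -> list:
    """Split an integer into `count` 16-bit words, lowest word first."""
    return [(raw_value >> (16 * i)) & 0xFFFF for i in range(count)]


raw_value = 223340199972  # raw.value of Temperature_Celsius in the posted JSON

current, minimum, maximum = split_words(raw_value)
print(hex(raw_value))             # 0x34001d0024
print(current, minimum, maximum)  # 36 29 52
```

Reading the leading integer of raw.string, as described above, sidesteps the packing question entirely, which is probably the safer choice across drives.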

This seems to work, I will try to add some tests for this. However, if you are fine with just the temperature value, feel free to close this issue :-)

carsten-h commented 3 years ago

I am fine with only the temperature value!

It's not my issue, so I cannot close it.

boesing commented 3 years ago

If you don't need any attributes, you can leave attributes_property empty 👍 However, I've created another PR (#6) which introduces a new parameter, attributes_format, which can be used to mark a property as a list that this add-on then parses into an object.

I've taken your JSON and the result would be:

{
  "raw_read_error_rate": 0,
  "reallocate_nand_blk_cnt": 0,
  "power_on_hours": 744,
  "power_cycle_count": 35,
  "program_fail_count": 0,
  "erase_fail_count": 0,
  "ave_block-erase_count": 9,
  "unexpect_power_loss_ct": 24,
  "unused_reserve_nand_blk": 100,
  "sata_interfac_downshift": 0,
  "error_correction_count": 0,
  "reported_uncorrect": 0,
  "temperature_celsius": 36,
  "reallocated_event_count": 0,
  "current_pending_sector": 0,
  "offline_uncorrectable": 0,
  "udma_crc_error_count": 29,
  "percent_lifetime_remain": 100,
  "write_error_rate": 0,
  "success_rain_recov_cnt": 0,
  "total_lbas_written": 619033175,
  "host_program_page_count": 19344786,
  "ftl_program_page_count": 16994048
}
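
A rough Python sketch of such a list-to-object conversion, written to reproduce the result above from the posted table. It is not the add-on's actual code: the names table_to_object and leading_int are hypothetical, and the rule (lower-case the attribute name, keep the leading integer of raw.string) is inferred only from the input and output shown in this thread.

```python
import re
from typing import Optional


def leading_int(text: str) -> Optional[int]:
    """Return the integer at the start of `text`, or None if there is none."""
    match = re.match(r"\s*(-?\d+)", text)
    return int(match.group(1)) if match else None


def table_to_object(table: list) -> dict:
    """Turn a list of smartctl attribute entries into a flat name -> value object."""
    return {
        entry["name"].lower(): leading_int(entry["raw"]["string"])
        for entry in table
    }


# Three entries taken from the table posted earlier in this thread.
table = [
    {"name": "Power_On_Hours", "raw": {"value": 744, "string": "744"}},
    {"name": "Temperature_Celsius",
     "raw": {"value": 223340199972, "string": "36 (Min/Max 29/52)"}},
    {"name": "Percent_Lifetime_Remain", "raw": {"value": 100, "string": "100"}},
]

print(table_to_object(table))
```

Using raw.string instead of raw.value is what keeps temperature_celsius at 36 here; for the other entries in the posted table the two fields carry the same number.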

boesing commented 3 years ago

Closing this for now as there is no further feedback.