Open Hr46ph opened 1 month ago
Not sure if any of the other logging is relevant, so I figured I'd wait for a response before supplying more info. Let me know what you need.
Thanks!
Getting exactly the same issue:
# sudo smartctl -a /dev/nvme0
smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.8.0-45-generic] (local build)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Number: CT1000T700SSD5
Serial Number: **redacted**
Firmware Version: PACR5101
PCI Vendor/Subsystem ID: 0xc0a9
IEEE OUI Identifier: 0x00a075
Controller ID: 0
NVMe Version: 2.0
Number of Namespaces: 1
Namespace 1 Size/Capacity: 1,000,204,886,016 [1.00 TB]
Namespace 1 Formatted LBA Size: 512
Namespace 1 IEEE EUI-64: 00a075 **redacted**
Local Time is: Fri Oct 4 20:09:17 2024 BST
Firmware Updates (0x12): 1 Slot, no Reset required
Optional Admin Commands (0x0017): Security Format Frmw_DL Self_Test
Optional NVM Commands (0x005e): Wr_Unc DS_Mngmt Wr_Zero Sav/Sel_Feat Timestmp
Log Page Attributes (0x3e): Cmd_Eff_Lg Ext_Get_Lg Telmtry_Lg Pers_Ev_Lg Log0_FISE_MI
Maximum Data Transfer Size: 128 Pages
Warning Comp. Temp. Threshold: 87 Celsius
Critical Comp. Temp. Threshold: 89 Celsius
Supported Power States
St Op     Max   Active     Idle   RL RT WL WT  Ent_Lat  Ex_Lat
 0 +    11.50W       -        -    0  0  0  0      800    1000
 1 +     8.00W       -        -    0  0  0  0      800    1000
 2 +     6.00W       -        -    0  0  0  0      800    1000
 3 -   0.1440W       -        -    0  0  0  0     3000    3000
 4 -   0.1440W       -        -    0  0  0  0     3000    3000
Supported LBA Sizes (NSID 0x1)
Id Fmt  Data  Metadt  Rel_Perf
 0 +     512       0         1
 1 -    4096       0         0
=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
SMART/Health Information (NVMe Log 0x02)
Critical Warning: 0x00
Temperature: 33 Celsius
Available Spare: 100%
Available Spare Threshold: 5%
Percentage Used: 4%
Data Units Read: 225,608,936 [115 TB]
Data Units Written: 51,358,910 [26.2 TB]
Host Read Commands: 6,634,833,183
Host Write Commands: 613,483,504
Controller Busy Time: 5,734
Power Cycles: 32
Power On Hours: 8,830
Unsafe Shutdowns: 4
Media and Data Integrity Errors: 0
Error Information Log Entries: 0
Warning Comp. Temperature Time: 0
Critical Comp. Temperature Time: 0
Error Information (NVMe Log 0x01, 16 of 16 entries)
No Errors Logged
Self-test Log (NVMe Log 0x06)
Self-test status: No self-test in progress
No Self-tests Logged
Running master#57dc547
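For anyone who wants to double-check the raw values outside the dashboard, here is a minimal sketch that applies a few basic checks to smartctl's JSON output (`smartctl -a -j /dev/nvme0`). The function names and the choice of checks are my own assumptions for illustration; this is not the logic the dashboard uses to decide on "failed". The sample dict just re-enters the numbers from the output above by hand.

```python
import json
import subprocess


def nvme_health_problems(report: dict) -> list:
    """Return human-readable problems found in a smartctl JSON report.

    Hypothetical helper: the checks are a simple reading of the NVMe
    SMART/Health log, not any tool's official pass/fail rule.
    """
    problems = []

    # Overall self-assessment as reported by smartctl.
    if report.get("smart_status", {}).get("passed") is False:
        problems.append("overall-health self-assessment did not pass")

    log = report.get("nvme_smart_health_information_log", {})

    # Any set bit in Critical Warning signals a controller-reported condition.
    if log.get("critical_warning", 0) != 0:
        problems.append("critical warning is 0x%02x" % log["critical_warning"])

    # Available spare dropping below the drive's own threshold.
    spare = log.get("available_spare")
    threshold = log.get("available_spare_threshold")
    if spare is not None and threshold is not None and spare < threshold:
        problems.append("available spare %d%% is below threshold %d%%" % (spare, threshold))

    # Media and data integrity errors.
    if log.get("media_errors", 0) > 0:
        problems.append("%d media/data integrity errors" % log["media_errors"])

    return problems


def read_report(device: str) -> dict:
    """Run smartctl with JSON output and parse it (typically needs root)."""
    # smartctl's exit status is a bitmask and can be non-zero even when a
    # valid report is printed, so the return code is not checked here.
    result = subprocess.run(
        ["smartctl", "-a", "-j", device], capture_output=True, text=True
    )
    return json.loads(result.stdout)


if __name__ == "__main__":
    # Values copied by hand from the smartctl output above.
    sample = {
        "smart_status": {"passed": True},
        "nvme_smart_health_information_log": {
            "critical_warning": 0,
            "available_spare": 100,
            "available_spare_threshold": 5,
            "percentage_used": 4,
            "media_errors": 0,
        },
    }
    print(nvme_health_problems(sample) or "no problems found")
```

To run it against a real drive, something like `nvme_health_problems(read_report("/dev/nvme0"))` should work. With the values above it finds nothing wrong, which matches what I'd expect from the smartctl output.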
Describe the bug
All 3 NVMe disks show as failed and I have no idea why. For one of them I might have a clue, but not for the other 2.
The only places they show as failed are the dashboard and, when I click a disk, the 'status' label.
Expected behavior
The disks should show as healthy, because there is nothing wrong with them.
Screenshots
Log Files
I have 2 of these:
And this is number 3: