AnalogJ / scrutiny

Hard Drive S.M.A.R.T Monitoring, Historical Trends & Real World Failure Thresholds

[BUG] NVME disk not detected #723

Open njunghausz opened 1 day ago

njunghausz commented 1 day ago

Describe the bug: The Samsung SSD 980 500GB (NVMe) is not detected.

Expected behavior: The drive should be detected.

smartctl output from root shell:

sudo /sbin/smartctl -a /dev/nvme0
smartctl 7.3 2022-02-28 r5338 [x86_64-linux-6.1.0-27-amd64] (local build)
Copyright (C) 2002-22, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Number:                       Samsung SSD 980 500GB
Serial Number:                      S64DNX0RB08821R
Firmware Version:                   3B4QFXO7
PCI Vendor/Subsystem ID:            0x144d
IEEE OUI Identifier:                0x002538
Total NVM Capacity:                 500,107,862,016 [500 GB]
Unallocated NVM Capacity:           0
Controller ID:                      5
NVMe Version:                       1.4
Number of Namespaces:               1
Namespace 1 Size/Capacity:          500,107,862,016 [500 GB]
Namespace 1 Utilization:            18,417,872,896 [18.4 GB]
Namespace 1 Formatted LBA Size:     512
Namespace 1 IEEE EUI-64:            002538 db11b022d4
Local Time is:                      Sat Nov 23 16:07:53 2024 CET
Firmware Updates (0x16):            3 Slots, no Reset required
Optional Admin Commands (0x0017):   Security Format Frmw_DL Self_Test
Optional NVM Commands (0x0055):     Comp DS_Mngmt Sav/Sel_Feat Timestmp
Log Page Attributes (0x0f):         S/H_per_NS Cmd_Eff_Lg Ext_Get_Lg Telmtry_Lg
Maximum Data Transfer Size:         512 Pages
Warning  Comp. Temp. Threshold:     82 Celsius
Critical Comp. Temp. Threshold:     85 Celsius
Namespace 1 Features (0x10):        NP_Fields

Supported Power States
St Op     Max   Active     Idle   RL RT WL WT  Ent_Lat  Ex_Lat
 0 +     5.24W       -        -    0  0  0  0        0       0
 1 +     4.49W       -        -    1  1  1  1        0       0
 2 +     2.19W       -        -    2  2  2  2        0     500
 3 -   0.0500W       -        -    3  3  3  3      210    1200
 4 -   0.0050W       -        -    4  4  4  4     1000    9000

Supported LBA Sizes (NSID 0x1)
Id Fmt  Data  Metadt  Rel_Perf
 0 +     512       0         0

=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

SMART/Health Information (NVMe Log 0x02)
Critical Warning:                   0x00
Temperature:                        43 Celsius
Available Spare:                    100%
Available Spare Threshold:          10%
Percentage Used:                    3%
Data Units Read:                    46,068,728 [23.5 TB]
Data Units Written:                 27,445,104 [14.0 TB]
Host Read Commands:                 627,762,128
Host Write Commands:                577,480,352
Controller Busy Time:               759
Power Cycles:                       3,461
Power On Hours:                     2,750
Unsafe Shutdowns:                   260
Media and Data Integrity Errors:    0
Error Information Log Entries:      0
Warning  Comp. Temperature Time:    0
Critical Comp. Temperature Time:    0
Temperature Sensor 1:               43 Celsius
Temperature Sensor 2:               47 Celsius

Error Information (NVMe Log 0x01, 16 of 64 entries)
No Errors Logged

sudo /sbin/smartctl --scan
/dev/sda -d scsi # /dev/sda, SCSI device
/dev/sdb -d scsi # /dev/sdb, SCSI device
/dev/sdc -d scsi # /dev/sdc, SCSI device
/dev/sdd -d scsi # /dev/sdd, SCSI device
/dev/nvme0 -d nvme # /dev/nvme0, NVMe device

smartctl output from docker:

docker exec -t -i scrutiny /usr/sbin/smartctl --scan
/dev/sda -d scsi # /dev/sda, SCSI device
/dev/sdb -d scsi # /dev/sdb, SCSI device
/dev/sdc -d scsi # /dev/sdc, SCSI device
/dev/sdd -d scsi # /dev/sdd, SCSI device
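
To make the difference between the two scans explicit, here is a minimal Python sketch that diffs them and also checks the result against the container's device mappings. The scan text is copied from the outputs above, and the mapped paths are copied by hand from the `HostConfig.Devices` entries in the `docker inspect` output in this report. The helper name `parse_scan` and the constants are made up for illustration and are not part of Scrutiny or smartmontools.

```python
def parse_scan(output: str) -> list[str]:
    """Return the device paths listed in `smartctl --scan` output, in order."""
    devices = []
    for line in output.splitlines():
        fields = line.split()
        # Entries look like: /dev/sda -d scsi # /dev/sda, SCSI device
        if fields and fields[0].startswith("/dev/"):
            devices.append(fields[0])
    return devices


# Scan output as reported from the host root shell.
HOST_SCAN = """\
/dev/sda -d scsi # /dev/sda, SCSI device
/dev/sdb -d scsi # /dev/sdb, SCSI device
/dev/sdc -d scsi # /dev/sdc, SCSI device
/dev/sdd -d scsi # /dev/sdd, SCSI device
/dev/nvme0 -d nvme # /dev/nvme0, NVMe device
"""

# Scan output as reported from inside the scrutiny container.
CONTAINER_SCAN = """\
/dev/sda -d scsi # /dev/sda, SCSI device
/dev/sdb -d scsi # /dev/sdb, SCSI device
/dev/sdc -d scsi # /dev/sdc, SCSI device
/dev/sdd -d scsi # /dev/sdd, SCSI device
"""

# PathInContainer values copied from HostConfig.Devices in this report.
MAPPED_DEVICES = {"/dev/nvme0n1", "/dev/sda", "/dev/sdb", "/dev/sdc", "/dev/sdd"}

host = parse_scan(HOST_SCAN)
container = parse_scan(CONTAINER_SCAN)

# Devices the host scan reports that the container scan does not.
missing_in_container = [d for d in host if d not in container]
# Devices the host scan reports that are not mapped into the container.
not_mapped = [d for d in host if d not in MAPPED_DEVICES]

print("missing from container scan:", missing_in_container)
print("reported on host but not mapped:", not_mapped)
```

Both lists contain only `/dev/nvme0`: the host scan reports the controller node `/dev/nvme0`, while the container is given the namespace node `/dev/nvme0n1`. Whether that mismatch is the cause is an assumption that the thread does not confirm, but mapping `/dev/nvme0` into the container may be worth trying.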

docker properties:

docker inspect ff8fd0c95b5f [ { "Id": "ff8fd0c95b5f318b14da993f4562a7e45b8a8b82ef60054dc508e3c427a3c0f7", "Created": "2024-11-23T14:42:06.869112045Z", "Path": "/init", "Args": [], "State": { "Status": "running", "Running": true, "Paused": false, "Restarting": false, "OOMKilled": false, "Dead": false, "Pid": 69119, "ExitCode": 0, "Error": "", "StartedAt": "2024-11-23T14:42:06.96179339Z", "FinishedAt": "0001-01-01T00:00:00Z" }, "Image": "sha256:71d67b1137ef561e3214a883b7804f22942720462f7ea68a15c06acbda7bae4a", "ResolvConfPath": "/containers/containers/ff8fd0c95b5f318b14da993f4562a7e45b8a8b82ef60054dc508e3c427a3c0f7/resolv.conf", "HostnamePath": "/containers/containers/ff8fd0c95b5f318b14da993f4562a7e45b8a8b82ef60054dc508e3c427a3c0f7/hostname", "HostsPath": "/containers/containers/ff8fd0c95b5f318b14da993f4562a7e45b8a8b82ef60054dc508e3c427a3c0f7/hosts", "LogPath": "/containers/containers/ff8fd0c95b5f318b14da993f4562a7e45b8a8b82ef60054dc508e3c427a3c0f7/ff8fd0c95b5f318b14da993f4562a7e45b8a8b82ef60054dc508e3c427a3c0f7-json.log", "Name": "/scrutiny", "RestartCount": 0, "Driver": "overlay2", "Platform": "linux", "MountLabel": "", "ProcessLabel": "", "AppArmorProfile": "docker-default", "ExecIDs": null, "HostConfig": { "Binds": [ "/compose/scrutiny/influxdb:/config/scrutiny/influxdb:rw", "/run/udev:/run/udev:ro", "/compose/scrutiny/config:/config/scrutiny:rw" ], "ContainerIDFile": "", "LogConfig": { "Type": "json-file", "Config": {} }, "NetworkMode": "scrutiny_default", "PortBindings": { "8080/tcp": [ { "HostIp": "", "HostPort": "8080" } ], "8086/tcp": [ { "HostIp": "", "HostPort": "8086" } ] }, "RestartPolicy": { "Name": "no", "MaximumRetryCount": 0 }, "AutoRemove": false, "VolumeDriver": "", "VolumesFrom": null, "ConsoleSize": [ 0, 0 ], "CapAdd": [ "SYS_RAWIO", "SYS_ADMIN" ], "CapDrop": null, "CgroupnsMode": "private", "Dns": null, "DnsOptions": null, "DnsSearch": null, "ExtraHosts": [], "GroupAdd": null, "IpcMode": "private", "Cgroup": "", "Links": null, "OomScoreAdj": 0, 
"PidMode": "", "Privileged": false, "PublishAllPorts": false, "ReadonlyRootfs": false, "SecurityOpt": null, "UTSMode": "", "UsernsMode": "", "ShmSize": 67108864, "Runtime": "runc", "Isolation": "", "CpuShares": 0, "Memory": 0, "NanoCpus": 0, "CgroupParent": "", "BlkioWeight": 0, "BlkioWeightDevice": null, "BlkioDeviceReadBps": null, "BlkioDeviceWriteBps": null, "BlkioDeviceReadIOps": null, "BlkioDeviceWriteIOps": null, "CpuPeriod": 0, "CpuQuota": 0, "CpuRealtimePeriod": 0, "CpuRealtimeRuntime": 0, "CpusetCpus": "", "CpusetMems": "", "Devices": [ { "PathOnHost": "/dev/nvme0n1", "PathInContainer": "/dev/nvme0n1", "CgroupPermissions": "rwm" }, { "PathOnHost": "/dev/sda", "PathInContainer": "/dev/sda", "CgroupPermissions": "rwm" }, { "PathOnHost": "/dev/sdb", "PathInContainer": "/dev/sdb", "CgroupPermissions": "rwm" }, { "PathOnHost": "/dev/sdc", "PathInContainer": "/dev/sdc", "CgroupPermissions": "rwm" }, { "PathOnHost": "/dev/sdd", "PathInContainer": "/dev/sdd", "CgroupPermissions": "rwm" } ], "DeviceCgroupRules": null, "DeviceRequests": null, "MemoryReservation": 0, "MemorySwap": 0, "MemorySwappiness": null, "OomKillDisable": null, "PidsLimit": null, "Ulimits": null, "CpuCount": 0, "CpuPercent": 0, "IOMaximumIOps": 0, "IOMaximumBandwidth": 0, "MaskedPaths": [ "/proc/asound", "/proc/acpi", "/proc/kcore", "/proc/keys", "/proc/latency_stats", "/proc/timer_list", "/proc/timer_stats", "/proc/sched_debug", "/proc/scsi", "/sys/firmware", "/sys/devices/virtual/powercap" ], "ReadonlyPaths": [ "/proc/bus", "/proc/fs", "/proc/irq", "/proc/sys", "/proc/sysrq-trigger" ] }, "GraphDriver": { "Data": { "LowerDir": 
"/containers/overlay2/63cf0dd278d7430d4e40da875770c81a28efae4a08801bd23bc2bbf9aa596364-init/diff:/containers/overlay2/aed0dd9ccbb712d2f41c6e8d9c6543efbd6150d57a5ebdd47f321bf74999bf13/diff:/containers/overlay2/5c8da922c524565b85c282f7f3b6107cc3bc1a54cd6265cf518c9e4234274ba4/diff:/containers/overlay2/301c3bd65bf35539bf7211e25ad4a194972847159e5a788a356ceecb8bfb6deb/diff:/containers/overlay2/5643d49cd134517a80399cbd4424e9b71e651a6e6e2f3f28257823a25c7df211/diff:/containers/overlay2/3da216e779d887adf4d6c2657dc36e3b8abffca63981cc4ba040d8b05bf9457a/diff:/containers/overlay2/13a140118e7913eee10c060f9439948ce825369c5f2ee1f176f9b62f28cb14f9/diff:/containers/overlay2/e0e596dd3e0f44afad15bada2e210911cafefb902a7d7bd5c0aae6a5edb2792e/diff:/containers/overlay2/3623903f4bc8e76ee7e2a465ea745ca2f5fd183c753ab3bc0f1dfb09c6d8d4cd/diff:/containers/overlay2/d39037c835e1b99373abe64e1681035b2415e76d7191ea10ead435dac89656f5/diff", "MergedDir": "/containers/overlay2/63cf0dd278d7430d4e40da875770c81a28efae4a08801bd23bc2bbf9aa596364/merged", "UpperDir": "/containers/overlay2/63cf0dd278d7430d4e40da875770c81a28efae4a08801bd23bc2bbf9aa596364/diff", "WorkDir": "/containers/overlay2/63cf0dd278d7430d4e40da875770c81a28efae4a08801bd23bc2bbf9aa596364/work" }, "Name": "overlay2" }, "Mounts": [ { "Type": "bind", "Source": "/compose/scrutiny/influxdb", "Destination": "/config/scrutiny/influxdb", "Mode": "rw", "RW": true, "Propagation": "rprivate" }, { "Type": "bind", "Source": "/run/udev", "Destination": "/run/udev", "Mode": "ro", "RW": false, "Propagation": "rprivate" }, { "Type": "bind", "Source": "/compose/scrutiny/config", "Destination": "/config/scrutiny", "Mode": "rw", "RW": true, "Propagation": "rprivate" } ], "Config": { "Hostname": "ff8fd0c95b5f", "Domainname": "", "User": "", "AttachStdin": false, "AttachStdout": true, "AttachStderr": true, "ExposedPorts": { "8080/tcp": {}, "8086/tcp": {} }, "Tty": false, "OpenStdin": false, "StdinOnce": false, "Env": [ "PGID=100", "TZ=Europe/Budapest", 
"PUID=1001", "PATH=/opt/scrutiny/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin", "INFLUXD_CONFIG_PATH=/opt/scrutiny/influxdb", "S6VER=3.1.6.2", "INFLUXVER=2.2.0", "S6_SERVICES_READYTIME=1000" ], "Cmd": [ "/init" ], "Image": "ghcr.io/analogj/scrutiny:master-omnibus", "Volumes": null, "WorkingDir": "/opt/scrutiny", "Entrypoint": null, "OnBuild": null, "Labels": { "com.docker.compose.config-hash": "1c27a1fd58bf575b78993a2374b08ee22b0590f6f01927857e12763f5799d60e", "com.docker.compose.container-number": "1", "com.docker.compose.depends_on": "", "com.docker.compose.image": "sha256:71d67b1137ef561e3214a883b7804f22942720462f7ea68a15c06acbda7bae4a", "com.docker.compose.oneoff": "False", "com.docker.compose.project": "scrutiny", "com.docker.compose.project.config_files": "/compose/scrutiny/scrutiny.yml,/compose/scrutiny/compose.override.yml", "com.docker.compose.project.environment_file": "/compose/global.env,/compose/scrutiny/scrutiny.env", "com.docker.compose.project.working_dir": "/compose/scrutiny", "com.docker.compose.replace": "42ae3eda174df155a83f46e307878b380e6aa663e9f9085e77e80ae89650b663", "com.docker.compose.service": "scrutiny", "com.docker.compose.version": "2.29.7", "org.opencontainers.image.created": "2024-11-22T12:57:36.119Z", "org.opencontainers.image.description": "Hard Drive S.M.A.R.T Monitoring, Historical Trends & Real World Failure Thresholds", "org.opencontainers.image.licenses": "MIT", "org.opencontainers.image.revision": "0641b5e79d55d356e5b45227166cca2a66fd2555", "org.opencontainers.image.source": "https://github.com/AnalogJ/scrutiny", "org.opencontainers.image.title": "scrutiny", "org.opencontainers.image.url": "https://github.com/AnalogJ/scrutiny", "org.opencontainers.image.version": "master-omnibus" } }, "NetworkSettings": { "Bridge": "", "SandboxID": "4663e56bc4e885a09455b1ea798986773ed3ce82c45c4e39fd5fa246db56bd11", "SandboxKey": "/var/run/docker/netns/4663e56bc4e8", "Ports": { "8080/tcp": [ { "HostIp": "0.0.0.0", 
"HostPort": "8080" }, { "HostIp": "::", "HostPort": "8080" } ], "8086/tcp": [ { "HostIp": "0.0.0.0", "HostPort": "8086" }, { "HostIp": "::", "HostPort": "8086" } ] }, "HairpinMode": false, "LinkLocalIPv6Address": "", "LinkLocalIPv6PrefixLen": 0, "SecondaryIPAddresses": null, "SecondaryIPv6Addresses": null, "EndpointID": "", "Gateway": "", "GlobalIPv6Address": "", "GlobalIPv6PrefixLen": 0, "IPAddress": "", "IPPrefixLen": 0, "IPv6Gateway": "", "MacAddress": "", "Networks": { "scrutiny_default": { "IPAMConfig": null, "Links": null, "Aliases": [ "scrutiny", "scrutiny" ], "MacAddress": "02:42:ac:13:00:02", "DriverOpts": null, "NetworkID": "87bae3c70dc3b06874c02c74135e8f07180637ea1fa2f5595d7a299f7d0f3537", "EndpointID": "0d8efb38658b1053df4deca2cdbbdb18327a6ff645f0c3b41d229fa642ccb406", "Gateway": "172.19.0.1", "IPAddress": "172.19.0.2", "IPPrefixLen": 16, "IPv6Gateway": "", "GlobalIPv6Address": "", "GlobalIPv6PrefixLen": 0, "DNSNames": [ "scrutiny", "ff8fd0c95b5f" ] } } } } ]

docker exec scrutiny scrutiny-collector-metrics run
2024/11/23 16:11:18 No configuration file found at /opt/scrutiny/config/collector.yaml. Using Defaults.

AnalogJ/scrutiny/metrics dev-0.8.1

time="2024-11-23T16:11:18+01:00" level=info msg="Verifying required tools" type=metrics
time="2024-11-23T16:11:18+01:00" level=info msg="Executing command: smartctl --scan --json" type=metrics
time="2024-11-23T16:11:18+01:00" level=info msg="Executing command: smartctl --info --json /dev/sdb" type=metrics
time="2024-11-23T16:11:18+01:00" level=info msg="Generating WWN" type=metrics
time="2024-11-23T16:11:18+01:00" level=info msg="Executing command: smartctl --info --json /dev/sdc" type=metrics
time="2024-11-23T16:11:18+01:00" level=info msg="Generating WWN" type=metrics
time="2024-11-23T16:11:18+01:00" level=info msg="Executing command: smartctl --info --json /dev/sdd" type=metrics
time="2024-11-23T16:11:18+01:00" level=info msg="Generating WWN" type=metrics
time="2024-11-23T16:11:18+01:00" level=info msg="Executing command: smartctl --info --json /dev/sda" type=metrics
time="2024-11-23T16:11:18+01:00" level=info msg="Generating WWN" type=metrics
time="2024-11-23T16:11:18+01:00" level=info msg="Sending detected devices to API, for filtering & validation" type=metrics
time="2024-11-23T16:11:18+01:00" level=info msg="Collecting smartctl results for sdb\n" type=metrics
time="2024-11-23T16:11:18+01:00" level=info msg="Executing command: smartctl --xall --json --device sat /dev/sdb" type=metrics
time="2024-11-23T16:11:21+01:00" level=error msg="smartctl returned an error code (64) while processing sdb\n" type=metrics
time="2024-11-23T16:11:21+01:00" level=error msg="smartctl detected a error log with errors" type=metrics
time="2024-11-23T16:11:21+01:00" level=info msg="Publishing smartctl results for 0x50000398bbd010e2\n" type=metrics
time="2024-11-23T16:11:22+01:00" level=info msg="Collecting smartctl results for sdc\n" type=metrics
time="2024-11-23T16:11:22+01:00" level=info msg="Executing command: smartctl --xall --json --device sat /dev/sdc" type=metrics
time="2024-11-23T16:11:22+01:00" level=info msg="Publishing smartctl results for 0x5000039b88d37ff2\n" type=metrics
time="2024-11-23T16:11:23+01:00" level=info msg="Collecting smartctl results for sdd\n" type=metrics
time="2024-11-23T16:11:23+01:00" level=info msg="Executing command: smartctl --xall --json --device sat /dev/sdd" type=metrics
time="2024-11-23T16:11:26+01:00" level=error msg="smartctl returned an error code (64) while processing sdd\n" type=metrics
time="2024-11-23T16:11:26+01:00" level=error msg="smartctl detected a error log with errors" type=metrics
time="2024-11-23T16:11:26+01:00" level=info msg="Publishing smartctl results for 0x50000398bbd011a0\n" type=metrics
time="2024-11-23T16:11:27+01:00" level=info msg="Collecting smartctl results for sda\n" type=metrics
time="2024-11-23T16:11:27+01:00" level=info msg="Executing command: smartctl --xall --json --device sat /dev/sda" type=metrics
time="2024-11-23T16:11:27+01:00" level=info msg="Publishing smartctl results for 0x5000039b88d2b90d\n" type=metrics
time="2024-11-23T16:11:27+01:00" level=info msg="Main: Completed" type=metrics
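
The two `error code (64)` entries for sdb and sdd are a separate matter from the missing NVMe drive. smartctl's exit status is a bitmask, and the sketch below decodes it using bit meanings paraphrased from the smartctl man page as I understand it; the names `SMARTCTL_EXIT_BITS` and `decode_exit_status` are made up for illustration and are not part of Scrutiny.

```python
# Bit meanings paraphrased from the smartctl man page (EXIT STATUS section).
# Treat the wording as an approximation, not as an authoritative copy.
SMARTCTL_EXIT_BITS = {
    0: "command line did not parse",
    1: "device open failed or device is in a low-power mode",
    2: "a SMART or other command to the disk failed",
    3: "SMART status check returned DISK FAILING",
    4: "prefail attributes at or below threshold",
    5: "attributes were at or below threshold at some time in the past",
    6: "device error log contains records of errors",
    7: "device self-test log contains records of errors",
}


def decode_exit_status(code: int) -> list[str]:
    """Return the description of every bit set in a smartctl exit status."""
    return [msg for bit, msg in SMARTCTL_EXIT_BITS.items() if code & (1 << bit)]


# Exit status reported by the collector for sdb and sdd.
print(decode_exit_status(64))
```

Under that reading, status 64 has only bit 6 set, which matches the collector's own follow-up message, "smartctl detected a error log with errors". The results for both drives are still published afterwards. The log also shows no `smartctl --info` call for any NVMe device, which is consistent with the container scan not listing `/dev/nvme0`.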

The log files will be available on your host in the config directory. Please attach them to this issue.

Please also provide the output of docker info

docker info

Client: Docker Engine - Community
 Version:    27.3.1
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.17.1
    Path:     /usr/libexec/docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  v2.29.7
    Path:     /usr/libexec/docker/cli-plugins/docker-compose

Server:
 Containers: 4
  Running: 4
  Paused: 0
  Stopped: 0
 Images: 4
 Server Version: 27.3.1
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: systemd
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 57f17b0a6295a39009d861b89e3b3b87b005ca27
 runc version: v1.1.14-0-g2c9f560
 init version: de40ad0
 Security Options:
  apparmor
  seccomp
   Profile: builtin
  cgroupns
 Kernel Version: 6.1.0-27-amd64
 Operating System: Debian GNU/Linux 12 (bookworm)
 OSType: linux
 Architecture: x86_64
 CPUs: 12
 Total Memory: 15.47GiB
 Name: cookiemonster
 ID: 07bcb594-f056-42f3-b8fd-bd38172dd56d
 Docker Root Dir: /containers
 Debug Mode: false
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

WARNING: bridge-nf-call-iptables is disabled
WARNING: bridge-nf-call-ip6tables is disabled

njunghausz commented 1 day ago

scrutiny.log

Log file attached.