AnalogJ / scrutiny

Hard Drive S.M.A.R.T Monitoring, Historical Trends & Real World Failure Thresholds
MIT License

Question: Is it possible to force show all disks? #582

Open ameer1234567890 opened 7 months ago

ameer1234567890 commented 7 months ago

I use the omnibus Docker image, and only 1 of my 3 drives is shown in the web UI despite the device mappings in the compose file. I understand that this might be because of some missing SMART data. Is it possible to force Scrutiny to show disks despite the missing data?

AnalogJ commented 7 months ago

Unfortunately not. Scrutiny is just a UI wrapper around smartmontools. If smartmontools can't detect your drive, Scrutiny will not receive any data about it.

ameer1234567890 commented 7 months ago

I tried smartctl --scan from within the container and it shows 2 drives. I even tried smartctl -d sat --all /dev/sda, which also shows data for both of the devices detected by the scan command. Still, the UI only shows 1 drive.

Here are the container logs:

[s6-init] making user provided files available at /var/run/s6/etc...exited 0.
[s6-init] ensuring user provided files have correct perms...exited 0.
[fix-attrs.d] applying ownership & permissions fixes...
[fix-attrs.d] done.
[cont-init.d] executing container initialization scripts...
[cont-init.d] 01-timezone: executing... 
[cont-init.d] 01-timezone: exited 0.
[cont-init.d] 50-cron-config: executing... 
[cont-init.d] 50-cron-config: exited 0.
[cont-init.d] done.
[services.d] starting services
[services.d] done.
waiting for scrutiny service to start
waiting for influxdb
starting cron
influxdb config file already exists. skipping.
starting influxdb
scrutiny api not ready
influxdb not ready
influxdb not ready
scrutiny api not ready
ts=2024-02-27T16:03:38.322296Z lvl=info msg="Welcome to InfluxDB" log_id=0nbplA0W000 version=v2.2.0 commit=a2f8538837 build_date=2022-04-06T17:36:38Z
ts=2024-02-27T16:03:45.748642Z lvl=info msg="Resources opened" log_id=0nbplA0W000 service=bolt path=/opt/scrutiny/influxdb/influxd.bolt
ts=2024-02-27T16:03:46.836493Z lvl=info msg="Resources opened" log_id=0nbplA0W000 service=sqlite path=/opt/scrutiny/influxdb/influxd.sqlite
influxdb not ready
scrutiny api not ready
ts=2024-02-27T16:03:56.292211Z lvl=info msg="Checking InfluxDB metadata for prior version." log_id=0nbplA0W000 bolt_path=/opt/scrutiny/influxdb/influxd.bolt
ts=2024-02-27T16:03:56.976794Z lvl=info msg="Using data dir" log_id=0nbplA0W000 service=storage-engine service=store path=/opt/scrutiny/influxdb/engine/data
ts=2024-02-27T16:03:57.062636Z lvl=info msg="Compaction settings" log_id=0nbplA0W000 service=storage-engine service=store max_concurrent_compactions=2 throughput_bytes_per_second=50331648 throughput_bytes_per_second_burst=50331648
ts=2024-02-27T16:03:57.156915Z lvl=info msg="Open store (start)" log_id=0nbplA0W000 service=storage-engine service=store op_name=tsdb_open op_event=start
ts=2024-02-27T16:04:00.889196Z lvl=info msg="index opened with 8 partitions" log_id=0nbplA0W000 service=storage-engine index=tsi
ts=2024-02-27T16:04:00.902022Z lvl=info msg="index opened with 8 partitions" log_id=0nbplA0W000 service=storage-engine index=tsi
ts=2024-02-27T16:04:00.906568Z lvl=info msg="index opened with 8 partitions" log_id=0nbplA0W000 service=storage-engine index=tsi
ts=2024-02-27T16:04:02.203487Z lvl=info msg="index opened with 8 partitions" log_id=0nbplA0W000 service=storage-engine index=tsi
ts=2024-02-27T16:04:03.178380Z lvl=info msg="Opened file" log_id=0nbplA0W000 service=storage-engine engine=tsm1 service=filestore path=/opt/scrutiny/influxdb/engine/data/66a5d1d3a2a00bb6/autogen/2/000000007-000000002.tsm id=0 duration=45.478ms
ts=2024-02-27T16:04:03.465516Z lvl=info msg="Opened file" log_id=0nbplA0W000 service=storage-engine engine=tsm1 service=filestore path=/opt/scrutiny/influxdb/engine/data/66a5d1d3a2a00bb6/autogen/5/000000002-000000002.tsm id=0 duration=568.825ms
ts=2024-02-27T16:04:03.466325Z lvl=info msg="Opened file" log_id=0nbplA0W000 service=storage-engine engine=tsm1 service=filestore path=/opt/scrutiny/influxdb/engine/data/66a5d1d3a2a00bb6/autogen/1/000000011-000000002.tsm id=0 duration=569.633ms
ts=2024-02-27T16:04:03.668792Z lvl=info msg="Opened file" log_id=0nbplA0W000 service=storage-engine engine=tsm1 service=filestore path=/opt/scrutiny/influxdb/engine/data/9116343cc6faae19/autogen/4/000000001-000000001.tsm id=0 duration=201.442ms
ts=2024-02-27T16:04:03.629509Z lvl=info msg="Opened shard" log_id=0nbplA0W000 service=storage-engine service=store op_name=tsdb_open index_version=tsi1 path=/opt/scrutiny/influxdb/engine/data/66a5d1d3a2a00bb6/autogen/5 duration=5246.707ms
ts=2024-02-27T16:04:03.669319Z lvl=info msg="Opened shard" log_id=0nbplA0W000 service=storage-engine service=store op_name=tsdb_open index_version=tsi1 path=/opt/scrutiny/influxdb/engine/data/66a5d1d3a2a00bb6/autogen/1 duration=5071.807ms
ts=2024-02-27T16:04:03.629957Z lvl=info msg="Opened shard" log_id=0nbplA0W000 service=storage-engine service=store op_name=tsdb_open index_version=tsi1 path=/opt/scrutiny/influxdb/engine/data/66a5d1d3a2a00bb6/autogen/2 duration=5247.169ms
ts=2024-02-27T16:04:04.159379Z lvl=info msg="index opened with 8 partitions" log_id=0nbplA0W000 service=storage-engine index=tsi
ts=2024-02-27T16:04:03.670177Z lvl=info msg="Opened shard" log_id=0nbplA0W000 service=storage-engine service=store op_name=tsdb_open index_version=tsi1 path=/opt/scrutiny/influxdb/engine/data/9116343cc6faae19/autogen/4 duration=2801.893ms
ts=2024-02-27T16:04:04.466511Z lvl=info msg="Opened file" log_id=0nbplA0W000 service=storage-engine engine=tsm1 service=filestore path=/opt/scrutiny/influxdb/engine/data/a13b3c8116eec8d8/autogen/3/000000001-000000001.tsm id=0 duration=48.991ms
ts=2024-02-27T16:04:04.467253Z lvl=info msg="Opened shard" log_id=0nbplA0W000 service=storage-engine service=store op_name=tsdb_open index_version=tsi1 path=/opt/scrutiny/influxdb/engine/data/a13b3c8116eec8d8/autogen/3 duration=682.628ms
ts=2024-02-27T16:04:04.847550Z lvl=info msg="Open store (end)" log_id=0nbplA0W000 service=storage-engine service=store op_name=tsdb_open op_event=end op_elapsed=7690.647ms
ts=2024-02-27T16:04:04.954108Z lvl=info msg="Starting retention policy enforcement service" log_id=0nbplA0W000 service=retention check_interval=30m
ts=2024-02-27T16:04:04.954371Z lvl=info msg="Starting precreation service" log_id=0nbplA0W000 service=shard-precreation check_interval=10m advance_period=30m
scrutiny api not ready
influxdb not ready
ts=2024-02-27T16:04:12.003221Z lvl=info msg="Starting query controller" log_id=0nbplA0W000 service=storage-reads concurrency_quota=1024 initial_memory_bytes_quota_per_query=9223372036854775807 memory_bytes_quota_per_query=9223372036854775807 max_memory_bytes=0 queue_size=1024
ts=2024-02-27T16:04:17.469501Z lvl=info msg="Configuring InfluxQL statement executor (zeros indicate unlimited)." log_id=0nbplA0W000 max_select_point=0 max_select_series=0 max_select_buckets=0
influxdb not ready
scrutiny api not ready
ts=2024-02-27T16:04:28.273674Z lvl=info msg=Listening log_id=0nbplA0W000 service=tcp-listener transport=http addr=:8086 port=8086
scrutiny api not ready
starting scrutiny
 ___   ___  ____  __  __  ____  ____  _  _  _  _
/ __) / __)(  _ \(  )(  )(_  _)(_  _)( \( )( \/ )
\__ \( (__  )   / )(__)(   )(   _)(_  )  (  \  /
(___/ \___)(_)\_)(______) (__) (____)(_)\_) (__)
2024/02/27 16:04:35 No configuration file found at /opt/scrutiny/config/scrutiny.yaml. Using Defaults.
github.com/AnalogJ/scrutiny                             dev-0.7.3
Start the scrutiny server
time="2024-02-27T16:04:35Z" level=info msg="Trying to connect to scrutiny sqlite db: /opt/scrutiny/config/scrutiny.db\n" type=web
time="2024-02-27T16:04:35Z" level=info msg="Successfully connected to scrutiny sqlite db: /opt/scrutiny/config/scrutiny.db\n" type=web
time="2024-02-27T16:04:35Z" level=info msg="InfluxDB certificate verification: true\n" type=web
scrutiny api not ready
time="2024-02-27T16:04:39Z" level=info msg="Database migration starting. Please wait, this process may take a long time...." type=web
time="2024-02-27T16:04:39Z" level=info msg="Database migration completed successfully" type=web
time="2024-02-27T16:04:39Z" level=info msg="SQLite global configuration migrations starting. Please wait...." type=web
time="2024-02-27T16:04:39Z" level=info msg="SQLite global configuration migrations completed successfully" type=web
time="2024-02-27T16:04:46Z" level=info msg="127.0.0.1 - 6560eb1e8a5d [27/Feb/2024:16:04:46 +0000] \"HEAD /api/health\" 200 0 \"\" \"curl/7.74.0\" (168ms)" clientIP=127.0.0.1 hostname=6560eb1e8a5d latency=168 method=HEAD path=/api/health referer= respLength=0 statusCode=200 type=web userAgent=curl/7.74.0
starting scrutiny collector (run-once mode. subsequent calls will be triggered via cron service)
2024/02/27 16:04:46 No configuration file found at /opt/scrutiny/config/collector.yaml. Using Defaults.
 ___   ___  ____  __  __  ____  ____  _  _  _  _
/ __) / __)(  _ \(  )(  )(_  _)(_  _)( \( )( \/ )
\__ \( (__  )   / )(__)(   )(   _)(_  )  (  \  /
(___/ \___)(_)\_)(______) (__) (____)(_)\_) (__)
AnalogJ/scrutiny/metrics                                dev-0.7.3
time="2024-02-27T16:04:46Z" level=info msg="Verifying required tools" type=metrics
time="2024-02-27T16:04:46Z" level=info msg="Executing command: smartctl --scan --json" type=metrics
time="2024-02-27T16:04:47Z" level=info msg="Executing command: smartctl --info --json --device sat /dev/sdb" type=metrics
time="2024-02-27T16:04:47Z" level=error msg="Could not retrieve device information for sdb: exit status 2" type=metrics
time="2024-02-27T16:04:47Z" level=info msg="Executing command: smartctl --info --json --device sat /dev/sdc" type=metrics
time="2024-02-27T16:04:47Z" level=info msg="Generating WWN" type=metrics
time="2024-02-27T16:04:47Z" level=info msg="Sending detected devices to API, for filtering & validation" type=metrics
time="2024-02-27T16:04:47Z" level=info msg="127.0.0.1 - 6560eb1e8a5d [27/Feb/2024:16:04:47 +0000] \"POST /api/devices/register\" 200 590 \"\" \"Go-http-client/1.1\" (102ms)" clientIP=127.0.0.1 hostname=6560eb1e8a5d latency=102 method=POST path=/api/devices/register referer= respLength=590 statusCode=200 type=web userAgent=Go-http-client/1.1
time="2024-02-27T16:04:47Z" level=info msg="Collecting smartctl results for sdc\n" type=metrics
time="2024-02-27T16:04:47Z" level=info msg="Executing command: smartctl --xall --json --device sat /dev/sdc" type=metrics
time="2024-02-27T16:04:48Z" level=error msg="smartctl returned an error code (4) while processing sdc\n" type=metrics
time="2024-02-27T16:04:48Z" level=error msg="smartctl detected a checksum error" type=metrics
time="2024-02-27T16:04:48Z" level=info msg="Publishing smartctl results for 0x5000c5008a09a13b\n" type=metrics
2024/02/27 16:04:49 /go/src/github.com/analogj/scrutiny/webapp/backend/pkg/database/scrutiny_repository_device.go:51 SLOW SQL >= 200ms
[232.817ms] [rows:1] UPDATE `devices` SET `created_at`="2024-02-20 16:32:36.377",`updated_at`="2024-02-27 16:04:48.791",`wwn`="0x5000c5008a09a13b",`device_name`="sdc",`device_serial_id`="ata-ST500LM000-1EJ162_W764S7ZL",`model_name`="ST500LM000-1EJ162",`interface_speed`="3.0 Gb/s",`serial_number`="W764S7ZL",`firmware`="DEMC",`rotation_speed`=5400,`capacity`=500107862016,`form_factor`="2.5 inches",`device_protocol`="ATA",`device_type`="sat",`device_status`="2" WHERE `wwn` = "0x5000c5008a09a13b"
time="2024-02-27T16:04:51Z" level=info msg="No notification endpoints configured. Skipping failure notification." type=web
time="2024-02-27T16:04:51Z" level=info msg="127.0.0.1 - 6560eb1e8a5d [27/Feb/2024:16:04:51 +0000] \"POST /api/device/0x5000c5008a09a13b/smart\" 200 16 \"\" \"Go-http-client/1.1\" (2631ms)" clientIP=127.0.0.1 hostname=6560eb1e8a5d latency=2631 method=POST path=/api/device/0x5000c5008a09a13b/smart referer= respLength=16 statusCode=200 type=web userAgent=Go-http-client/1.1
time="2024-02-27T16:04:51Z" level=info msg="Main: Completed" type=metrics
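
The two error lines in the log above report smartctl exit statuses (2 for sdb, 4 for sdc). According to the EXIT STATUS section of the smartctl(8) man page, that status is a bitmask, so it can be decoded bit by bit. The following is a minimal Python sketch of such a decoder. The helper name `decode_smartctl_exit` is hypothetical and is not part of Scrutiny or smartmontools, and the messages are paraphrased from the man page rather than quoted.

```python
from typing import List

# Bit meanings paraphrased from the smartctl(8) man page, "EXIT STATUS" section.
SMARTCTL_EXIT_BITS = {
    0: "command line did not parse",
    1: "device open failed, or device did not return an IDENTIFY DEVICE structure",
    2: "a SMART or other ATA command failed, or a SMART data structure had a checksum error",
    3: "SMART status check returned DISK FAILING",
    4: "prefail attributes found at or below threshold",
    5: "SMART status OK, but some attributes were at or below threshold in the past",
    6: "device error log contains error records",
    7: "device self-test log contains error records",
}


def decode_smartctl_exit(status: int) -> List[str]:
    """Return the message for every bit that is set in a smartctl exit status."""
    return [msg for bit, msg in SMARTCTL_EXIT_BITS.items() if status & (1 << bit)]


if __name__ == "__main__":
    # The two statuses that appear in the collector log above.
    for device, status in (("sdb", 2), ("sdc", 4)):
        for msg in decode_smartctl_exit(status):
            print(f"{device}: exit status {status}: {msg}")
```

Read this way, status 2 for sdb would mean smartctl could not open or identify the device with `--device sat`, which would explain why it was never registered, while status 4 for sdc matches the checksum error reported on the next log line.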

AnalogJ commented 6 months ago

Can you check whether both drives have the same serial number/WWN? Scrutiny uses the WWN to uniquely identify drives (since a device path such as /dev/sda may change across system restarts).
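
One way to run this check is to compare the `wwn` objects in the output of `smartctl --info --json` for each device. Below is a rough Python sketch. It assumes the JSON contains a `wwn` object with integer `naa`, `oui`, and `id` fields, and that these pack into a 64-bit value using the usual NAA 5 layout (4-bit NAA, 24-bit OUI, 36-bit vendor id). The function names are hypothetical, and whether Scrutiny builds its WWN string in exactly this way is an assumption.

```python
import json
from typing import Dict, List, Optional


def wwn_from_smartctl_info(info_json: str) -> Optional[str]:
    """Format the `wwn` object of `smartctl --info --json` output as a hex string.

    Returns None when the output has no `wwn` object.
    """
    wwn = json.loads(info_json).get("wwn")
    if not wwn:
        return None
    # Assumed layout: 4-bit NAA | 24-bit OUI | 36-bit vendor-specific id.
    value = (wwn["naa"] << 60) | (wwn["oui"] << 36) | wwn["id"]
    return f"{value:#018x}"


def find_wwn_collisions(infos: Dict[str, str]) -> Dict[Optional[str], List[str]]:
    """Group device names by WWN and keep only the groups with more than one device.

    Devices without a WWN are grouped together under the key None.
    """
    by_wwn: Dict[Optional[str], List[str]] = {}
    for device, info_json in infos.items():
        by_wwn.setdefault(wwn_from_smartctl_info(info_json), []).append(device)
    return {wwn: devs for wwn, devs in by_wwn.items() if len(devs) > 1}


if __name__ == "__main__":
    # Sample input built from the WWN in the log above (0x5000c5008a09a13b);
    # real input would be the captured smartctl output for each device.
    sample = json.dumps({"wwn": {"naa": 5, "oui": 0x000C50, "id": 0x08A09A13B}})
    print(wwn_from_smartctl_info(sample))  # prints 0x5000c5008a09a13b
    print(find_wwn_collisions({"sdb": sample, "sdc": sample}))
```

An empty result from `find_wwn_collisions` would suggest the drives are not being merged because of a shared or missing WWN.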

ameer1234567890 commented 6 months ago

I have checked, and the two disks have different WWNs.