AnalogJ / scrutiny

Hard Drive S.M.A.R.T Monitoring, Historical Trends & Real World Failure Thresholds
MIT License
5.13k stars 167 forks source link

[BUG] ERROR: invalid character '<' looking for beginning of value #349

Closed karaolidis closed 2 years ago

karaolidis commented 2 years ago

Describe the bug I've been experiencing some really weird behavior recently: if I set web.listen.basepath to anything other than the default of '', the collector breaks with the following error message:

> docker exec scrutiny /opt/scrutiny/bin/scrutiny-collector-metrics run

2022/08/02 23:22:20 Loading configuration file: /opt/scrutiny/config/collector.yaml

 ___   ___  ____  __  __  ____  ____  _  _  _  _
/ __) / __)(  _ \(  )(  )(_  _)(_  _)( \( )( \/ )
\__ \( (__  )   / )(__)(   )(   _)(_  )  (  \  /
(___/ \___)(_)\_)(______) (__) (____)(_)\_) (__)
AnalogJ/scrutiny/metrics                               dev-0.4.16

time="2022-08-02T23:22:20+03:00" level=info msg="Verifying required tools" type=metrics
time="2022-08-02T23:22:20+03:00" level=info msg="Executing command: smartctl --scan --json" type=metrics
time="2022-08-02T23:22:20+03:00" level=info msg="Executing command: smartctl --info --json /dev/sdd" type=metrics
time="2022-08-02T23:22:20+03:00" level=info msg="Generating WWN" type=metrics
time="2022-08-02T23:22:20+03:00" level=info msg="Executing command: smartctl --info --json /dev/sde" type=metrics
time="2022-08-02T23:22:20+03:00" level=info msg="Generating WWN" type=metrics
time="2022-08-02T23:22:20+03:00" level=info msg="Executing command: smartctl --info --json /dev/sda" type=metrics
time="2022-08-02T23:22:20+03:00" level=info msg="Generating WWN" type=metrics
time="2022-08-02T23:22:20+03:00" level=info msg="Executing command: smartctl --info --json /dev/sdb" type=metrics
time="2022-08-02T23:22:20+03:00" level=info msg="Generating WWN" type=metrics
time="2022-08-02T23:22:20+03:00" level=info msg="Executing command: smartctl --info --json /dev/sdc" type=metrics
time="2022-08-02T23:22:20+03:00" level=info msg="Generating WWN" type=metrics
time="2022-08-02T23:22:20+03:00" level=info msg="Sending detected devices to API, for filtering & validation" type=metrics
2022/08/02 23:22:20 ERROR: invalid character '<' looking for beginning of value

The value of api.endpoint seems to not influence the error.

scrutiny.yaml:

version: 1

web:
  listen:
    port: 8080
    host: 0.0.0.0
    basepath: '/storage'

log:
  file: '/opt/scrutiny/config/log' #absolute or relative paths allowed, eg. web.log
  level: DEBUG

notify:
  level: 'warn' # 'warn' or 'error'
  urls:
      - [redacted]

collector.yaml (can be set to anything without changing proxy settings and still works)

api:
  endpoint: 'http://localhost:8080/storage'

log.txt

Expected behavior The following is an example of running the collector with basepath: '':

> docker exec scrutiny /opt/scrutiny/bin/scrutiny-collector-metrics run

2022/08/02 23:25:34 Loading configuration file: /opt/scrutiny/config/collector.yaml

 ___   ___  ____  __  __  ____  ____  _  _  _  _
/ __) / __)(  _ \(  )(  )(_  _)(_  _)( \( )( \/ )
\__ \( (__  )   / )(__)(   )(   _)(_  )  (  \  /
(___/ \___)(_)\_)(______) (__) (____)(_)\_) (__)
AnalogJ/scrutiny/metrics                               dev-0.4.16

time="2022-08-02T23:25:34+03:00" level=info msg="Verifying required tools" type=metrics
time="2022-08-02T23:25:34+03:00" level=info msg="Executing command: smartctl --scan --json" type=metrics
time="2022-08-02T23:25:34+03:00" level=info msg="Executing command: smartctl --info --json /dev/sda" type=metrics
time="2022-08-02T23:25:34+03:00" level=info msg="Generating WWN" type=metrics
time="2022-08-02T23:25:34+03:00" level=info msg="Executing command: smartctl --info --json /dev/sdb" type=metrics
time="2022-08-02T23:25:34+03:00" level=info msg="Generating WWN" type=metrics
time="2022-08-02T23:25:34+03:00" level=info msg="Executing command: smartctl --info --json /dev/sdc" type=metrics
time="2022-08-02T23:25:34+03:00" level=info msg="Generating WWN" type=metrics
time="2022-08-02T23:25:34+03:00" level=info msg="Executing command: smartctl --info --json /dev/sdd" type=metrics
time="2022-08-02T23:25:34+03:00" level=info msg="Generating WWN" type=metrics
time="2022-08-02T23:25:34+03:00" level=info msg="Executing command: smartctl --info --json /dev/sde" type=metrics
time="2022-08-02T23:25:34+03:00" level=info msg="Generating WWN" type=metrics
time="2022-08-02T23:25:34+03:00" level=info msg="Sending detected devices to API, for filtering & validation" type=metrics
time="2022-08-02T23:25:34+03:00" level=info msg="Collecting smartctl results for sda\n" type=metrics
time="2022-08-02T23:25:34+03:00" level=info msg="Executing command: smartctl --xall --json /dev/sda" type=metrics
time="2022-08-02T23:25:34+03:00" level=info msg="Publishing smartctl results for 0x5002538e403be893\n" type=metrics
time="2022-08-02T23:25:35+03:00" level=info msg="Collecting smartctl results for sdb\n" type=metrics
time="2022-08-02T23:25:35+03:00" level=info msg="Executing command: smartctl --xall --json /dev/sdb" type=metrics
time="2022-08-02T23:25:36+03:00" level=info msg="Publishing smartctl results for 0x50014ee2bf1724c1\n" type=metrics
time="2022-08-02T23:25:40+03:00" level=info msg="Collecting smartctl results for sdc\n" type=metrics
time="2022-08-02T23:25:40+03:00" level=info msg="Executing command: smartctl --xall --json /dev/sdc" type=metrics
time="2022-08-02T23:25:40+03:00" level=error msg="smartctl returned an error code (68) while processing sdc\n" type=metrics
time="2022-08-02T23:25:40+03:00" level=error msg="smartctl detected a checksum error" type=metrics
time="2022-08-02T23:25:40+03:00" level=info msg="Publishing smartctl results for 0x50014ee20d526ebe\n" type=metrics
time="2022-08-02T23:25:46+03:00" level=info msg="Collecting smartctl results for sdd\n" type=metrics
time="2022-08-02T23:25:46+03:00" level=info msg="Executing command: smartctl --xall --json /dev/sdd" type=metrics
time="2022-08-02T23:25:46+03:00" level=info msg="Publishing smartctl results for 0x50014ee21329b6cf\n" type=metrics
time="2022-08-02T23:25:52+03:00" level=info msg="Collecting smartctl results for sde\n" type=metrics
time="2022-08-02T23:25:52+03:00" level=info msg="Executing command: smartctl --xall --json /dev/sde" type=metrics
time="2022-08-02T23:25:52+03:00" level=error msg="smartctl returned an error code (64) while processing sde\n" type=metrics
time="2022-08-02T23:25:52+03:00" level=error msg="smartctl detected a error log with errors" type=metrics
time="2022-08-02T23:25:52+03:00" level=info msg="Publishing smartctl results for 0x50014ee2bcff024d\n" type=metrics
time="2022-08-02T23:25:58+03:00" level=info msg="Main: Completed" type=metrics

docker-compose.yml

    scrutiny:
        image: ghcr.io/analogj/scrutiny:master-omnibus
        container_name: scrutiny
        restart: unless-stopped
        volumes:
            - '/home/nick/configs/analytics/scrutiny/config:/opt/scrutiny/config'
            - '/home/nick/configs/analytics/scrutiny/influxdb2:/opt/scrutiny/influxdb'
            - '/run/udev:/run/udev:ro'
        environment:
            - 'PUID=1000'
            - 'PGID=1000'
            - 'TZ=Europe/Athens'
        devices:
            - /dev/sda:/dev/sda
            - /dev/sdb:/dev/sdb
            - /dev/sdc:/dev/sdc
            - /dev/sdd:/dev/sdd
            - /dev/sde:/dev/sde
        cap_add:
            - SYS_RAWIO
            - SYS_ADMIN
        networks:
            - proxy

nginx.conf snippet

    location /storage {
        include /config/includes/default-location.conf;
        include /config/includes/location/auth.conf;

        set $upstream_proto http;
        set $upstream_app scrutiny;
        set $upstream_port 8080;

        proxy_pass $upstream_proto://$upstream_app:$upstream_port;
    }

docker info

Client:
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc., v0.8.1-docker)
  compose: Docker Compose (Docker Inc., 2.4.1)

Server:
 Containers: 61
  Running: 57
  Paused: 0
  Stopped: 4
 Images: 201
 Server Version: 20.10.14
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Native Overlay Diff: false
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: systemd
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: de8046a5501db9e0e478e1c10cbcfb21af4c6b2d.m
 runc version:
 init version: de40ad0
 Security Options:
  seccomp
   Profile: default
  cgroupns
 Kernel Version: 5.17.3-arch1-1
 Operating System: Arch Linux
 OSType: linux
 Architecture: x86_64
 CPUs: 8
 Total Memory: 15.32GiB
 Name: jupiter
 ID: [redacted]
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  [redacted]
 Live Restore Enabled: false

I'd appreciate any help :)

karaolidis commented 2 years ago

This seems similar to #217 but the original author unfortunately closed the issue without explaining how they solved it.

AnalogJ commented 2 years ago
2022/08/02 23:22:20 ERROR: invalid character '<' looking for beginning of value

That message means that the response from the api is HTML rather than JSON. That's because the Scrutiny webserver is configured to return the index.html page if it cant find a route.

I see that you're running the scrutiny omnibus image behind Nginx.

Can you curl the following urls and paste the output below (i want to know if its JSON or not)

karaolidis commented 2 years ago

http://localhost:8080/storage/api/health seems to be returning JSON: {"success":true} http://YOUR_HOSTNAME_HERE/storage/api/health seems to be returning a 302 found page:

<html>
<head><title>302 Found</title></head>
<body>
<center><h1>302 Found</h1></center>
<hr><center>nginx</center>
</body>
</html>

This redirects to the JSON response after that.

Now that I'm looking at it, this seems to be happening because I have Authelia in front of the /storage endpoint. Is there a way to have the collector point to the API via the internal docker network instead?

AnalogJ commented 2 years ago

is your collector still configured using the following config file?

api:
  endpoint: 'http://localhost:8080/storage'

In that case it should be using the internal route to the API.

Do you run multiple collectors, on different hosts?

karaolidis commented 2 years ago

Using both localhost and scrutiny (the container name) leads to the same issue. This is the only collector running at the moment.

AnalogJ commented 2 years ago

hm, in that case, I wonder if the collector isn't correctly picking up your api.endpoint change.

yeah, thats the reason:

bodyType=response clientIP=127.0.0.1 hostname=105b5dd257b6 latency=1 method=POST path=/api/devices/register

the path in the access logs you uploaded is /api/devices/register not /storage/api/devices/register

Let me see if I can reproduce this locally.

AnalogJ commented 2 years ago

In the meantime, can you try running your scrutiny container with the following environmental variable:

COLLECTOR_API_ENDPOINT=http://localhost:8080/storage

karaolidis commented 2 years ago

Setting the environment variable seems to fix the issue without any other changes. Any ideas on why the collector is not detecting the change?

AnalogJ commented 2 years ago

I'm trying to replicate this issue locally, but I have confirmed that a mounted collector.yaml file will be correctly parsed by the collector and the api.endpoint will be used.

Can you take a look at this troubleshooting guide and see if there's anything relevant/unique to your configuration?

https://github.com/AnalogJ/scrutiny/blob/master/docs/TROUBLESHOOTING_REVERSE_PROXY.md

karaolidis commented 2 years ago

The only difference is the Authelia redirection which is somehow fixed using the environment variable. Another reason I could think of would be invalid permissions on the mounted config file but I also tested that extensively and didn't get different results.

Closing the issue since this is probably unrelated to scrutiny. Thanks for all the help!

AnalogJ commented 2 years ago

weird. I just released v0.5.0 which adds some additional messages when in DEBUG mode -- it prints out a copy of the configuration object.

If you want to debug further, we can remove the env var, re-use the collector config file, turn on DEBUG mode, and then verify the settings were correctly parsed by Scrutiny.

ghnp5 commented 5 months ago

I've just had the exact same issue. Setting api.endpoint in collector.yaml didn't work, and the API call continued to not use the prefix. Once I set up the COLLECTOR_API_ENDPOINT env var, it worked.

(on docker)

The scrutiny.yaml config is definitely used, because the basedir isn't being ignored. I had collector.yaml in the same place.

I'm using scrutiny:master-omnibus.

Tampa commented 1 month ago

To add to this if you are using a reverse proxy you may need to add /api to the proxy pass. For some reason it gets a bit huffy over just plain / as location for the api endpoint.