xirixiz / dsmr-reader-docker

DSMR Reader in Docker.
https://hub.docker.com/r/xirixiz/dsmr-reader-docker

The dsmr_webinterface won't start after upgrading to latest image #244

Closed sbosman84 closed 2 years ago

sbosman84 commented 2 years ago

Support guidelines

I've found an issue and checked that ...

Description

After recreating the container based on the latest image (created at 2021-12-21 22:41:09), the webinterface part won't start.

The latest output of the container keeps on repeating:

curl: option --connect-timeout 15 --silent --show-error --fail: is unknown
curl: try 'curl --help' or 'curl --manual' for more information

After running the reload.sh script inside the container it shows that the dsmr_webinterface is not running as well:

image

Expected behaviour

The container used to start normally before. But after the upgrade it doesn't start the web interface.

Actual behaviour

After the upgrade to the latest image version, the web interface no longer starts as normal.

Steps to reproduce

  1. Upgrade the container image to Docker 20.10.11+azure-3 on linux, arm64
  2. Recreate the container
  3. Restart the container

Docker info

Client:
 Context:    default
 Debug Mode: false
 Plugins:
  app: Docker App (Docker Inc., v0.9.1-beta3)
  buildx: Docker Buildx (Docker Inc., v0.7.1-docker)

Server:
 Containers: 11
  Running: 11
  Paused: 0
  Stopped: 0
 Images: 26
 Server Version: 20.10.12
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 1
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: runc io.containerd.runc.v2 io.containerd.runtime.v1.linux
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 7b11cfaabd73bb80907dd23182b9347b4245eb5d
 runc version: v1.0.2-0-g52b36a2
 init version: de40ad0
 Security Options:
  apparmor
  seccomp
   Profile: default
 Kernel Version: 5.4.0-1047-raspi
 Operating System: Ubuntu 20.04.3 LTS
 OSType: linux
 Architecture: aarch64
 CPUs: 4
 Total Memory: 7.627GiB
 Name: 
 ID: BZRU:HNDP:K2DY:7GN4:FWWI:NS4G:FG36:ZAWK:XGKW:E7CM:YWO3:B2XH
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

WARNING: No memory limit support
WARNING: No swap limit support
WARNING: No kernel memory TCP limit support
WARNING: No oom kill disable support

Version

Docker compose

dsmr:
    container_name: dsmr
    depends_on:
      postgres:
        condition: service_started
    environment:
      DATALOGGER_MODE: receiver
      DJANGO_DATABASE_HOST: *
      DJANGO_DATABASE_NAME: *
      DJANGO_DATABASE_PASSWORD: *
      DJANGO_DATABASE_USER: *
      DJANGO_SECRET_KEY: *
      DJANGO_TIME_ZONE: Europe/Amsterdam
      DSMRREADER_ADMIN_PASSWORD: *
      DSMRREADER_ADMIN_USER: admin
      DSMRREADER_LOGLEVEL: WARNING
      ENABLE_NGINX_SSL: "false"
      SD_AUTORESTART_DATALOGGER: "false"
      SD_AUTOSTART_DATALOGGER: "false"
      VIRTUAL_HOST: localhost
    image: xirixiz/dsmr-reader-docker:latest  
    links:
    - postgres
    networks:
      web: null
    restart: unless-stopped
    volumes:
    - /etc/localtime:/etc/localtime:ro
    - /var/lib/docker-data/dsmr/backups:/dsmr/backups:rw

Container logs

[s6-init] making user provided files available at /var/run/s6/etc...exited 0.
[s6-init] ensuring user provided files have correct perms...exited 0.
[fix-attrs.d] applying ownership & permissions fixes...
[fix-attrs.d] done.
[cont-init.d] executing container initialization scripts...
[cont-init.d] 10-set-app-defaults: executing...

[ INFO ] DSMR release: 4.19.0

[ INFO ] Creating log directory...

[ INFO ] Setting architecture requirements...

[ INFO ] ARM Architecture

[ INFO ] Verifying if the DSMR web credential variables have been set...

[ INFO ] Verifying database connectivity to host: postgres with port: 5432...

[ INFO ] Database connectivity successfully verified!

[ INFO ] Running post configuration...
Operations to perform:
  Apply all migrations: admin, auth, contenttypes, dsmr_api, dsmr_backend, dsmr_backup, dsmr_consumption, dsmr_datalogger, dsmr_dropbox, dsmr_frontend, dsmr_influxdb, dsmr_mindergas, dsmr_mqtt, dsmr_notification, dsmr_pvoutput, dsmr_stats, dsmr_weather, sessions
Running migrations:
  No migrations to apply.
563 static files copied to '/var/www/dsmrreader/static'.
Updating password of superuser "admin"
Deactivating any other existing superusers

[ INFO ] Checking for NGINX SSL configuration...

[ INFO ] ENABLE_NGINX_SSL is disabled, nothing to see here. Continuing...

[ INFO ] Checking for HTTP AUTHENTICATION configuration...

[ INFO ] ENABLE_HTTP_AUTH is disabled, nothing to see here. Continuing...

[ INFO ] Configuring DSMR in receiver datalogger mode....
[cont-init.d] 10-set-app-defaults: exited 0.
[cont-init.d] done.
[services.d] starting services
Starting DSMR Reader - webinterface...
Starting DSMR Reader - backend...
Starting DSMR Reader - nginx...
[services.d] done.
[2021-12-23 18:30:23 +0100] [240] [INFO] Starting gunicorn 20.1.0
[2021-12-23 18:30:23 +0100] [240] [INFO] Listening at: unix:/var/tmp/gunicorn--dsmr_webinterface.socket (240)
[2021-12-23 18:30:23 +0100] [240] [INFO] Using worker: sync
[2021-12-23 18:30:23 +0100] [268] [INFO] Booting worker with pid: 268

Additional info

No response

xirixiz commented 2 years ago

Currently it's not completely clear when the issue appears. The first thing I'd like to address:

Looking at the last log it seems that the webserver is running like it should. Can you start the container and then execute this command:

docker exec -ti dsmr bash -c 'curl --connect-timeout 15 --silent --show-error --fail "http://localhost/about" -o /dev/null -w "%{http_code}\n"'

If the output is 200, then we at least know the webinterface is running properly within the container. From there on we can investigate further.
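
For reading the number that the command above prints, a rough sketch may help. The `classify_status` helper below is hypothetical and not part of the image; it only sorts the status code into three buckets, on the assumption that any 3xx answer means the request was redirected somewhere else (for example to a login page) rather than served directly.

```shell
# Hypothetical helper (not shipped with dsmr-reader-docker):
# interpret the HTTP status code printed by curl's -w "%{http_code}".
classify_status() {
  case "$1" in
    200) echo "ok" ;;        # page served directly
    3??) echo "redirect" ;;  # e.g. 302, the request was sent elsewhere
    *)   echo "error" ;;     # anything else, including 000 (no connection)
  esac
}

# Example usage with literal codes; in practice the argument would be
# the output of the docker exec ... curl command shown above.
echo "200 -> $(classify_status 200)"
echo "302 -> $(classify_status 302)"
echo "000 -> $(classify_status 000)"
```

A "redirect" result does not necessarily mean the webinterface is broken; it only says the page asked for was not the one returned.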

The thing that's not completely clear to me is the curl error. When do you see this error? After restarting the container or after trying to run the reload script?

Did you try:

docker system prune -a -f

To clean up some possible Docker leftovers? Are you using portainer by any chance?

xirixiz commented 2 years ago

It's not related to the issue you are facing, but indeed there was an error in the healthcheck :) Fixed it and creating a new release.

-> docker inspect --format='{{json .State.Health}}' dsmr | jq
{
  "Status": "unhealthy",
  "FailingStreak": 16298,
  "Log": [
    {
      "Start": "2021-12-24T08:06:39.097925957+01:00",
      "End": "2021-12-24T08:06:39.174677066+01:00",
      "ExitCode": 1,
      "Output": "curl: option --connect-timeout 15 --silent --show-error --fail: is unknown\ncurl: try 'curl --help' or 'curl --manual' for more information\n"
    },
    {
      "Start": "2021-12-24T08:06:49.537514638+01:00",
      "End": "2021-12-24T08:06:49.611489137+01:00",
      "ExitCode": 1,
      "Output": "curl: option --connect-timeout 15 --silent --show-error --fail: is unknown\ncurl: try 'curl --help' or 'curl --manual' for more information\n"
    },
    {
      "Start": "2021-12-24T08:06:59.886476807+01:00",
      "End": "2021-12-24T08:06:59.957856964+01:00",
      "ExitCode": 1,
      "Output": "curl: option --connect-timeout 15 --silent --show-error --fail: is unknown\ncurl: try 'curl --help' or 'curl --manual' for more information\n"
    },
    {
      "Start": "2021-12-24T08:07:10.137309887+01:00",
      "End": "2021-12-24T08:07:10.215368499+01:00",
      "ExitCode": 1,
      "Output": "curl: option --connect-timeout 15 --silent --show-error --fail: is unknown\ncurl: try 'curl --help' or 'curl --manual' for more information\n"
    },
    {
      "Start": "2021-12-24T08:07:20.531156376+01:00",
      "End": "2021-12-24T08:07:20.603078468+01:00",
      "ExitCode": 1,
      "Output": "curl: option --connect-timeout 15 --silent --show-error --fail: is unknown\ncurl: try 'curl --help' or 'curl --manual' for more information\n"
    }
  ]
}
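
The thread does not say what exactly was wrong in the healthcheck, but the shape of the message (`option --connect-timeout 15 --silent --show-error --fail: is unknown`) is what curl prints when all flags reach it as one single argument. The sketch below, which assumes a POSIX shell and uses a hypothetical `count_args` helper instead of calling curl, shows how quoting a variable that holds several flags produces that situation.

```shell
# Hypothetical helper: prints how many arguments it received.
count_args() { echo "$#"; }

# Assumption for illustration: the flags are stored in one string.
opts="--connect-timeout 15 --silent --show-error --fail"

# Quoted: the whole string arrives as ONE argument. curl would treat it
# as a single, unknown option name.
quoted_count=$(count_args "$opts")

# Unquoted: the shell splits on whitespace, so each flag (and the
# value 15) becomes its own argument.
split_count=$(count_args $opts)

echo "quoted: $quoted_count argument(s), split: $split_count argument(s)"
```

Whether this is what happened in the image's healthcheck is an assumption; it is only one common way to end up with that error text.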

sbosman84 commented 2 years ago

I'm indeed using portainer, so your last comment explains the health check output indeed.

Just ran the command: docker exec -ti dsmr bash -c 'curl --connect-timeout 15 --silent --show-error --fail "http://localhost/about" -o /dev/null -w "%{http_code}\n"'

But I got a 302 response. That made me think it might be my Traefik configuration, but after removing that configuration I still got a 302.

xirixiz commented 2 years ago

Aaah, then Portainer is causing the issue. Portainer sucks 😃, it messes up things when Docker images contain big changes. Reason is unknown to me.

Maybe this will help or the docker system prune -a -f command can help here.

https://github.com/xirixiz/dsmr-reader-docker/issues/242
https://github.com/xirixiz/dsmr-reader-docker/issues?q=is%3Aissue+is%3Aclosed+portainer

A new release is coming up in about half an hour, in which at least the health check works properly.

xirixiz commented 2 years ago

Healthcheck at least fixed in the latest release now:

-> docker inspect --format='{{json .State.Health}}' dsmr | jq
{
  "Status": "healthy",
  "FailingStreak": 0,
  "Log": [
    {
      "Start": "2021-12-24T13:37:36.766172379+01:00",
      "End": "2021-12-24T13:37:37.70885549+01:00",
      "ExitCode": 0,
      "Output": "HTTP_200"
    },
    {
      "Start": "2021-12-24T13:37:42.779393686+01:00",
      "End": "2021-12-24T13:37:42.881481349+01:00",
      "ExitCode": 0,
      "Output": "HTTP_200"
    },
    {
      "Start": "2021-12-24T13:37:47.93278271+01:00",
      "End": "2021-12-24T13:37:48.039101163+01:00",
      "ExitCode": 0,
      "Output": "HTTP_200"
    },
    {
      "Start": "2021-12-24T13:37:53.084808646+01:00",
      "End": "2021-12-24T13:37:53.193379264+01:00",
      "ExitCode": 0,
      "Output": "HTTP_200"
    },
    {
      "Start": "2021-12-24T13:37:58.215012823+01:00",
      "End": "2021-12-24T13:37:58.328030048+01:00",
      "ExitCode": 0,
      "Output": "HTTP_200"
    }
  ]
}

sbosman84 commented 2 years ago

Thanks! I upgraded to the latest version and the healthcheck is working again!

I also figured out what my issue was. I got the 302 when executing the following command:

docker exec -ti dsmr bash -c 'curl --connect-timeout 15 --silent --show-error --fail "http://localhost/about" -o /dev/null -w "%{http_code}\n"'

It was redirecting /about to /admin/login/?next=/about because of the authentication I had enabled for the whole page.
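
For anyone hitting the same 302, a small sketch of how the redirect target could be made visible. The `location_of` helper is hypothetical, and the sample headers are made up for illustration (only the path is the one mentioned above); a real response will contain more headers.

```shell
# Hypothetical helper: reads raw HTTP response headers on stdin and
# prints the value of the Location header, if there is one.
location_of() {
  sed -n 's/^[Ll]ocation: *//p' | tr -d '\r'
}

# Illustrative sample only, not captured from a real container.
sample='HTTP/1.1 302 Found
Location: /admin/login/?next=/about
'

target=$(printf '%s' "$sample" | location_of)
echo "redirects to: $target"

# Against a running container the headers could come from curl instead:
#   docker exec dsmr curl --silent --head "http://localhost/about" | location_of
```

A login path in the output points at authentication rather than at a webinterface that failed to start.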

This prompted me to check my Traefik config again, and I noticed that (probably due to an upgrade as well) it now requires me to specify a port, which it didn't need before. :-(

So, long story short: this was a Traefik configuration issue and not a DSMR or Portainer issue. Thanks for the support and also for solving the health check issue this quickly. My dashboards are green again and DSMR is working. :-)

xirixiz commented 2 years ago

Good to hear and also good to hear it wasn't Portainer 😄

xirixiz commented 2 years ago

Thanks for providing the information!