jakowenko / double-take

Unified UI and API for processing and training images for facial recognition.
https://hub.docker.com/r/jakowenko/double-take
MIT License

HA Addon Crash/Stopping without error V1.7, V1.8, V1.9 and V1.10 [BUG] #213

Closed: Haloooch closed this issue 2 years ago

Haloooch commented 2 years ago

**Describe the bug**
I've been having this issue for a long time, all the way back to v1.7. To this point I keep falling back to v1.6, as it works without error. I've installed Double Take as an HA add-on, and it starts without error. I'm able to log into Double Take and navigate around, and I can see some captured images. The config looks good, and there's nothing much in the logs. After a few seconds Double Take stops; when I drop back to the add-on page, I can see the add-on has stopped. I can click start and log in again, but a few seconds later it stops again.

This behaviour has been happening across multiple HA and HAOS upgrades.

Supervisor log:

```
22-05-24 10:16:14 ERROR (MainThread) [supervisor.api.ingress] Ingress error: Cannot connect to host 172.30.33.10:3000 ssl:default [Connect call failed ('172.30.33.10', 3000)]
```
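The ingress error itself is generic: the supervisor can't open a TCP connection to the add-on's web server on port 3000, which usually means the Node process has already exited. As a rough way to confirm this from the SSH & Web Terminal add-on, the port can be probed directly (a sketch, assuming `bash` and coreutils `timeout` are available; the IP and port are taken from the log line above and will differ per install):

```shell
# Succeeds only if something accepts a TCP connection on HOST PORT.
port_open() {
  timeout 2 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# IP/port from the supervisor log above; adjust for your install.
if port_open 172.30.33.10 3000; then
  echo "add-on is listening"
else
  echo "nothing listening on 172.30.33.10:3000 (process likely not running)"
fi
```

If the probe fails seconds after the add-on reports a successful start, the crash is happening shortly after startup rather than under load.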

Double Take log:

```
22-05-24 10:13:05 info: Double Take v1.9.0-24bdcef
22-05-24 10:13:05 verbose: { auth: false, detect: { match: { save: true, base64: false, confidence: 97, purge: 36, min_area: 1000 }, unknown: { save: true, base64: false, confidence: 40, purge: 8, min_area: 0 } }, time: { timezone: 'UTC' }, frigate: { attempts: { latest: 5, snapshot: 0, mqtt: true, delay: 0 }, image: { height: 500 }, labels: [ 'person' ], update_sub_labels: false, url: 'http://192.168.1.166:5000' }, mqtt: { topics: { frigate: 'frigate/events', matches: 'double-take/matches', cameras: 'double-take/cameras', homeassistant: 'homeassistant' }, host: '192.168.1.166:1883', username: 'MQTTDoubleTake', password: '********' }, logs: { level: 'debug' }, ui: { path: '', theme: 'bootstrap4-dark-blue', editor: { theme: 'nord_dark' }, logs: { lines: 500 }, pagination: { limit: 50 }, thumbnails: { quality: 95, width: 500 } }, server: { port: 3000 }, purge: { matches: 24, unknown: 24 }, detectors: { compreface: { det_prob_threshold: 0.8, timeout: 15, url: 'http://192.168.1.166:8000', key: '********' } }, storage: { path: '/config/double-take', config: { path: '/config/double-take' }, secrets: { path: '/config', extension: 'yaml' }, media: { path: '/media/double-take' }, tmp: { path: '/dev/shm/double-take' } }, version: '1.9.0-24bdcef' }
22-05-24 10:13:06 info: MQTT: connected
22-05-24 10:13:06 info: MQTT: subscribed to frigate/events, frigate/+/person/snapshot
22-05-24 10:13:10 verbose: 1653351175.291872-ay2fwo - car label not in (person)
22-05-24 10:13:10 verbose: 1653351175.291872-ay2fwo - car label not in (person)
```

**Version of Double Take** v1.10

**Additional context**
The issue looks similar to bug #173, reported back on v1.7.

Haloooch commented 2 years ago

Just noticed the question in #173: I'm not running LetsEncrypt, Duck DNS or NGINX. I do, however, have letsdnsocloud.

Full list of add-ons are: Actron Air Conditioner (2022.2.1), AdGuard Home (4.5.1), AppDaemon (0.8.2), BlueRiiot2MQTT (0.14.0), Check Home Assistant configuration (3.10.0), File editor (5.3.3), Glances (0.15.0), Home Assistant Google Drive Backup (0.107.2), MariaDB (2.4.0), Mosquitto broker (6.1.2), Node-RED (11.1.2), SSH & Web Terminal (10.1.3), Samba share (9.6.1), Studio Code Server (5.0.4), deCONZ (6.13.0), letsdnsocloud (1.1), phpMyAdmin (0.7.1), CompreFace (0.6.1), Double Take (beta) (1.6.0), Frigate NVR (3.1), Portainer (2022.5.0), Double Take (1.10.0)

robert1993 commented 2 years ago

Same issue here. Not running LetsEncrypt, just Nabu Casa.

Error dump from the Double Take log (the start of the axios request object is truncated):

```
auth: undefined, hostname: 'analytics.jako.io', port: 443, nativeProtocols: { 'http:': [Object], 'https:': [Object] }, pathname: '/api/event', _defaultAgent: Agent { _events: [Object: null prototype], _eventsCount: 2, _maxListeners: undefined, defaultPort: 443, protocol: 'https:', options: [Object: null prototype], requests: [Object: null prototype] {}, sockets: [Object: null prototype], freeSockets: [Object: null prototype] {}, keepAliveMsecs: 1000, keepAlive: false, maxSockets: Infinity, maxFreeSockets: 256, scheduling: 'lifo', maxTotalSockets: Infinity, totalSocketCount: 1, maxCachedSessions: 100, _sessionCache: [Object],
      },
      host: 'analytics.jako.io',
      servername: 'analytics.jako.io',
      _agentKey: 'analytics.jako.io:443:::::::::::::::::::::',
      encoding: null,
      singleUse: true
    }
  },
  _header: 'POST /api/event HTTP/1.1\r\n' +
    'Accept: application/json, text/plain, */*\r\n' +
    'Content-Type: application/json\r\n' +
    'User-Agent: axios/0.27.2\r\n' +
    'Content-Length: 137\r\n' +
    'Host: analytics.jako.io\r\n' +
    'Connection: close\r\n' +
    '\r\n',
  _keepAliveTimeout: 0,
  _onPendingData: [Function: nop],
  agent: Agent {
    _events: [Object: null prototype] {
      free: [Function (anonymous)],
      newListener: [Function: maybeEnableKeylog]
    },
    _eventsCount: 2,
    _maxListeners: undefined,
    defaultPort: 443,
    protocol: 'https:',
    options: [Object: null prototype] { path: null },
    requests: [Object: null prototype] {},
    sockets: [Object: null prototype] {
      'analytics.jako.io:443:::::::::::::::::::::': [ [TLSSocket] ]
    },
    freeSockets: [Object: null prototype] {},
    keepAliveMsecs: 1000,
    keepAlive: false,
    maxSockets: Infinity,
    maxFreeSockets: 256,
    scheduling: 'lifo',
    maxTotalSockets: Infinity,
    totalSocketCount: 1,
    maxCachedSessions: 100,
    _sessionCache: { map: {}, list: [] },
    [Symbol(kCapture)]: false
  },
  socketPath: undefined,
  method: 'POST',
  maxHeaderSize: undefined,
  insecureHTTPParser: undefined,
  path: '/api/event',
  _ended: false,
  res: null,
  aborted: false,
  timeoutCb: null,
  upgradeOrConnect: false,
  parser: null,
  maxHeadersCount: null,
  reusedSocket: false,
  host: 'analytics.jako.io',
  protocol: 'https:',
  _redirectable: [Circular *3],
  [Symbol(kCapture)]: false,
  [Symbol(kNeedDrain)]: false,
  [Symbol(corked)]: 0,
  [Symbol(kOutHeaders)]: [Object: null prototype] {
    accept: [ 'Accept', 'application/json, text/plain, */*' ],
    'content-type': [ 'Content-Type', 'application/json' ],
    'user-agent': [ 'User-Agent', 'axios/0.27.2' ],
    'content-length': [ 'Content-Length', 137 ],
    host: [ 'Host', 'analytics.jako.io' ]
  }
},
_currentUrl: 'https://analytics.jako.io/api/event',
[Symbol(kCapture)]: false
} }
```

Problems started after upgrading to 1.10 (from 1.9). Only thing showing is '502: Bad Gateway'

jakowenko commented 2 years ago

Hey, sorry about that. I've identified the issue and am pushing a beta build right now. If one of you is able to verify that it fixes your issue, I can merge it into the stable release.

jakowenko commented 2 years ago

There is a new beta add-on with a fix. Are you able to verify this resolves your issue?

The stable version is building now and should be ready in about an hour. I will update this when I've pushed the update to the add-on version.

jakowenko commented 2 years ago

@robert1993 your issue should be fixed with v1.10.1. Let me know if it works for you.

I believe the original issue still exists and is unrelated to what you had. So I will keep this open for now.

Haloooch commented 2 years ago

Hi, I just installed v1.10.1 and can confirm, as you suspected the issue I'm experiencing continues.

jakowenko commented 2 years ago

> Hi, I just installed v1.10.1 and can confirm, as you suspected the issue I'm experiencing continues.

I'll continue to debug. I believe it has to do with making the secrets.yml file editable in v1.7.0. But it's weird my HA add-on hasn't had any issues.

kanemari commented 2 years ago

FWIW, I have this issue too. I think it's related to permissions, as I noticed that even on v1.6 my secrets.yml file gets wiped on reboot, so I've had to hard-code the API keys into the config. I tested the 1.10 upgrade for the first time in a while today, but I'd actually been keeping it intentionally on 1.6 for some time now for this reason.

Haloooch commented 2 years ago

This may or may not be helpful: I do not use secrets.yaml. I do have one, but it's blank.

jakowenko commented 2 years ago

I wonder if I should not use the HA secrets.yml file and just create a Double Take specific one for HA installs. I'll mess with some of the permissions on my setup and see if I can at least reproduce the error.

jakowenko commented 2 years ago

Is it possible for either of you to check what the permissions of your secrets.yaml file are?

Mine currently looks like this on my Hass install.

[screenshot: secrets.yaml file permissions]

Haloooch commented 2 years ago

Mine looks the same: `-rw-r--r-- 1 root root 158 May 28 12:07 secrets.yaml`

This may or may not be relevant: while I do have a secrets.yaml, I don't actually use it.
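For anyone comparing permissions across installs, the mode and ownership can be printed in a compact form instead of eyeballing `ls -l` output (a sketch; the path is the HA default, and the `FILE` variable is just for illustration):

```shell
# Hypothetical default path; override FILE on supervised installs.
FILE="${FILE:-/config/secrets.yaml}"
if [ -e "$FILE" ]; then
  # '%a %U:%G' prints e.g. '644 root:root' (GNU/busybox stat).
  stat -c '%a %U:%G' "$FILE"
else
  echo "missing: $FILE"
fi
```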

kanemari commented 2 years ago

I don't actually have one, which is interesting, as when selecting secrets.yaml from within the Double Take config there is a line item there which says:

```yaml
# Use this file to store secrets like usernames and passwords
# Learn more at https://github.com/jakowenko/double-take/#storing-secrets
some_password: welcome
```

I could be looking in the wrong spot, since I am using a Hass.io supervised install on top of Ubuntu rather than HAOS. The path I am in is /usr/share/hassio/homeassistant/double-take, which has the config.yaml file but no secrets.yaml file.

Haloooch commented 2 years ago

@kanemari it should be in the HA config folder, so same folder as your main HA configuration.yaml

kanemari commented 2 years ago

Yep, the folder I listed is just a Docker mapping to the main Home Assistant config folder. I didn't have a secrets.yml file in there, and when I created one it still doesn't save anything to it. config.yml is in the same place; that one updates fine and has the same permissions.

jakowenko commented 2 years ago

I published a new version that should allow you to change some of the paths where files are saved. The default for SECRETS_PATH is /config, but can you try updating it to /config/double-take to see if it resolves your issue?

[screenshot: the SECRETS_PATH option in the add-on configuration]

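In add-on option form, the suggested change is a one-line edit (a sketch; SECRETS_PATH is the option name mentioned in this comment, and the value shown is the workaround rather than the default):

```yaml
# Default is /config; pointing it at the Double Take folder instead.
SECRETS_PATH: /config/double-take
```
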
Haloooch commented 2 years ago

I've upgraded to 1.11.0 and checked the config options, and it was all correct. Unfortunately, however, it's the same issue: double-take runs for a few seconds then stops.

I decided to take a drastic step and uninstalled both double-take and the double-take beta that I had installed. I renamed the associated folders (media and config), then restarted HA. I installed 1.11.0: same issue :( Here are the logs; nothing that I can see:

```
info: Double Take v1.11.0-0c3965b
verbose: { telemetry: true, auth: false, detect: { match: { save: true, base64: false, confidence: 97, purge: 36, min_area: 1000 }, unknown: { save: true, base64: false, confidence: 40, purge: 8, min_area: 0 } }, time: { timezone: 'Australia/Sydney', format: 'F' }, frigate: { attempts: { latest: 5, snapshot: 0, mqtt: true, delay: 0 }, image: { height: 500 }, labels: [ 'person' ], update_sub_labels: false, url: 'http://192.168.1.166:5000' }, mqtt: { topics: { frigate: 'frigate/events', matches: 'double-take/matches', cameras: 'double-take/cameras', homeassistant: 'homeassistant' }, host: '192.168.1.166:1883', username: 'MQTTDoubleTake', password: '********' }, logs: { level: 'debug' }, ui: { path: '', theme: 'bootstrap4-dark-blue', editor: { theme: 'nord_dark' }, logs: { lines: 500 }, pagination: { limit: 50 }, thumbnails: { quality: 95, width: 500 } }, server: { port: 3000 }, purge: { matches: 24, unknown: 24 }, detectors: { compreface: { det_prob_threshold: 0.8, timeout: 15, url: 'http://192.168.1.166:8000', key: '********' } }, storage: { path: '/config/double-take', config: { path: '/config/double-take' }, secrets: { path: '/config', extension: 'yaml' }, media: { path: '/media/double-take' }, tmp: { path: '/dev/shm/double-take' } }, version: '1.11.0-0c3965b' }
info: MQTT: connected
info: MQTT: subscribed to frigate/events, frigate/+/person/snapshot
verbose: 1653951402.948736-pq7dvh - car label not in (person)
verbose: 1653951468.347143-p58jv7 - car label not in (person)
verbose: 1653951388.338002-vxiogc - car label not in (person)
verbose: 1653951467.944499-9dwv4g - car label not in (person)
verbose: 1653951312.780486-khjrek - car label not in (person)
verbose: 1653951402.948736-pq7dvh - car label not in (person)
verbose: 1653951468.347143-p58jv7 - car label not in (person)
verbose: 1653951388.338002-vxiogc - car label not in (person)
```

Looking at the supervisor logs I see the following in relation to double-take:

```
22-05-31 08:56:09 INFO (MainThread) [supervisor.addons] Creating Home Assistant add-on data folder /data/addons/data/c7657554_double-take
22-05-31 08:56:09 INFO (SyncWorker_8) [supervisor.docker.addon] Starting build for c7657554/amd64-addon-double-take:1.11.0
22-05-31 08:56:17 INFO (SyncWorker_8) [supervisor.docker.addon] Build c7657554/amd64-addon-double-take:1.11.0 done
22-05-31 08:56:17 INFO (MainThread) [supervisor.addons] Add-on 'c7657554_double-take' successfully installed
22-05-31 08:56:28 INFO (SyncWorker_5) [supervisor.docker.addon] Starting Docker add-on c7657554/amd64-addon-double-take with version 1.11.0
22-05-31 08:57:17 INFO (SyncWorker_0) [supervisor.docker.interface] Cleaning addon_c7657554_double-take application
22-05-31 08:57:17 INFO (SyncWorker_0) [supervisor.docker.addon] Starting Docker add-on c7657554/amd64-addon-double-take with version 1.11.0
22-05-31 08:57:53 INFO (SyncWorker_4) [supervisor.docker.interface] Cleaning addon_c7657554_double-take application
22-05-31 08:57:53 INFO (SyncWorker_4) [supervisor.docker.addon] Starting Docker add-on c7657554/amd64-addon-double-take with version 1.11.0
22-05-31 08:58:03 ERROR (MainThread) [supervisor.api.ingress] Ingress error: Cannot connect to host 172.30.33.9:3000 ssl:default [Connect call failed ('172.30.33.9', 3000)]
22-05-31 08:58:04 INFO (MainThread) [supervisor.auth] Auth request from 'core_mosquitto' for 'MQTTDoubleTake'
22-05-31 08:58:04 INFO (MainThread) [supervisor.auth] Successful login for 'MQTTDoubleTake'
22-05-31 08:58:54 ERROR (MainThread) [supervisor.api.ingress] Ingress error: Cannot connect to host 172.30.33.9:3000 ssl:default [Connect call failed ('172.30.33.9', 3000)]
22-05-31 08:58:54 ERROR (MainThread) [supervisor.api.ingress] Ingress error: Cannot connect to host 172.30.33.9:3000 ssl:default [Connect call failed ('172.30.33.9', 3000)]
22-05-31 08:59:00 ERROR (MainThread) [supervisor.api.ingress] Ingress error: Cannot connect to host 172.30.33.9:3000 ssl:default [Connect call failed ('172.30.33.9', 3000)]
22-05-31 08:59:09 ERROR (MainThread) [supervisor.api.ingress] Ingress error: Cannot connect to host 172.30.33.9:3000 ssl:default [Connect call failed ('172.30.33.9', 3000)]
22-05-31 08:59:09 ERROR (MainThread) [supervisor.api.ingress] Ingress error: Cannot connect to host 172.30.33.9:3000 ssl:default [Connect call failed ('172.30.33.9', 3000)]
22-05-31 08:59:18 ERROR (MainThread) [supervisor.api.ingress] Ingress error: Cannot connect to host 172.30.33.9:3000 ssl:default [Connect call failed ('172.30.33.9', 3000)]
22-05-31 08:59:18 ERROR (MainThread) [supervisor.api.ingress] Ingress error: Cannot connect to host 172.30.33.9:3000 ssl:default [Connect call failed ('172.30.33.9', 3000)]
22-05-31 08:59:27 ERROR (MainThread) [supervisor.api.ingress] Ingress error: Cannot connect to host 172.30.33.9:3000 ssl:default [Connect call failed ('172.30.33.9', 3000)]
22-05-31 08:59:36 ERROR (MainThread) [supervisor.api.ingress] Ingress error: Cannot connect to host 172.30.33.9:3000 ssl:default [Connect call failed ('172.30.33.9', 3000)]
22-05-31 08:59:39 ERROR (MainThread) [supervisor.api.ingress] Ingress error: Cannot connect to host 172.30.33.9:3000 ssl:default [Connect call failed ('172.30.33.9', 3000)]
22-05-31 08:59:42 ERROR (MainThread) [supervisor.api.ingress] Ingress error: Cannot connect to host 172.30.33.9:3000 ssl:default [Connect call failed ('172.30.33.9', 3000)]
22-05-31 08:59:42 ERROR (MainThread) [supervisor.api.ingress] Ingress error: Cannot connect to host 172.30.33.9:3000 ssl:default [Connect call failed ('172.30.33.9', 3000)]
```

Silly question, I notice it's referencing amd64-addon-double-take, whereas I'm on an Intel chipset (running on a NUC), it couldn't have anything to do with it? Grasping at straws now :)

Haloooch commented 2 years ago

One other thing I have noticed: double-take seems to be using the following as its temp folder:

`tmp: { path: '/dev/shm/double-take' }`

Using Studio Code Server to browse the HAOS folders this folder doesn't seem to exist, I have /dev/shm/ but not the double-take folder. It is possible that's down to HAOS permissions with Studio Code just not letting me see it.
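One caveat when checking this (an observation, not from the thread): on Docker-based installs, /dev/shm is per-container unless IPC is shared, so the folder may only ever exist inside the Double Take container and be invisible from Studio Code Server. A quick existence check from a shell with access to the container's filesystem might look like the following sketch:

```shell
# Hypothetical path check; DIR is just for illustration.
DIR="${DIR:-/dev/shm/double-take}"
if [ -d "$DIR" ]; then
  ls -ld "$DIR"
else
  echo "not present: $DIR"
fi
```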

jakowenko commented 2 years ago

> Silly question, I notice it's referencing amd64-addon-double-take, whereas I'm on an Intel chipset (running on a NUC), it couldn't have anything to do with it? Grasping at straws now :)

amd64 should be the correct version since you're on a NUC. Thanks for the logs. I'm determined to figure this out, I just wish I could replicate it haha.

> One other thing I have noticed: double-take seems to be using the following as its temp folder:
>
> `tmp: { path: '/dev/shm/double-take' }`
>
> Using Studio Code Server to browse the HAOS folders this folder doesn't seem to exist, I have /dev/shm/ but not the double-take folder. It is possible that's down to HAOS permissions with Studio Code just not letting me see it.

That's a good theory, I can look into that more or create a new build without that for now for testing.

Edit: On second thought, that folder will only get created when files are processed. So if it's not even loading, chances are that folder won't exist.

So even with /config/double-take for your SECRETS_PATH it still fails, right? Did any of the files get created within /config/double-take?

Haloooch commented 2 years ago

And we are fixed!

In the new config for 1.11.0 I had the secrets set up like this: `SECRETS_PATH: /config`. My main HA secrets.yaml did live in the /config folder.

I decided to move a copy of the secrets file to /config/double-take and updated the config to match: `SECRETS_PATH: /config/double-take`.

Double-take has been running for 5 minutes and is capturing images as expected! :)
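The workaround described above boils down to two steps, which can be sketched roughly as follows (paths as in the comments above, run from the SSH & Web Terminal add-on; a sketch, not an official procedure):

```shell
# 1) Copy the secrets file into the Double Take config folder.
SRC="${SRC:-/config/secrets.yaml}"
DEST="${DEST:-/config/double-take}"
if [ -f "$SRC" ]; then
  mkdir -p "$DEST"
  cp "$SRC" "$DEST/secrets.yaml"
  echo "copied $SRC -> $DEST/secrets.yaml"
else
  echo "no secrets file at $SRC"
fi

# 2) Then set SECRETS_PATH to /config/double-take in the add-on
#    configuration and restart the add-on.
```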

kanemari commented 2 years ago

It works when the path for the secrets file is changed to /config/double-take.

I simply stopped the prod version, installed the beta, updated the path to match all the others, and started it. It works fine so far; it would have crashed out almost instantly before, so I assume it's going to be fine.

I checked that I can edit the secrets file (from within the double-take directory), and it updates properly on the filesystem as well.

jakowenko commented 2 years ago

Awesome! Sorry if I wasn't clear with what to change originally.

I'm so glad it's working for both of you now. I didn't want to change the default since it does appear to be working for some. But I will include a note in the README to let users know about it if they are getting a similar error.

Thanks for being patient on this fix!

jakowenko commented 2 years ago

Going to close since this seems to be resolved now. Feel free to open a new issue if you run into any problems.

robert1993 commented 2 years ago

That solved the problem. I’ve just been able to install and launch 1.10.1, working as expected.

On 25 May 2022 at 17:43, David Jakowenko wrote:

> There is a new beta add-on with a fix. Are you able to verify this resolves your issue?

4xle commented 2 years ago

Whether this is an issue may depend on the version of HA being run. I'm running HA Supervised, and it seems to be especially protective of the /config directory. Same issue as described above with DT v1.13.0; changing the secrets path solved it.