home-assistant / addons

➕ Docker add-ons for Home Assistant
https://home-assistant.io/hassio/
Apache License 2.0

There is a bug #1740

Closed zambujal closed 3 years ago

zambujal commented 3 years ago

The problem

Home Assistant stalls, sometimes every hour (I have many sensors), and then the MQTT Mosquitto broker disconnects everything. This is new: I had Home Assistant running for months without any problems. I can live with this weird stalling, but I can't live with everything staying blocked afterwards. Is there a routine that detects the stall and restarts Mosquitto (even if it still appears to be running) and the other add-ons? Is that possible?
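
One way to approach the routine asked about here is an external watchdog that probes the broker's TCP port and triggers a restart after several consecutive failures. The following is only a minimal sketch, assuming the broker listens on port 1883 (as the Mosquitto log in this report shows). `restart_broker` is a hypothetical callback, because how an add-on is actually restarted depends on the installation, and the failure threshold is an illustrative value.

```python
import socket
from typing import Callable


def broker_reachable(host: str, port: int = 1883, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to the broker port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


class Watchdog:
    """Calls restart_broker after max_failures consecutive failed probes."""

    def __init__(self, probe: Callable[[], bool],
                 restart_broker: Callable[[], None],
                 max_failures: int = 3) -> None:
        self.probe = probe
        # Hypothetical callback: the real restart mechanism is installation-specific.
        self.restart_broker = restart_broker
        self.max_failures = max_failures
        self.failures = 0

    def tick(self) -> bool:
        """Run one probe; return True if a restart was triggered."""
        if self.probe():
            self.failures = 0
            return False
        self.failures += 1
        if self.failures >= self.max_failures:
            self.failures = 0
            self.restart_broker()
            return True
        return False
```

A caller would invoke `tick()` on a timer, for example once a minute, with `probe` set to something like `lambda: broker_reachable("192.168.1.2")` (the address is a placeholder). Note that a plain TCP probe only catches a broker that no longer accepts connections; a broker that accepts connections but has stopped serving clients would need a probe that publishes and reads back a test message.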

It's not the hardware... HASS continues to work nicely, apparently until the next hiccup. Mind you, I have changed everything except the Raspberry Pi.

Environment

Raspberry Pi 3 B+. Latest Hass.io, with a USB hard disk as storage.

Problem-relevant configuration

Traceback/Error logs

This does not affect only MQTT; the Samba server is affected too. After the stall:

desktop-jh838kk (ipv6:fe80::6995:b974:61c2:f596:49227) closed connection to service config
Registered MSG_REQ_POOL_USAGE
Registered MSG_REQ_POOL_USAGE
Could not find child 421 -- ignoring
Registered MSG_REQ_POOL_USAGE
Registered MSG_REQ_POOL_USAGE
Could not find child 633 -- ignoring
Registered MSG_REQ_POOL_USAGE
Registered MSG_REQ_POOL_USAGE
Could not find child 774 -- ignoring
Registered MSG_REQ_POOL_USAGE
Registered MSG_REQ_POOL_USAGE
Could not find child 778 -- ignoring
Registered MSG_REQ_POOL_USAGE
Registered MSG_REQ_POOL_USAGE
Could not find child 780 -- ignoring
Registered MSG_REQ_POOL_USAGE
Registered MSG_REQ_POOL_USAGE
Could not find child 782 -- ignoring
Registered MSG_REQ_POOL_USAGE
Registered MSG_REQ_POOL_USAGE
Could not find child 784 -- ignoring
Registered MSG_REQ_POOL_USAGE
Registered MSG_REQ_POOL_USAGE
Could not find child 786 -- ignoring
Registered MSG_REQ_POOL_USAGE
Registered MSG_REQ_POOL_USAGE
Could not find child 788 -- ignoring
Registered MSG_REQ_POOL_USAGE
Registered MSG_REQ_POOL_USAGE
Could not find child 790 -- ignoring
Registered MSG_REQ_POOL_USAGE
Registered MSG_REQ_POOL_USAGE
Could not find child 792 -- ignoring
Registered MSG_REQ_POOL_USAGE
Registered MSG_REQ_POOL_USAGE
Could not find child 794 -- ignoring
Registered MSG_REQ_POOL_USAGE
Registered MSG_REQ_POOL_USAGE
Could not find child 796 -- ignoring
root opened file BBC.Horizon.2013.The.Truth.About.Meteors.720p.HDTV.x264.AAC.MVGroup.org.mkv read=No write=No (numopen=5)
root closed file BBC.Horizon.2013.The.Truth.About.Meteors.720p.HDTV.x264.AAC.MVGroup.org.mkv (numopen=4) NT_STATUS_OK
Registered MSG_REQ_POOL_USAGE
Registered MSG_REQ_POOL_USAGE
Could not find child 798 -- ignoring
Registered MSG_REQ_POOL_USAGE
Registered MSG_REQ_POOL_USAGE
Could not find child 800 -- ignoring
Registered MSG_REQ_POOL_USAGE
Registered MSG_REQ_POOL_USAGE
Could not find child 802 -- ignoring
Registered MSG_REQ_POOL_USAGE
Registered MSG_REQ_POOL_USAGE
Could not find child 804 -- ignoring
Registered MSG_REQ_POOL_USAGE
Registered MSG_REQ_POOL_USAGE
Could not find child 806 -- ignoring
Registered MSG_REQ_POOL_USAGE
Registered MSG_REQ_POOL_USAGE
Could not find child 808 -- ignoring
root opened file BBC.Horizon.2013.The.Truth.About.Meteors.720p.HDTV.x264.AAC.MVGroup.org.mkv read=No write=No (numopen=5)
root closed file BBC.Horizon.2013.The.Truth.About.Meteors.720p.HDTV.x264.AAC.MVGroup.org.mkv (numopen=4) NT_STATUS_OK
Registered MSG_REQ_POOL_USAGE
Registered MSG_REQ_POOL_USAGE
Could not find child 810 -- ignoring
root opened file BBC.Horizon.2013.The.Truth.About.Meteors.720p.HDTV.x264.AAC.MVGroup.org.mkv read=No write=No (numopen=5)
root closed file BBC.Horizon.2013.The.Truth.About.Meteors.720p.HDTV.x264.AAC.MVGroup.org.mkv (numopen=4) NT_STATUS_OK
root closed file BBC.Horizon.2013.The.Truth.About.Meteors.720p.HDTV.x264.AAC.MVGroup.org.mkv (numopen=3) NT_STATUS_OK
Registered MSG_REQ_POOL_USAGE
Registered MSG_REQ_POOL_USAGE
Could not find child 812 -- ignoring
idmap range not specified for domain '*'
desktop-jh838kk (ipv6:fe80::6995:b974:61c2:f596:49227) connect to service config initially as user root (uid=0, gid=0) (pid 294)
root opened file BBC.Horizon.2013.The.Truth.About.Meteors.720p.HDTV.x264.AAC.MVGroup.org.mkv read=Yes write=No (numopen=4)
desktop-jh838kk (ipv6:fe80::6995:b974:61c2:f596:49227) closed connection to service config
root closed file BBC.Horizon.2013.The.Truth.About.Meteors.720p.HDTV.x264.AAC.MVGroup.org.mkv (numopen=3) NT_STATUS_OK
Registered MSG_REQ_POOL_USAGE
Registered MSG_REQ_POOL_USAGE
Could not find child 815 -- ignoring
Failed to fetch record!
pcap cache not loaded
Failed to fetch record!
pcap cache not loaded
Failed to fetch record!
pcap cache not loaded
idmap range not specified for domain '*'
desktop-jh838kk (ipv6:fe80::6995:b974:61c2:f596:49227) connect to service config initially as user root (uid=0, gid=0) (pid 294)
desktop-jh838kk (ipv6:fe80::6995:b974:61c2:f596:49227) closed connection to service config
idmap range not specified for domain '*'
desktop-jh838kk (ipv6:fe80::6995:b974:61c2:f596:49227) connect to service config initially as user root (uid=0, gid=0) (pid 294)

For Mosquitto, the log:
[19:44:03] INFO: Setup mosquitto configuration
[19:44:03] WARNING: SSL not enabled - No valid certs found!
[19:44:03] INFO: No local user available
[19:44:05] INFO: Initialize Hass.io Add-on services
[19:44:05] INFO: Initialize Home Assistant discovery
[19:44:05] INFO: Start Mosquitto daemon
1609440245: mosquitto version 1.6.3 starting
1609440245: Config loaded from /etc/mosquitto.conf.
1609440245: Loading plugin: /usr/share/mosquitto/auth-plug.so
1609440245: |-- *** auth-plug: startup
1609440245:  ├── Username/password checking enabled.
1609440245:  ├── TLS-PSK checking enabled.
1609440245:  └── Extended authentication not enabled.
1609440245: Opening ipv4 listen socket on port 1883.
1609440245: Opening ipv6 listen socket on port 1883.
1609440245: Opening websockets listen socket on port 1884.
1609440245: Warning: Mosquitto should not be run as root/administrator.
1609440245: New connection from 192.168.1.10 on port 1883.
[INFO] found powervr on Home Assistant
1609440247: New client connected from 192.168.1.10 as ESP8266Client (p2, c1, k15, u'powervr').
1609440247: New connection from 172.30.33.0 on port 1883.
1609440247: New client connected from 172.30.33.0 as mqttjs_909c65e5 (p2, c1, k60, u'powervr').
1609440249: New connection from 172.30.32.1 on port 1883.
[INFO] found homeassistant on local database
1609440250: New client connected from 172.30.32.1 as 1fqRhcy7BQ6CiSQBsQwtVb (p2, c1, k60, u'homeassistant').
1609440254: New connection from 192.168.1.11 on port 1883.
1609440254: New client connected from 192.168.1.11 as DVES_26433B (p2, c1, k30, u'powervr').
/bin/auth_srv.sh: line 15: echo: write error: Broken pipe
/bin/auth_srv.sh: line 15: echo: write error: Broken pipe
/bin/auth_srv.sh: line 15: echo: write error: Broken pipe
1609441065: Client 1fqRhcy7BQ6CiSQBsQwtVb has exceeded timeout, disconnecting.

And then nothing...
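
The numeric prefixes on the Mosquitto lines are Unix epoch seconds, so the time between two events can be worked out directly. A small sketch, using two lines copied from the log above; `parse_epoch_prefix` is a helper name made up for this example:

```python
from datetime import datetime, timezone


def parse_epoch_prefix(line: str) -> int:
    """Extract the leading epoch-seconds prefix from a Mosquitto log line."""
    prefix, _, _ = line.partition(":")
    return int(prefix)


connected = ("1609440250: New client connected from 172.30.32.1 as "
             "1fqRhcy7BQ6CiSQBsQwtVb (p2, c1, k60, u'homeassistant').")
timed_out = ("1609441065: Client 1fqRhcy7BQ6CiSQBsQwtVb has exceeded "
             "timeout, disconnecting.")

start = parse_epoch_prefix(connected)
end = parse_epoch_prefix(timed_out)

# Human-readable connect time in UTC, then the gap in seconds.
print(datetime.fromtimestamp(start, tz=timezone.utc).isoformat())
print(f"seconds between connect and timeout: {end - start}")
```

For these two lines the gap is 815 seconds, so the Home Assistant client was dropped roughly 13.5 minutes after it connected.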

Additional information

I do not have much programming knowledge, just a little Arduino code.

zambujal commented 3 years ago

The problem is that after the stall everything behaves strangely and I have to restart everything. I have a 10 A supply for the hard disk and the Raspberry Pi (the hard disk has its own 5 V power supply).

Many thanks. This can't be just me. Maybe it happens when saving history or something similar. My hard drive is slow (not an SSD), which makes the problem even worse. It does work, but it does not cope well with multitasking, something along the lines of everything wanting the same thing at the same time.

zambujal commented 3 years ago

Here is my configuration.yaml:

# Configure a default setup of Home Assistant (frontend, api, etc) this was working ok before...
default_config:

# Text to speech
tts:
  - platform: google_translate

group: !include groups.yaml
automation: !include automations.yaml
script: !include scripts.yaml
scene: !include scenes.yaml
octoprint:
  host: 192.168.1.39
  api_key: xxxxxxxxxxxxxxxxxxxxxxxx
sensor:
  - platform: mqtt
    name: "Temperature sump"
    state_topic: "sensor/Temperatura Sump"
    unit_of_measurement: '°C'
  - platform: mqtt
    name: "Humidade sump"
    state_topic: "sensor/Humidade sump"
    unit_of_measurement: '%'
    icon: mdi:water-percent
  - platform: mqtt
    name: "Temperatura água"
    state_topic: "sensor/Temperatura água"
    unit_of_measurement: '°C'
    icon: mdi:coolant-temperature
  - platform: mqtt
    name: "Estado"
    state_topic: "sensor/estado"
  - platform: mqtt
    name: "Potência Violeta"
    state_topic: "sensor/Potência luz"
    unit_of_measurement: '%'
  - platform: mqtt
    name: "Potência Branco"
    state_topic: "sensor/Potência luz2"
    unit_of_measurement: '%'
  - platform: mqtt
    name: "Potência Azul"    
    state_topic: "sensor/Potência luz3"
    unit_of_measurement: '%'
  - platform: mqtt
    name: "Horas"
    state_topic: "sensor/Horas"
  - platform: mqtt
    name: "Minutos"
    state_topic: "sensor/Minutos"
  - platform: mqtt
    name: "Segundos"    
    state_topic: "sensor/Segundos"
  - platform: mqtt
    name: "Bomba de reposição"
    state_topic: "sensor/Bomba de reposição"
  - platform: mqtt
    name: "Frasco escumador"    
    state_topic: "sensor/frasco escumador"
  - platform: mqtt
    name: "pH água"    
    state_topic: "sensor/pH água"
    unit_of_measurement: 'º'
    icon: mdi:test-tube
  - platform: enphase_envoy
    name: enphase_new
#     username: xxxxx
#     password: xxxxx
    ip_address: 192.168.1.15
    scan_interval: 5
    monitored_conditions:
      - production
      - consumption
      - daily_production
      - seven_days_production
      - lifetime_production
      - inverters
      # Calculate Remaining Power
  - platform: template
    sensors:
      remaining_power:
        value_template: >
          {{ '%0.1f' | format(states('sensor.enphase_newenvoy_current_energy_consumption') | float - 
                              states('sensor.enphase_newenvoy_current_energy_production') | float) }}
        unit_of_measurement: 'W'
        friendly_name: Energia importada

zambujal commented 3 years ago

I will post more on this later...

When I restart Mosquitto, all is fine and dandy... until 1-2 hours later...

[20:32:34] INFO: Setup mosquitto configuration
[20:32:34] WARNING: SSL not enabled - No valid certs found!
[20:32:34] INFO: No local user available
[20:32:35] INFO: Initialize Hass.io Add-on services
[20:32:35] INFO: Initialize Home Assistant discovery
[20:32:35] INFO: Start Mosquitto daemon
1609443155: mosquitto version 1.6.3 starting
1609443155: Config loaded from /etc/mosquitto.conf.
1609443155: Loading plugin: /usr/share/mosquitto/auth-plug.so
1609443155: ├── Username/password checking enabled.
1609443155: ├── TLS-PSK checking enabled.
1609443155: └── Extended authentication not enabled.
1609443155: |-- *** auth-plug: startup
1609443155: Opening ipv4 listen socket on port 1883.
1609443155: Opening ipv6 listen socket on port 1883.
1609443155: Opening websockets listen socket on port 1884.
1609443156: Warning: Mosquitto should not be run as root/administrator.
1609443156: New connection from 192.168.1.10 on port 1883.
[INFO] found xxxx on Home Assistant
1609443157: New client connected from 192.168.1.10 as ESP8266Client (p2, c1, k15, u'xxxxx').
1609443157: New connection from 172.30.33.0 on port 1883.
1609443157: New client connected from 172.30.33.0 as mqttjs_909c65e5 (p2, c1, k60, u'xxxxx').

zambujal commented 3 years ago

Maybe it's just the poor old Raspberry Pi... lack of memory?

I set up a virtual machine with Home Assistant... nice ;) It works the way I thought Home Assistant should run! :D This machine is always on, so why not. Though I lose 10%, maybe I can slow this down... Many thanks for any response.

zambujal commented 3 years ago

Home Assistant worked fine with the i386 kernel... until the same thing happened. I watched in the taskbar what was going on: VirtualBox was using 200 MB/s of disk, saving something (I don't know what) or reading. Weird, since it was not doing anything!

zambujal commented 3 years ago

Now I have something: VirtualBox_Hassos_01_01_2021_04_04_32

zambujal commented 3 years ago

So this really is Mosquitto doing some nasty stuff...
1609467604: New client connected from 172.30.33.2 as mqttjs_424b4bcf (p2, c1, k60, u'powervr').
1609469208: Saving in-memory database to /data/mosquitto.db.
1609469862: Client mqttjs_424b4bcf has exceeded timeout, disconnecting.
/bin/auth_srv.sh: line 15: echo: write error: Broken pipe
/bin/auth_srv.sh: line 15: echo: write error: Broken pipe
/bin/auth_srv.sh: line 15: echo: write error: Broken pipe
/bin/auth_srv.sh: line 15: echo: write error: Broken pipe
/bin/auth_srv.sh: line 15: echo: write error: Broken pipe
/bin/auth_srv.sh: line 15: echo: write error: Broken pipe

I hope the guard (watchdog) will relaunch it next time...

zambujal commented 3 years ago

I was able to get the other MQTT add-on into service, together with zigbee2mqtt (now back on the Raspberry Pi 3 B+). It took several tweaks, but I got it to work. If this keeps working, then do we have a problem with the newer Mosquitto MQTT add-on, a memory leak perhaps? I reproduced the same problem on two different systems, i386 and Raspberry Pi 3 B+, so it seems to be consistent...

Please keep the MQTT and web server add-on available; it's the difference between Home Assistant working and not working.

And it gives us a choice! I am not a programmer; maybe there are security issues with this older MQTT server, but it works fine for me.

The stall still happens, roughly every 30 minutes, maybe while recording my entities' data. Yes, I have a lot! ;) (That could be my "problem".) But the older MQTT add-on holds up... ;)

zambujal commented 3 years ago

Am I being attacked by a hacker?!? This started to happen after I posted my system on a Portuguese maker forum (DIY stuff)... I wonder if it's only me. I removed my HASS from the internet; it is now only accessible from inside my network. That forum is where I came across Home Assistant in the first place. I wonder... too many coincidences...

frenck commented 3 years ago

image ☝️ From your logs: Out of memory

You are putting more on your Pi than it can handle; it runs out of memory, and thus things will start to fail.
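
To check whether memory really is the limit, the available memory can be read from `/proc/meminfo` on a Linux host. A minimal sketch, assuming the standard Linux `/proc/meminfo` layout; the function names and the 100 MB threshold are illustrative choices, not anything taken from the add-on:

```python
from typing import Dict


def parse_meminfo(text: str) -> Dict[str, int]:
    """Parse /proc/meminfo-style text into {field name: numeric value}.

    Most fields are reported in kB; a few (such as HugePages_Total) are counts.
    """
    info: Dict[str, int] = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def low_on_memory(info: Dict[str, int], threshold_kb: int = 100_000) -> bool:
    """True when MemAvailable is below the (illustrative) threshold in kB."""
    return info["MemAvailable"] < threshold_kb
```

On the machine itself this would be used as `low_on_memory(parse_meminfo(open("/proc/meminfo").read()))`, ideally sampled around the time of a stall so the reading can be compared with the broker log.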

zambujal commented 3 years ago

It's closed, but I hope I can still comment on this... (I also sent you an email.) Thanks for the reply, and a happy new year!

This weird behaviour has stopped: it has been running for 11 hours now without a hiccup, ever since I removed the port on the router that exposed my Home Assistant to the outside world. So now my Home Assistant is local only.

I tested with an i386 virtual machine with 8 GB of RAM (that is the virtual machine; the host has a lot more RAM), so it was not because of memory. The hiccups were frequent there too, sometimes with 4-5 minutes of unresponsiveness.

Today, for testing purposes, I even watched a movie (MKV, 2016p) on my PC from the Raspberry Pi's Samba server. Sometimes a hiccup occurs, but they are shorter; most of the time there are no hiccups. (This is what I did to test the behaviour of the Raspberry Pi.)

I am still using the old MQTT add-on that you made, I believe... it works!

That log is from the virtual machine with 8 GB, not from the Raspberry Pi. I used the PC to test whether I needed to buy a NUC PC... nope, it behaves similarly, so I returned to the Raspberry Pi, and it's fine now (cut off from the outside).

zambujal commented 3 years ago

Still OK, even with more data and more entities... :D Double the amount!

zambujal commented 3 years ago

I tried the normal Mosquitto client, and it took 2 minutes to drop off the line, so it was not hackers. Now I know why Home Assistant sometimes forced me to reboot my Raspberry Pi: it was always the Mosquitto client.

The deprecated client is a lot more difficult to configure (I don't know why), but it does the job: 2 days without dropping, with 3x the entities, and I am making some visual enhancements, like using images in Home Assistant. I believe this problem is old; I saw a ton of people with the same thing... but it works sometimes... weird. Home Assistant doesn't crash, and neither does Mosquitto; it lingers on... this one is a lot more robust.

zambujal commented 3 years ago

The biggest problem is not that it fails:

/bin/auth_srv.sh: line 14: echo: write error: Broken pipe
/bin/auth_srv.sh: line 14: echo: write error: Broken pipe
1547936026: Socket error on client xxxxx, disconnecting.

The biggest problem is that nothing reconnects afterwards, and that is a problem.
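
On the client side, a common remedy for this is a reconnect loop with a growing delay between attempts. The following is only a sketch of that idea, not how any of the clients in this thread are implemented. `connect` is a hypothetical callable standing in for whatever a client uses to open its broker connection, it is assumed to raise `OSError` on failure, and the delay values are illustrative.

```python
import time
from typing import Callable, Iterator


def backoff_delays(base: float = 1.0, factor: float = 2.0,
                   cap: float = 60.0) -> Iterator[float]:
    """Yield base, base*factor, ... with each delay capped at cap seconds."""
    delay = base
    while True:
        yield min(delay, cap)
        delay = min(delay * factor, cap)


def reconnect(connect: Callable[[], None], max_attempts: int = 10,
              sleep: Callable[[float], None] = time.sleep) -> int:
    """Retry connect() until it succeeds; return the number of attempts made.

    Re-raises the last OSError if every attempt fails.
    """
    delays = backoff_delays()
    for attempt in range(1, max_attempts + 1):
        try:
            connect()
            return attempt
        except OSError:
            if attempt == max_attempts:
                raise
            # Wait a little longer after each failed attempt.
            sleep(next(delays))
    raise ValueError("max_attempts must be at least 1")
```

The `sleep` parameter exists so the loop can be exercised without real waiting; a device would leave it at the default and call `reconnect` whenever it notices that its connection has dropped.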