home-assistant / core

:house_with_garden: Open source home automation that puts local control and privacy first.
https://www.home-assistant.io
Apache License 2.0
73.51k stars 30.71k forks source link

Recorder Issues with External (Remote) MySQL #33546

Closed nilsmau closed 4 years ago

nilsmau commented 4 years ago

The problem

Recorder connects to DB (hosted on web server) at Hassio start-up and creates required tables (events, recorder_runs, schema_changes, states) And creates entries (1 each) into: recorder_runs schema_changes After that, no further entry is created. And error MySQL 2013 (see below) appears in log. Restart of Home Assistant or System (HassOS) shows no effect. It used to work before, DB usually contained appr. 4 weeks of history.

Environment

Problem-relevant configuration.yaml

info: all IP addresses, URLs, Names etc. replaced with placeholders all referenced .yaml files are empty

homeassistant:
  # Name of the location where Home Assistant is running
  name: Home
  # Location required to calculate the time the sun rises and sets
  latitude: 2.471400
  longitude: 3.609040
  # Impacts weather/sunrise data (altitude above sea level in meters)
  elevation: 0
  # metric for Metric, imperial for Imperial
  unit_system: metric
  # Pick yours from here: http://en.wikipedia.org/wiki/List_of_tz_database_time_zones
  time_zone: Europe/Amsterdam
  # Customization file
  customize: !include customize.yaml
  whitelist_external_dirs:
    - /share
# Show links to resources in log and frontend
# introduction:

# Enables the frontend
frontend:
  javascript_version: es5

# Enables configuration UI
config:

mobile_app:

recorder:
  commit_interval: 10
  db_url: mysql://user:password@url:3306/DB?charset=utf8 
  include:
    domains:
      - light
      - sensor
      - switch
      - octoprint
      - person
      - binary_sensor

http:
  # Secrets are defined in the file secrets.yaml
  # api_password: !secret http_password
  # Uncomment this if you are using SSL/TLS, running in Docker container, etc.
  # base_url: 
  # ssl_certificate: /ssl/fullchain.pem
  # ssl_key: /ssl/privkey.pem

# Checks for available updates
# Note: This component will send some information about your system to
# the developers to assist with development of Home Assistant.
# For more information, please see:
# https://home-assistant.io/blog/2016/10/25/explaining-the-updater/
updater:
  # Optional, allows Home Assistant developers to focus on popular components.
  # include_used_components: true

# Discover some devices automatically
discovery:
  ignore:
    - apple_tv

# Allows you to issue voice commands from the frontend in enabled browsers
conversation:

logger:
  default: warn
  logs:
    homeassistant.components.recorder: debug
stream:

ffmpeg:

# Enables support for tracking state changes over time
history:

# View all events in a logbook
logbook:

system_health:

octoprint:
  host: IPADDRESS
  api_key: APIKEY
  bed: true
  number_of_tools: 2

camera:
  - platform: mjpeg
    name: OctoPrint
    still_image_url: http:/IPADDRESS/webcam/?action=snapshot
    mjpeg_url: http://IPADDRESS/webcam/?action=stream
  - platform: ffmpeg
    name: Garderobencam
    input: -rtsp_transport tcp -i rtsp://user:password@IP:8554/unicast

ios:

# Enables a map showing the location of tracked devices
map:

# Track the sun
sun:

# Weather prediction
sensor:
  - platform: yr
  - platform: systemmonitor
    resources:
      - type: disk_use_percent
        arg: /home
      - type: disk_use
        arg: /home
      - type: memory_free
      - type: processor_use
  - platform: template
    sensors:
      washing_machine_status_2:
        value_template: '{{ states.input_select.washing_machine_status.state}}'
        friendly_name: 'Washing Machine Status'
      wasserverbrauch:
        value_template: '{{(states("sensor.wasserpumpe_an") | float *2520 )| round(1)}}'
        friendly_name: "Wasserverbrauch"
        unit_of_measurement: "ltr"
        icon_template: mdi:water
  - platform: history_stats
    name: Wasserpumpe An
    entity_id: binary_sensor.wasserpumpe_running
    state: 'on'
    type: time
    start: '{{ now().replace(hour=0).replace(minute=0).replace(second=0) }}'
    end: '{{ now() }}'

binary_sensor:
  - platform: template
    sensors:
      washing_machine_is_finished:
        entity_id:
          - sensor.washing_machine_status_2
        device_class: occupancy
        friendly_name: Waschmaschine fertig
        value_template: '{{ is_state("input_select.washing_machine_status", "Clean") }}'
      wasserpumpe_running:
        icon_template: mdi:water-pump
        value_template: '{{ states("sensor.wasserpumpe_power")|float > 0 }}'

# Text to speech
tts:
  - platform: google_translate
    service_name: google_say

# Cloud
# cloud:

# Hello World component
#load_locations:

switch:
- platform: command_line
  switches:
    shut_down_octoprint:
      friendly_name: "Shut Down Octoprint"
      command_on: "curl -s http://IP/api/system/commands/core/shutdown -d -H 'Content-Type: application/json' -H 'X-Api-Key: APIKEY'"
      command_off: ""
      value_template: "{{ not is_state('sensor.octoprint_current_state', 'printing') }}"

homekit:
  auto_start: False
  filter:
    include_entities:
      - sensor.temperature_gewaechshaus
      - sensor.temperature_schlafzimmer
      - sensor.temperature_waschkeller
      - sensor.temperature_wohnzimmer
      - sensor.temperature_3d_printer_smoke_sensor
      - sensor.humidity_gewaechshaus
      - sensor.humidity_schlafzimmer
      - sensor.humidity_waschkeller
      - sensor.humidity_wohnzimmer
      - binary_sensor.octoprint_printing
      - binary_sensor.presence_garderobe
      - binary_sensor.presence_gaestezimmer
      - binary_sensor.washing_machine_is_finished
      - binary_sensor.3d_printer_smoke_sensor
      - light.kartell
      - light.schreibtisch
      - light.schlafzimmer_stehlampe
      - light.bunkerlampe_1
      - light.bunkerlampe_2
      - light.switch_1
      - light.switch_9
      - light.wohnzimmerlampe
      - light.level
      - light.level_2
      - light.fibaro_system_fgd212_dimmer_2_level_4
      - light.fibaro_system_fgd212_dimmer_2_level_6
      - light.ananas
      - light.licht_kueche_level
      - switch.papagei
      - switch.schiff_ahoi
      - switch.schwibbogen
      - climate.test_thermostat

person:
  - name: personname
    id: id
    device_trackers:
      - device_tracker.name
      - device_tracker.name_2
  - name: name2
    id: name2
    device_trackers:
      - device_tracker.personname2

input_select:
  washing_machine_status:
    name: Status Waschmaschine
    options:
      - Idle
      - Running
      - Finishing
      - Clean
    initial: Idle

wemo:

mqtt:
  broker: 192.168.2.101 # <- this is Hassio's IP address

device_tracker:
  - platform: fritz
    host: IP
    consider_home: 30
    new_device_defaults:
      track_new_devices: true

zwave:
  usb_path: /dev/ttyACM0
  network_key: "NETWORKKEY"

weather:
  - platform: met

group: !include groups.yaml
automation: !include automations.yaml
script: !include scripts.yaml

Traceback/Error logs

020-04-02 20:32:30 ERROR (Recorder) [homeassistant.components.recorder.util] Error executing query: (MySQLdb._exceptions.OperationalError) (2013, 'Lost connection to MySQL server during query')
[SQL: SELECT events.event_id AS events_event_id, events.event_type AS events_event_type, events.event_data AS events_event_data, events.origin AS events_origin, events.time_fired AS events_time_fired, events.created AS events_created, events.context_id AS events_context_id, events.context_user_id AS events_context_user_id 
FROM events 
 LIMIT %s]
[parameters: (1,)]
(Background on this error at: http://sqlalche.me/e/e3q8)
2020-04-02 20:32:30 DEBUG (SyncWorker_4) [homeassistant.components.recorder.util] converting 0 rows to native objects took 0.156105s
2020-04-02 20:32:33 DEBUG (SyncWorker_15) [homeassistant.components.recorder.util] converting 0 rows to native objects took 0.091978s

Additional information

bdraco commented 4 years ago

@nilsmau 'Lost connection to MySQL server during query'

Is there anything interesting in your mysql server logs?

nilsmau commented 4 years ago

@bdraco I am unable to access the MySQL server logs (hoster says its impossible for a shared server - only logging errors). Using the same (Hassio) user credentials for logging in and querying the DB, directly or remotely shows no issue. Waiting for further feedback from hoster, as I now specifically asked for error logs. ! Bear with me...

nilsmau commented 4 years ago

Hassio reboot didn't change anything. More errors appear in log today. maybe those help:

[SQL: SELECT states.state_id AS states_state_id, states.domain AS states_domain, states.entity_id AS states_entity_id, states.state AS states_state, states.attributes AS states_attributes, states.event_id AS states_event_id, states.last_changed AS states_last_changed, states.last_updated AS states_last_updated, states.created AS states_created, states.context_id AS states_context_id, states.context_user_id AS states_context_user_id 
FROM states 
WHERE states.last_updated >= %s AND states.last_updated < %s AND states.entity_id IN (%s) ORDER BY states.last_updated DESC 
 LIMIT %s]
[parameters: (datetime.datetime(2020, 4, 2, 18, 31, 21, 129687, tzinfo=<UTC>), datetime.datetime(2020, 4, 2, 22, 0, 0, 12415, tzinfo=<UTC>), 'binary_sensor.wasserpumpe_running', 1)]
(Background on this error at: http://sqlalche.me/e/e3q8)
2020-04-03 05:55:56 ERROR (SyncWorker_2) [homeassistant.components.recorder.util] Error executing query: (sqlalchemy.exc.InvalidRequestError) Can't reconnect until invalid transaction is rolled back
[SQL: SELECT states.state_id AS states_state_id, states.domain AS states_domain, states.entity_id AS states_entity_id, states.state AS states_state, states.attributes AS states_attributes, states.event_id AS states_event_id, states.last_changed AS states_last_changed, states.last_updated AS states_last_updated, states.created AS states_created, states.context_id AS states_context_id, states.context_user_id AS states_context_user_id 
FROM states 
WHERE states.last_updated >= %s AND states.last_updated < %s AND states.entity_id IN (%s) ORDER BY states.last_updated DESC 
 LIMIT %s]
[parameters: [immutabledict({})]]
2020-04-03 05:55:56 ERROR (SyncWorker_2) [homeassistant.components.recorder.util] Error executing query: (sqlalchemy.exc.InvalidRequestError) Can't reconnect until invalid transaction is rolled back
[SQL: SELECT states.state_id AS states_state_id, states.domain AS states_domain, states.entity_id AS states_entity_id, states.state AS states_state, states.attributes AS states_attributes, states.event_id AS states_event_id, states.last_changed AS states_last_changed, states.last_updated AS states_last_updated, states.created AS states_created, states.context_id AS states_context_id, states.context_user_id AS states_context_user_id 
FROM states 
WHERE states.last_updated >= %s AND states.last_updated < %s AND states.entity_id IN (%s) ORDER BY states.last_updated DESC 
 LIMIT %s]
[parameters: [immutabledict({})]]
2020-04-03 05:55:56 ERROR (SyncWorker_2) [homeassistant.components.recorder.util] Error executing query: (sqlalchemy.exc.InvalidRequestError) Can't reconnect until invalid transaction is rolled back
[SQL: SELECT states.state_id AS states_state_id, states.domain AS states_domain, states.entity_id AS states_entity_id, states.state AS states_state, states.attributes AS states_attributes, states.event_id AS states_event_id, states.last_changed AS states_last_changed, states.last_updated AS states_last_updated, states.created AS states_created, states.context_id AS states_context_id, states.context_user_id AS states_context_user_id 
FROM states 
WHERE states.last_updated >= %s AND states.last_updated < %s AND states.entity_id IN (%s) ORDER BY states.last_updated DESC 
 LIMIT %s]
[parameters: [immutabledict({})]]
2020-04-03 05:55:56 ERROR (MainThread) [homeassistant.helpers.entity] Update for sensor.wasserpumpe_an fails
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1171, in _execute_context
    conn = self._revalidate_connection()
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 457, in _revalidate_connection
    "Can't reconnect until invalid "
sqlalchemy.exc.InvalidRequestError: Can't reconnect until invalid transaction is rolled back

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/helpers/entity.py", line 279, in async_update_ha_state
    await self.async_device_update()
  File "/usr/src/homeassistant/homeassistant/helpers/entity.py", line 476, in async_device_update
    await self.hass.async_add_executor_job(self.update)
  File "/usr/local/lib/python3.7/concurrent/futures/thread.py", line 57, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/usr/src/homeassistant/homeassistant/components/history_stats/sensor.py", line 214, in update
    self.hass, start, end, str(self._entity_id)
  File "/usr/src/homeassistant/homeassistant/components/history/__init__.py", line 114, in state_changes_during_period
    return states_to_json(hass, states, start_time, entity_ids)
  File "/usr/src/homeassistant/homeassistant/components/history/__init__.py", line 240, in states_to_json
    for state in get_states(hass, start_time, entity_ids, filters=filters):
  File "/usr/src/homeassistant/homeassistant/components/history/__init__.py", line 214, in get_states
    for state in execute(query)
  File "/usr/src/homeassistant/homeassistant/components/recorder/util.py", line 66, in execute
    row for row in (row.to_native() for row in qry) if row is not None
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/orm/query.py", line 3389, in __iter__
    return self._execute_and_instances(context)
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/orm/query.py", line 3414, in _execute_and_instances
    result = conn.execute(querycontext.statement, self._params)
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 982, in execute
    return meth(self, multiparams, params)
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/sql/elements.py", line 293, in _execute_on_connection
    return connection._execute_clauseelement(self, multiparams, params)
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1101, in _execute_clauseelement
    distilled_params,
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1176, in _execute_context
    e, util.text_type(statement), parameters, None, None
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1476, in _handle_dbapi_exception
    util.raise_from_cause(sqlalchemy_exception, exc_info)
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/util/compat.py", line 398, in raise_from_cause
    reraise(type(exception), exception, tb=exc_tb, cause=cause)
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/util/compat.py", line 152, in reraise
    raise value.with_traceback(tb)
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1171, in _execute_context
    conn = self._revalidate_connection()
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 457, in _revalidate_connection
    "Can't reconnect until invalid "
sqlalchemy.exc.StatementError: (sqlalchemy.exc.InvalidRequestError) Can't reconnect until invalid transaction is rolled back
[SQL: SELECT states.state_id AS states_state_id, states.domain AS states_domain, states.entity_id AS states_entity_id, states.state AS states_state, states.attributes AS states_attributes, states.event_id AS states_event_id, states.last_changed AS states_last_changed, states.last_updated AS states_last_updated, states.created AS states_created, states.context_id AS states_context_id, states.context_user_id AS states_context_user_id 
FROM states 
WHERE states.last_updated >= %s AND states.last_updated < %s AND states.entity_id IN (%s) ORDER BY states.last_updated DESC 
 LIMIT %s]
[parameters: [immutabledict({})]]

and it is still doing

2020-04-03 05:56:27 DEBUG (SyncWorker_12) [homeassistant.components.recorder.util] converting 0 rows to native objects took 0.155605s
2020-04-03 05:56:27 DEBUG (SyncWorker_12) [homeassistant.components.recorder.util] converting 0 rows to native objects took 0.118823s
nilsmau commented 4 years ago

DB Server information:

Server: Localhost via UNIX socket Server-Typ: MariaDB Server-Verbindung: SSL wird nicht verwendet Dokumentation Server-Version: 10.2.25-MariaDB-10.2.25+maria~jessie-log - mariadb.org binary distribution Protokoll-Version: 10 Benutzer: USERNAME@localhost Server-Zeichensatz: UTF-8 Unicode (utf8)

bdraco commented 4 years ago

What are the

wait_timeout = interactive_timeout = values set to in your server's my.cnf?

nilsmau commented 4 years ago

This is what I get for those two when querying via MySQL...:

wait_timeout = 60
interactive_timeout = 60

While I can read my.cnf, I cannot edit it (no permissions):

[mysql]
socket          = /var/lib/mysql/mysqld.sock
[mysqldump]
socket          = /var/lib/mysql/mysqld.sock

I am unable to modify GLOBAL wait_timeout via MySQL (permissions again), yet, I am able to set SESSION wait_timeout with

SET session wait_timeout=300;
bdraco commented 4 years ago

With a timeout that low the system is not going to be able to recover before it gets disconnected again.

The default timeout for MySQL is 8 hours. Is there a reason you have the timeout set that low?

nilsmau commented 4 years ago

@bdraco, glad to see there is a PR to fix this from the Home Assistant side of things (I was about to propose the session timeout setting for HA). I have in parallel requested at my hoster to raise the wait_timeout. So either way, I am hopeful there is a solution to this. (the local home assistant db on the SD card of my PI has ruined two SD cards already, thus, I would like to keep using a remote (hosted) DB) Thanks!

bdraco commented 4 years ago

(the local home assistant db on the SD card of my PI has ruined two SD cards already, thus, I would like to keep using a remote (hosted) DB) Thanks!

The irony is that the change that was made to increase SD card life is what is causing this issue to appear.

nilsmau commented 4 years ago

Do I need to do anything on #33638 to push this into the next HA release? Did I miss something?

bdraco commented 4 years ago

@nilsmau Its waiting for someone to have time to review the fix.

bdraco commented 4 years ago

@nilsmau if you want to try it out, you can checkout the branch and replace your recorder with this code as /config/custom_components/recorder/

nilsmau commented 4 years ago

@bdraco testing now as custom component

nilsmau commented 4 years ago

@bdraco works! thank you! Your support and your contribution in HA is much appreciated!