home-assistant / core

:house_with_garden: Open source home automation that puts local control and privacy first.
https://www.home-assistant.io
Apache License 2.0

Recorder queue reaches maximum size of 30000 within two hours on 2021.9 #56987

Closed egranto closed 2 years ago

egranto commented 3 years ago

The problem

I'm getting this error: "The recorder queue reached the maximum size of 30000; Events are no longer being recorded"

I have a heavy logging setup which was working fine before version 2021.9.8 and OS 6.4.

I have 1,140 sensors logging at intervals between 1 minute and 1 hour.

What version of Home Assistant Core has the issue?

core-2021.9.7

What was the last working version of Home Assistant Core?

core-2021.8.8

What type of installation are you running?

Home Assistant OS

Integration causing the issue

Recorder

Link to integration documentation on our website

https://www.home-assistant.io/integrations/recorder/

Example YAML snippet

recorder:
  db_url: !secret maria_db
  db_retry_wait: 20
  db_max_retries: 30  
  auto_purge: true
  purge_keep_days: 7
  commit_interval: 1
  exclude:
    entities:
      - sun.sun
    domains:
      - alarm_control_panel
      - light
      - automation
      - binary_sensor
      - updater
      - calendar
      - camera
      - device_tracker
      - group
      - image_processing
      - input_boolean
      - input_datetime
      - input_number
      - input_select
      - input_text
      - person
      - proximity
      - sun
      - utility_meter
    entity_globs:
      - sensor.*_time  
      - sensor.*time* 
      - sensor.*uptim*
      - sensor.*_current 
      - sensor.*_movimiento
    event_types:
      - call_service # Don't record service calls
      - deepstack.object_detected
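For reference, recorder's `entity_globs` use shell-style wildcard matching, so Python's `fnmatch` is a reasonable stand-in for checking which entity IDs the globs above would exclude (a sketch for illustration, not the recorder's actual filter code):

```python
from fnmatch import fnmatch

# Globs copied from the recorder config above
excluded_globs = ["sensor.*_time", "sensor.*time*", "sensor.*uptim*",
                  "sensor.*_current", "sensor.*_movimiento"]

def is_excluded(entity_id: str) -> bool:
    """Return True if the entity ID matches any exclude glob."""
    return any(fnmatch(entity_id, glob) for glob in excluded_globs)

assert is_excluded("sensor.kitchen_time")        # matches "sensor.*_time"
assert not is_excluded("sensor.kitchen_power")   # matches no glob
```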

Anything in the logs that might be useful for us?

2021-10-02 20:20:28 ERROR (MainThread) [homeassistant.components.recorder] The recorder queue reached the maximum size of 30000; Events are no longer being recorded

2021-10-03 04:10:06 WARNING (Thread-10785) [homeassistant.util.executor] Thread[SyncWorker_34] is still running at shutdown: File "/usr/local/lib/python3.9/threading.py", line 930, in _bootstrap
    self._bootstrap_inner()
  File "/usr/local/lib/python3.9/threading.py", line 973, in _bootstrap_inner
    self.run()
  File "/usr/local/lib/python3.9/threading.py", line 910, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/lib/python3.9/concurrent/futures/thread.py", line 75, in _worker
    work_item = work_queue.get(block=True)

Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/helpers/entity.py", line 446, in async_update_ha_state
    await self.async_device_update()
  File "/usr/src/homeassistant/homeassistant/helpers/entity.py", line 641, in async_device_update
    raise exc
  File "/usr/local/lib/python3.9/concurrent/futures/thread.py", line 52, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/usr/src/homeassistant/homeassistant/components/command_line/sensor.py", line 127, in update
    self._state = self._value_template.render_with_possible_json_value(
  File "/usr/src/homeassistant/homeassistant/helpers/template.py", line 527, in render_with_possible_json_value
    return run_callback_threadsafe(
  File "/usr/local/lib/python3.9/concurrent/futures/_base.py", line 445, in result
    return self.__get_result()
  File "/usr/local/lib/python3.9/concurrent/futures/_base.py", line 390, in __get_result
    raise self._exception
  File "/usr/src/homeassistant/homeassistant/util/async_.py", line 58, in run_callback
    future.set_result(callback(*args))
  File "/usr/src/homeassistant/homeassistant/helpers/template.py", line 557, in async_render_with_possible_json_value
    return _render_with_context(
  File "/usr/src/homeassistant/homeassistant/helpers/template.py", line 1477, in _render_with_context
    return template.render(**kwargs)
  File "/usr/local/lib/python3.9/site-packages/jinja2/environment.py", line 1304, in render
    self.environment.handle_exception()
  File "/usr/local/lib/python3.9/site-packages/jinja2/environment.py", line 925, in handle_exception
    raise rewrite_traceback_stack(source=source)
  File "<template>", line 1, in top-level template code
  File "/usr/src/homeassistant/homeassistant/helpers/template.py", line 1395, in regex_findall_index
    return re.findall(find, value, flags)[index]
IndexError: list index out of range

Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/helpers/entity.py", line 446, in async_update_ha_state
    await self.async_device_update()
  File "/usr/src/homeassistant/homeassistant/helpers/entity.py", line 641, in async_device_update
    raise exc
  File "/usr/local/lib/python3.9/concurrent/futures/thread.py", line 52, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/usr/src/homeassistant/homeassistant/components/command_line/sensor.py", line 127, in update
    self._state = self._value_template.render_with_possible_json_value(
  File "/usr/src/homeassistant/homeassistant/helpers/template.py", line 527, in render_with_possible_json_value
    return run_callback_threadsafe(
  File "/usr/local/lib/python3.9/concurrent/futures/_base.py", line 438, in result
    return self.__get_result()
  File "/usr/local/lib/python3.9/concurrent/futures/_base.py", line 390, in __get_result
    raise self._exception
  File "/usr/src/homeassistant/homeassistant/util/async_.py", line 58, in run_callback
    future.set_result(callback(*args))
  File "/usr/src/homeassistant/homeassistant/helpers/template.py", line 557, in async_render_with_possible_json_value
    return _render_with_context(
  File "/usr/src/homeassistant/homeassistant/helpers/template.py", line 1477, in _render_with_context
    return template.render(**kwargs)
  File "/usr/local/lib/python3.9/site-packages/jinja2/environment.py", line 1304, in render
    self.environment.handle_exception()
  File "/usr/local/lib/python3.9/site-packages/jinja2/environment.py", line 925, in handle_exception
    raise rewrite_traceback_stack(source=source)
  File "<template>", line 1, in top-level template code
  File "/usr/src/homeassistant/homeassistant/helpers/template.py", line 1395, in regex_findall_index
    return re.findall(find, value, flags)[index]
IndexError: list index out of range

2021-10-02 20:51:06 ERROR (MainThread) [homeassistant] Error doing job: Future exception was never retrieved
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/concurrent/futures/thread.py", line 52, in run
    result = self.fn(*self.args, **self.kwargs)

Additional information

Data is being passed through to InfluxDB, where I store long-term data, but I'm getting this error from the recorder even with the default configuration.

I removed all the Energy dashboard configuration because I thought the cause was all the statistics the system creates, but it wasn't; I still couldn't find the cause.

pinpoint87 commented 2 years ago

I've been reading this thread for a while, since this issue also happens on my install (HA Supervised, MariaDB 10.3.27 on a separate VM). For me the issue can happen at any given time and doesn't seem to follow any pattern, other than the recorder choking after compiling statistics or failing that SQL command (when I run the 'failed' SQL command myself, it completes without any issues).

Since last Friday the recorder has stopped three times. Unfortunately I do not have the MariaDB log entries from around the times it happened; when I have them I'll post those as well.

## December 10
2021-12-10 01:30:00 DEBUG (SyncWorker_1) [homeassistant.components.recorder.history] getting 1 first datapoints took 0.004733s
2021-12-10 01:30:00 DEBUG (SyncWorker_13) [homeassistant.components.recorder.util] querying 1 rows took 0.033051s
2021-12-10 01:30:00 DEBUG (SyncWorker_13) [homeassistant.components.recorder.history] getting 1 first datapoints took 0.033354s
2021-12-10 01:30:10 DEBUG (Recorder) [homeassistant.components.recorder.statistics] Compiling statistics for 2021-12-10 00:25:00+00:00-2021-12-10 00:30:00+00:00
2021-12-10 01:30:58 DEBUG (MainThread) [homeassistant.components.recorder] Recorder queue size is: 179
2021-12-10 01:40:58 DEBUG (MainThread) [homeassistant.components.recorder] Recorder queue size is: 2605
## December 12
2021-12-12 10:42:48 DEBUG (MainThread) [homeassistant.components.recorder] Recorder queue size is: 0
2021-12-12 10:44:57 DEBUG (Recorder) [homeassistant.components.recorder] Sending keepalive
2021-12-12 10:45:00 DEBUG (SyncWorker_7) [homeassistant.components.recorder.util] querying 0 rows took 0.009860s
2021-12-12 10:45:00 DEBUG (SyncWorker_7) [homeassistant.components.recorder.history] get_significant_states took 0.010171s
2021-12-12 10:45:00 DEBUG (SyncWorker_3) [homeassistant.components.recorder.util] querying 1 rows took 0.004434s
2021-12-12 10:45:00 DEBUG (SyncWorker_3) [homeassistant.components.recorder.history] getting 1 first datapoints took 0.004830s
2021-12-12 10:45:10 DEBUG (Recorder) [homeassistant.components.recorder.statistics] Compiling statistics for 2021-12-12 09:40:00+00:00-2021-12-12 09:45:00+00:00
2021-12-12 10:45:10 ERROR (Recorder) [homeassistant.components.recorder.util] Error executing query: (MySQLdb._exceptions.OperationalError) (2000, 'Unknown MySQL error')
[SQL: SELECT statistics_meta.id AS statistics_meta_id, statistics_meta.statistic_id AS statistics_meta_statistic_id, statistics_meta.source AS statistics_meta_source, statistics_meta.unit_of_measurement AS statistics_meta_unit_of_measurement, statistics_meta.has_mean AS statistics_meta_has_mean, statistics_meta.has_sum AS statistics_meta_has_sum, statistics_meta.name AS statistics_meta_name 
FROM statistics_meta 
WHERE statistics_meta.statistic_id IN (%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s)]
[parameters: ('sensor.power_consumption', 'sensor.power_production', 'sensor.energy_consumption_tarif_1', 'sensor.energy_consumption_tarif_2', 'sensor.energy_production_tarif_1', 'sensor.energy_production_tarif_2', 'sensor.power_consumption_phase_l1', 'sensor.power_consumption_phase_l2', 'sensor.power_consumption_phase_l3', 'sensor.power_production_phase_l1', 'sensor.power_production_phase_l2', 'sensor.power_production_phase_l3', 'sensor.voltage_phase_l1', 'sensor.voltage_phase_l2', 'sensor.voltage_phase_l3', 'sensor.current_phase_l1', 'sensor.current_phase_l2', 'sensor.current_phase_l3', 'sensor.energy_consumption_total', 'sensor.gas_consumption', 'sensor.co2_intensity', 'sensor.grid_fossil_fuel_percentage', 'sensor.energy_production_meter', 'sensor.energy_production_growatt', 'sensor.energy_production_lifetime_growatt', 'sensor.speedtest_ping', 'sensor.speedtest_download', 'sensor.speedtest_upload', 'sensor.water_consumption_year', 'sensor.water_consumption_month', 'sensor.water_consumption_week', 'sensor.water_consumption_daily', 'sensor.energy_consumption_tarif_1_cost', 'sensor.energy_consumption_tarif_2_cost', 'sensor.energy_production_tarif_1_compensation', 'sensor.energy_production_tarif_2_compensation', 'sensor.gas_consumption_cost', 'sensor.power_production_now', 'sensor.water_consumed', 'sensor.water_reading', 'sensor.growatt_total_lifetime_energy_output', 'sensor.sunea_io_rssi_level', 'sensor.6aac12301a_lifetime_energy_output', 'sensor.voordeur_power', 'sensor.voordeur_energy', 'sensor.hal_lamp_power', 'sensor.hal_lamp_energy', 'sensor.hal_device_temperature', 'sensor.wasmachine_power', 'sensor.wasmachine_energy', 'sensor.wasmachine_device_temperature', 'sensor.droger_power', 'sensor.droger_energy', 'sensor.droger_device_temperature', 'sensor.shelly_2_5_hal_power_0', 'sensor.shelly_2_5_hal_energy_0', 'sensor.shelly_2_5_hal_power_1', 'sensor.shelly_2_5_hal_energy_1', 'sensor.shelly_2_5_hal_temperature', 'sensor.dining_3_linkquality', 
'sensor.dining_2_linkquality', 'sensor.dining_1_linkquality', 'sensor.door_sensor_4_battery', 'sensor.door_sensor_4_linkquality', 'sensor.door_sensor_4_temperature', 'sensor.door_sensor_4_voltage', 'sensor.door_sensor_3_battery', 'sensor.door_sensor_3_linkquality', 'sensor.door_sensor_3_temperature', 'sensor.door_sensor_3_voltage', 'sensor.door_sensor_2_battery', 'sensor.door_sensor_2_linkquality', 'sensor.door_sensor_2_temperature', 'sensor.door_sensor_2_voltage', 'sensor.keuken_1_linkquality', 'sensor.keuken_4_linkquality', 'sensor.keuken_2_linkquality', 'sensor.keuken_5_linkquality', 'sensor.keuken_3_linkquality', 'sensor.hallway_bulb_linkquality', 'sensor.temp_2_battery', 'sensor.temp_2_temperature', 'sensor.temp_2_humidity', 'sensor.temp_2_pressure', 'sensor.temp_2_linkquality', 'sensor.temp_2_voltage', 'sensor.motion_1_battery', 'sensor.motion_1_linkquality', 'sensor.motion_1_voltage', 'sensor.door_sensor_1_battery', 'sensor.door_sensor_1_linkquality', 'sensor.door_sensor_1_temperature', 'sensor.door_sensor_1_voltage', 'sensor.motion_2_battery', 'sensor.motion_2_linkquality', 'sensor.motion_2_voltage', 'sensor.temp_1_battery', 'sensor.temp_1_temperature', 'sensor.temp_1_humidity', 'sensor.temp_1_pressure', 'sensor.temp_1_linkquality', 'sensor.temp_1_voltage', 'sensor.temp_3_battery', 'sensor.temp_3_temperature', 'sensor.temp_3_humidity', 'sensor.temp_3_pressure', 'sensor.temp_3_linkquality', 'sensor.temp_3_voltage', 'sensor.temp_4_battery', 'sensor.temp_4_temperature', 'sensor.temp_4_humidity', 'sensor.temp_4_pressure', 'sensor.temp_4_linkquality', 'sensor.temp_4_voltage', 'sensor.toilet_1_linkquality', 'sensor.pantry_1_linkquality', 'sensor.ikea_sw1_battery', 'sensor.ikea_sw1_linkquality', 'sensor.ikea_sw2_battery', 'sensor.ikea_sw2_linkquality', 'sensor.livingroom_1_linkquality', 'sensor.vibration_1_battery', 'sensor.vibration_1_linkquality', 'sensor.vibration_1_voltage', 'sensor.temp_5_battery', 'sensor.temp_5_temperature', 'sensor.temp_5_humidity', 
'sensor.temp_5_pressure', 'sensor.temp_5_linkquality', 'sensor.temp_5_voltage', 'sensor.temp_6_battery', 'sensor.temp_6_temperature', 'sensor.temp_6_humidity', 'sensor.temp_6_pressure', 'sensor.temp_6_linkquality', 'sensor.temp_6_voltage', 'sensor.motion_3_battery', 'sensor.motion_3_illuminance', 'sensor.motion_3_linkquality', 'sensor.motion_3_illuminance_lux', 'sensor.motion_3_temperature', 'sensor.motion_3_voltage', 'sensor.door_sensor_5_battery', 'sensor.door_sensor_5_linkquality', 'sensor.door_sensor_5_temperature', 'sensor.door_sensor_5_voltage', 'sensor.motion_4_battery', 'sensor.motion_4_temperature', 'sensor.motion_4_illuminance_lux', 'sensor.motion_4_illuminance', 'sensor.motion_4_linkquality', 'sensor.motion_4_voltage', 'sensor.motion_5_battery', 'sensor.motion_5_temperature', 'sensor.motion_5_illuminance_lux', 'sensor.motion_5_illuminance', 'sensor.motion_5_linkquality', 'sensor.motion_5_voltage', 'sensor.power_plug_zolder_power', 'sensor.power_plug_zolder_current', 'sensor.power_plug_zolder_voltage', 'sensor.power_plug_zolder_energy', 'sensor.power_plug_zolder_linkquality', 'sensor.power_plug_tv_meubel_power', 'sensor.power_plug_tv_meubel_current', 'sensor.power_plug_tv_meubel_voltage', 'sensor.power_plug_tv_meubel_energy', 'sensor.power_plug_tv_meubel_linkquality', 'sensor.power_plug_koelkast_waterkoker_power', 'sensor.power_plug_koelkast_waterkoker_current', 'sensor.power_plug_koelkast_waterkoker_voltage', 'sensor.power_plug_koelkast_waterkoker_energy', 'sensor.power_plug_koelkast_waterkoker_linkquality', 'sensor.terras_1_linkquality', 'sensor.shelly_1pm_droger_power_0', 'sensor.shelly_1pm_droger_energy_0', 'sensor.shelly_1pm_droger_temperature', 'sensor.shelly_2_5_hal_voltage', 'sensor.diskstation_cpu_utilization_user', 'sensor.diskstation_cpu_utilization_total', 'sensor.diskstation_memory_usage_real', 'sensor.diskstation_memory_available_swap', 'sensor.diskstation_memory_available_real', 'sensor.diskstation_memory_total_swap', 
'sensor.diskstation_memory_total_real', 'sensor.diskstation_network_up', 'sensor.diskstation_network_down', 'sensor.diskstation_volume_1_used_space', 'sensor.diskstation_drive_1_temperature', 'sensor.diskstation_drive_2_temperature', 'sensor.diskstation_temperature')]
(Background on this error at: https://sqlalche.me/e/14/e3q8)
2021-12-12 10:52:48 DEBUG (MainThread) [homeassistant.components.recorder] Recorder queue size is: 1945
## December 13
2021-12-13 10:15:11 DEBUG (Recorder) [homeassistant.components.recorder.util] querying 1 rows took 0.031456s
2021-12-13 10:15:16 DEBUG (Recorder) [homeassistant.components.recorder] Sending keepalive
2021-12-13 10:19:46 DEBUG (Recorder) [homeassistant.components.recorder] Sending keepalive
2021-12-13 10:20:10 DEBUG (Recorder) [homeassistant.components.recorder.statistics] Compiling statistics for 2021-12-13 09:15:00+00:00-2021-12-13 09:20:00+00:00
2021-12-13 10:20:10 ERROR (Recorder) [homeassistant.components.recorder.util] Error executing query: (MySQLdb._exceptions.OperationalError) (2000, 'Unknown MySQL error')
[SQL: SELECT statistics_meta.id AS statistics_meta_id, statistics_meta.statistic_id AS statistics_meta_statistic_id, statistics_meta.source AS statistics_meta_source, statistics_meta.unit_of_measurement AS statistics_meta_unit_of_measurement, statistics_meta.has_mean AS statistics_meta_has_mean, statistics_meta.has_sum AS statistics_meta_has_sum, statistics_meta.name AS statistics_meta_name 
FROM statistics_meta 
WHERE statistics_meta.statistic_id IN (%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s)]
[parameters: ('sensor.power_consumption', 'sensor.power_production', 'sensor.energy_consumption_tarif_1', 'sensor.energy_consumption_tarif_2', 'sensor.energy_production_tarif_1', 'sensor.energy_production_tarif_2', 'sensor.power_consumption_phase_l1', 'sensor.power_consumption_phase_l2', 'sensor.power_consumption_phase_l3', 'sensor.power_production_phase_l1', 'sensor.power_production_phase_l2', 'sensor.power_production_phase_l3', 'sensor.voltage_phase_l1', 'sensor.voltage_phase_l2', 'sensor.voltage_phase_l3', 'sensor.current_phase_l1', 'sensor.current_phase_l2', 'sensor.current_phase_l3', 'sensor.energy_consumption_total', 'sensor.gas_consumption', 'sensor.co2_intensity', 'sensor.grid_fossil_fuel_percentage', 'sensor.energy_production_meter', 'sensor.energy_production_growatt', 'sensor.energy_production_lifetime_growatt', 'sensor.speedtest_ping', 'sensor.speedtest_download', 'sensor.speedtest_upload', 'sensor.water_consumption_year', 'sensor.water_consumption_month', 'sensor.water_consumption_week', 'sensor.water_consumption_daily', 'sensor.energy_consumption_tarif_1_cost', 'sensor.energy_consumption_tarif_2_cost', 'sensor.energy_production_tarif_1_compensation', 'sensor.energy_production_tarif_2_compensation', 'sensor.gas_consumption_cost', 'sensor.power_production_now', 'sensor.water_consumed', 'sensor.water_reading', 'sensor.sunea_io_rssi_level', 'sensor.voordeur_power', 'sensor.voordeur_energy', 'sensor.hal_lamp_power', 'sensor.hal_lamp_energy', 'sensor.hal_device_temperature', 'sensor.wasmachine_power', 'sensor.wasmachine_energy', 'sensor.wasmachine_device_temperature', 'sensor.droger_power', 'sensor.droger_energy', 'sensor.droger_device_temperature', 'sensor.shelly_2_5_hal_power_0', 'sensor.shelly_2_5_hal_energy_0', 'sensor.shelly_2_5_hal_power_1', 'sensor.shelly_2_5_hal_energy_1', 'sensor.shelly_2_5_hal_temperature', 'sensor.dining_3_linkquality', 'sensor.dining_2_linkquality', 'sensor.dining_1_linkquality', 'sensor.door_sensor_4_battery', 
'sensor.door_sensor_4_linkquality', 'sensor.door_sensor_4_temperature', 'sensor.door_sensor_4_voltage', 'sensor.door_sensor_3_battery', 'sensor.door_sensor_3_linkquality', 'sensor.door_sensor_3_temperature', 'sensor.door_sensor_3_voltage', 'sensor.door_sensor_2_battery', 'sensor.door_sensor_2_linkquality', 'sensor.door_sensor_2_temperature', 'sensor.door_sensor_2_voltage', 'sensor.keuken_1_linkquality', 'sensor.keuken_4_linkquality', 'sensor.keuken_2_linkquality', 'sensor.keuken_5_linkquality', 'sensor.keuken_3_linkquality', 'sensor.hallway_bulb_linkquality', 'sensor.temp_2_battery', 'sensor.temp_2_temperature', 'sensor.temp_2_humidity', 'sensor.temp_2_pressure', 'sensor.temp_2_linkquality', 'sensor.temp_2_voltage', 'sensor.motion_1_battery', 'sensor.motion_1_linkquality', 'sensor.motion_1_voltage', 'sensor.door_sensor_1_battery', 'sensor.door_sensor_1_linkquality', 'sensor.door_sensor_1_temperature', 'sensor.door_sensor_1_voltage', 'sensor.motion_2_battery', 'sensor.motion_2_linkquality', 'sensor.motion_2_voltage', 'sensor.temp_1_battery', 'sensor.temp_1_temperature', 'sensor.temp_1_humidity', 'sensor.temp_1_pressure', 'sensor.temp_1_linkquality', 'sensor.temp_1_voltage', 'sensor.temp_3_battery', 'sensor.temp_3_temperature', 'sensor.temp_3_humidity', 'sensor.temp_3_pressure', 'sensor.temp_3_linkquality', 'sensor.temp_3_voltage', 'sensor.temp_4_battery', 'sensor.temp_4_temperature', 'sensor.temp_4_humidity', 'sensor.temp_4_pressure', 'sensor.temp_4_linkquality', 'sensor.temp_4_voltage', 'sensor.toilet_1_linkquality', 'sensor.pantry_1_linkquality', 'sensor.ikea_sw1_battery', 'sensor.ikea_sw1_linkquality', 'sensor.ikea_sw2_battery', 'sensor.ikea_sw2_linkquality', 'sensor.livingroom_1_linkquality', 'sensor.vibration_1_battery', 'sensor.vibration_1_linkquality', 'sensor.vibration_1_voltage', 'sensor.temp_5_battery', 'sensor.temp_5_temperature', 'sensor.temp_5_humidity', 'sensor.temp_5_pressure', 'sensor.temp_5_linkquality', 'sensor.temp_5_voltage', 
'sensor.temp_6_battery', 'sensor.temp_6_temperature', 'sensor.temp_6_humidity', 'sensor.temp_6_pressure', 'sensor.temp_6_linkquality', 'sensor.temp_6_voltage', 'sensor.motion_3_battery', 'sensor.motion_3_illuminance', 'sensor.motion_3_linkquality', 'sensor.motion_3_illuminance_lux', 'sensor.motion_3_temperature', 'sensor.motion_3_voltage', 'sensor.door_sensor_5_battery', 'sensor.door_sensor_5_linkquality', 'sensor.door_sensor_5_temperature', 'sensor.door_sensor_5_voltage', 'sensor.motion_4_battery', 'sensor.motion_4_temperature', 'sensor.motion_4_illuminance_lux', 'sensor.motion_4_illuminance', 'sensor.motion_4_linkquality', 'sensor.motion_4_voltage', 'sensor.motion_5_battery', 'sensor.motion_5_temperature', 'sensor.motion_5_illuminance_lux', 'sensor.motion_5_illuminance', 'sensor.motion_5_linkquality', 'sensor.motion_5_voltage', 'sensor.power_plug_zolder_power', 'sensor.power_plug_zolder_current', 'sensor.power_plug_zolder_voltage', 'sensor.power_plug_zolder_energy', 'sensor.power_plug_zolder_linkquality', 'sensor.power_plug_tv_meubel_power', 'sensor.power_plug_tv_meubel_current', 'sensor.power_plug_tv_meubel_voltage', 'sensor.power_plug_tv_meubel_energy', 'sensor.power_plug_tv_meubel_linkquality', 'sensor.power_plug_koelkast_waterkoker_power', 'sensor.power_plug_koelkast_waterkoker_current', 'sensor.power_plug_koelkast_waterkoker_voltage', 'sensor.power_plug_koelkast_waterkoker_energy', 'sensor.power_plug_koelkast_waterkoker_linkquality', 'sensor.terras_1_linkquality', 'sensor.shelly_1pm_droger_power_0', 'sensor.shelly_1pm_droger_energy_0', 'sensor.shelly_1pm_droger_temperature', 'sensor.shelly_2_5_hal_voltage', 'sensor.diskstation_cpu_utilization_user', 'sensor.diskstation_cpu_utilization_total', 'sensor.diskstation_memory_usage_real', 'sensor.diskstation_memory_available_swap', 'sensor.diskstation_memory_available_real', 'sensor.diskstation_memory_total_swap', 'sensor.diskstation_memory_total_real', 'sensor.diskstation_network_up', 
'sensor.diskstation_network_down', 'sensor.diskstation_volume_1_used_space', 'sensor.diskstation_drive_1_temperature', 'sensor.diskstation_drive_2_temperature', 'sensor.diskstation_temperature', 'sensor.growatt_total_lifetime_energy_output', 'sensor.6aac12301a_lifetime_energy_output')]
(Background on this error at: https://sqlalche.me/e/14/e3q8)
2021-12-13 10:22:28 DEBUG (MainThread) [homeassistant.components.recorder] Recorder queue size is: 636
2021-12-13 10:30:00 DEBUG (SyncWorker_12) [homeassistant.components.recorder.util] querying 1 rows took 0.008920s
2021-12-13 10:30:00 DEBUG (SyncWorker_12) [homeassistant.components.recorder.history] get_significant_states took 0.009107s
2021-12-13 10:30:00 DEBUG (SyncWorker_11) [homeassistant.components.recorder.util] querying 1 rows took 0.007810s
alekslyse commented 2 years ago

I have the exact same issue on Home Assistant OS, so with the built-in MariaDB add-on. At 4 AM the database does something, and after that it runs into the recorder limit.

[04:03:00] INFO: Lock tables using mariadb client...
[04:03:00] INFO: MariaDB tables locked.
[04:33:51] INFO: MariaDB tables unlocked.

This is exactly around when all logging stops.

Then the log spits out: The recorder queue reached the maximum size of 30000; Events are no longer being recorded
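The timestamps above suggest why the limit trips: with the tables locked from 04:03 to 04:33, the recorder cannot flush anything, so the in-memory queue has to absorb the whole backlog. A back-of-the-envelope check (the event rate here is a hypothetical figure, not taken from any log):

```python
# Lock window taken from the add-on log above; event rate is assumed
lock_seconds = 30 * 60          # tables locked ~04:03 to ~04:33
events_per_second = 17          # hypothetical sustained state_changed rate
queue_limit = 30_000            # recorder's maximum queue size

accumulated = lock_seconds * events_per_second
# 1800 s * 17 ev/s = 30600 events, just over the 30000 limit
assert accumulated > queue_limit
```

So even a modest event rate overflows the queue during a half-hour backup lock.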

jstock524 commented 2 years ago

My recorder queue is also reaching 30000 and then shutting down. My energy management has stopped recording around 4 AM, like clockwork, for the last month!

ShadowRep commented 2 years ago

I did try to remove MariaDB and it did not do anything. The error persisted. Therefore I do not think that MariaDB was the problem.

But afterwards I modified/deactivated several of my automations (some had a while loop) and replaced them with Node-RED flows.

The error is completely gone now! No more problems for me with this issue.

I have not reactivated MariaDB, yet.

I hope this is helpful information.

jstock524 commented 2 years ago

I removed all my HA automations and that did the trick for me as well. I never disabled MariaDB, so it must be something in the built-in automation setup. I usually use Node-RED, but I had been testing the new Blueprints and was in that timeframe when this issue started.

ShadowRep commented 2 years ago

Can we ask someone from the core team (I think they would be responsible for the automations system as well) about this? Does anyone know how to do that?

BlackSucat commented 2 years ago

I only started diving into HA a couple of months ago, but I have been experiencing exactly the same error for weeks now. My setup is still very simple, with only a couple of sensors and a couple of automations. The recorder database is also MariaDB, and I am using a combination of InfluxDB/Grafana as well. I am running HA Supervised on a VM installed on a Synology NAS.

I played a bit with excluding some entities of the one sensor that creates a lot of data, which is the SlimmeMeter+:

(screenshot of the recorder exclude configuration)

But to no avail....

One weird thing I noticed (or think I noticed) is that the issue seems to happen less frequently if I do not open the HA interface in my browser after a restart. Two weeks ago I was out of the country for a week, with no issues. Upon arriving home and opening the interface, the error occurred again soon after. This all might be a coincidence, but I just wanted to mention it.

Even after reading all the comments above, I really have no idea what to do. I have not yet tried removing the few automations I have. I just wonder if there is a way to find out what these 30,000 queued records are made up of. Is there an easy way to find out?

rostislav-palivoda commented 2 years ago

Here is how I understand this situation: the 30k limit is for the queue in RAM (not in the database). If you hit the recorder limit, your server does not have enough capacity to write all incoming values into the DB. There may be different reasons for this (CPU usage by other apps, disk speed, DB configuration), but I would start by answering the question: do I really need all these values at such a frequency? By reducing the frequency of writes into the DB you can stay under this limit.
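A minimal sketch of that failure mode: the queue is bounded, and once the DB-writer falls behind, puts start failing and events are dropped (illustrative only; the real recorder has its own queue handling around this):

```python
import queue

event_queue = queue.Queue(maxsize=5)   # stand-in for the recorder's 30000 limit

dropped = 0
for event_id in range(8):              # producer outpaces a stalled consumer
    try:
        event_queue.put_nowait(event_id)
    except queue.Full:
        dropped += 1                   # "Events are no longer being recorded"

assert event_queue.qsize() == 5 and dropped == 3
```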

The simplest approach is an aggregation SQL query to see how many values you received from each sensor during the last hour:

-- SQLite date syntax (default recorder DB);
-- on MariaDB use: WHERE created >= NOW() - INTERVAL 1 HOUR
SELECT COUNT(state_id) AS Count, entity_id
FROM states
WHERE created >= datetime('now', '-1 Hour')
GROUP BY entity_id
ORDER BY Count DESC;
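The same aggregation can be demonstrated end to end against an in-memory SQLite database with made-up rows (a self-contained sketch, not data from any real installation):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE states (state_id INTEGER, entity_id TEXT, created TEXT)")

# Fabricated rows, all timestamped "now" so they fall inside the 1-hour window
for state_id, entity_id in [(1, "sensor.power"), (2, "sensor.power"),
                            (3, "sensor.power"), (4, "sensor.temp")]:
    conn.execute("INSERT INTO states VALUES (?, ?, datetime('now'))",
                 (state_id, entity_id))

result = conn.execute(
    """SELECT COUNT(state_id) AS Count, entity_id
       FROM states
       WHERE created >= datetime('now', '-1 Hour')
       GROUP BY entity_id
       ORDER BY Count DESC"""
).fetchall()

# The chattiest entity sorts first
assert result == [(3, "sensor.power"), (1, "sensor.temp")]
```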
BlackSucat commented 2 years ago

Interesting to learn that the issue is related to the queue in RAM. What I am struggling to understand is why my Synology NAS (DS920+) would not be able to write all incoming values into the database. I do not see a lot of stress on either CPU or memory (I even increased the memory to let the VM run without issues), similar to what others describe above.

The only sensor I have that creates a lot of data is the SlimmeLezer+ from Marcel Zuidwijk. I'm more than happy to see if I can limit the frequency at which the sensor provides data, but I could not find a setting for this yet. I will contact Marcel to see if this is possible.

Thanks for now for the valuable feedback.

rostislav-palivoda commented 2 years ago

If sysinfo looks OK, then the next place to look for the bottleneck is probably DB or Python diagnostics. Alternatively, you can increase the limit in the Python code from 30k to e.g. 60k and see what happens with sysinfo.

BlackSucat commented 2 years ago

My problem seems to be resolved by simply increasing commit_interval from the default 1 second to 5 seconds. In the past week I have not seen the error occur anymore. Thanks Rostislav for your support!

egranto commented 2 years ago

Glad yours is resolved, but I still have the issue, even on the new release.

Logger: homeassistant.components.recorder.migration Source: components/recorder/migration.py:75 Integration: Recorder (documentation, issues) First occurred: 1:10:38 PM (1 occurrences) Last logged: 1:10:38 PM

Database is about to upgrade. Schema version: 24

Logger: homeassistant.components.recorder.migration Source: components/recorder/migration.py:199 Integration: Recorder (documentation, issues) First occurred: 1:10:38 PM (1 occurrences) Last logged: 1:10:38 PM

Adding columns attributes_id to table states. Note: this can take several minutes on large databases and slow computers. Please be patient!

Logger: homeassistant.components.recorder.migration Source: components/recorder/migration.py:101 Integration: Recorder (documentation, issues) First occurred: 1:12:10 PM (1 occurrences) Last logged: 1:12:10 PM

Adding index ix_states_attributes_id to database. Note: this can take several minutes on large databases and slow computers. Please be patient!

Logger: homeassistant.components.recorder Source: components/recorder/init.py:676 Integration: Recorder (documentation, issues) First occurred: 1:29:44 PM (2 occurrences) Last logged: 2:19:41 PM

The recorder queue reached the maximum size of 30000; Events are no longer being recorded

bdraco commented 2 years ago

@egranto How many state_changed events are firing per minute? You can enable debug logging on homeassistant.core to see
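Once debug logging is enabled, a quick way to count state_changed lines per minute from the log file (assumes the default `YYYY-MM-DD HH:MM:SS` timestamp format; the sample lines below are fabricated):

```python
from collections import Counter

def state_changed_per_minute(log_lines):
    """Count state_changed debug lines, bucketed by minute of the timestamp."""
    counts = Counter()
    for line in log_lines:
        if "state_changed" in line:
            counts[line[:16]] += 1     # "YYYY-MM-DD HH:MM"
    return counts

sample = [
    "2021-10-02 20:20:01 DEBUG (MainThread) [homeassistant.core] Bus:Handling <Event state_changed ...>",
    "2021-10-02 20:20:30 DEBUG (MainThread) [homeassistant.core] Bus:Handling <Event state_changed ...>",
    "2021-10-02 20:21:05 DEBUG (MainThread) [homeassistant.core] Bus:Handling <Event state_changed ...>",
]
assert state_changed_per_minute(sample)["2021-10-02 20:20"] == 2
```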

egranto commented 2 years ago

> @egranto How many state_changed events are firing per minute? You can enable debug logging on homeassistant.core to see

A lot; I have lots of sensors, but let me activate the debug logging and restart my system. I have some command line sensors for my Foscam motion detection polling every 10 seconds (I used to run it at 5); there are 5 cameras.

I'm even filtering all my ESPHome devices to send updates only every 5 minutes.

egranto commented 2 years ago

(screenshot: pivot table searching "event state_change")

(screenshot: performance of the whole NAS)

(screenshot: performance of just the DB)

(screenshot: Home Assistant memory)

bdraco commented 2 years ago

Can you provide a py-spy of the Home Assistant process so we can see what the recorder thread is doing? https://community.home-assistant.io/t/home-assistant-high-memory-usage/180556/60?u=bdraco https://community.home-assistant.io/t/help-with-installing-py-spy-to-identify-process-pushing-the-cpu-to-100-24-7/248054/5?u=bdraco (you may need to adjust the Python version to 3.9)

On synology

root@mauinasa:/volume1/docker/89-homeassistant# docker exec -it 89-homeassistant /bin/bash
bash-5.1# py-spy  record --pid 214 --output /config/www/opt99.svg --duration 120

Something like this will do it

egranto commented 2 years ago

My HA runs on a Synology VM, and I got an error trying to install py-spy:


Collecting py-spy
  Using cached py_spy-0.3.11.tar.gz (155 kB)
  Installing build dependencies ... error
  error: subprocess-exited-with-error

  × pip subprocess to install build dependencies did not run successfully.
  │ exit code: 1
  ╰─> [59 lines of output]
      Collecting maturin<0.12,>=0.11
        Using cached maturin-0.11.5.tar.gz (469 kB)
        Installing build dependencies: started
        Installing build dependencies: finished with status 'done'
        Getting requirements to build wheel: started
        Getting requirements to build wheel: finished with status 'done'
        Preparing metadata (pyproject.toml): started
        Preparing metadata (pyproject.toml): finished with status 'done'
      Collecting toml~=0.10.0
        Using cached toml-0.10.2-py2.py3-none-any.whl (16 kB)
      Building wheels for collected packages: maturin
        Building wheel for maturin (pyproject.toml): started
        Building wheel for maturin (pyproject.toml): finished with status 'error'
        error: subprocess-exited-with-error

        × Building wheel for maturin (pyproject.toml) did not run successfully.
        │ exit code: 1
        ╰─> [35 lines of output]
            running bdist_wheel
            running build
            running install
            Traceback (most recent call last):
              File "/tmp/tmpiq24um9t_in_process.py", line 363, in <module>
                main()
              File "/tmp/tmpiq24um9t_in_process.py", line 345, in main
                json_out['return_val'] = hook(**hook_input['kwargs'])
              File "/tmp/tmpiq24um9t_in_process.py", line 261, in build_wheel
                return _build_backend().build_wheel(wheel_directory, config_settings,
              File "/tmp/pip-build-env-oc0spltf/overlay/lib/python3.9/site-packages/setuptools/build_meta.py", line 216, in build_wheel
                return self._build_with_temp_dir(['bdist_wheel'], '.whl',
              File "/tmp/pip-build-env-oc0spltf/overlay/lib/python3.9/site-packages/setuptools/build_meta.py", line 202, in _build_with_temp_dir
                self.run_setup()
              File "/tmp/pip-build-env-oc0spltf/overlay/lib/python3.9/site-packages/setuptools/build_meta.py", line 145, in run_setup
                exec(compile(code, __file__, 'exec'), locals())
              File "setup.py", line 106, in <module>
                setup(
              File "/tmp/pip-build-env-oc0spltf/overlay/lib/python3.9/site-packages/setuptools/__init__.py", line 153, in setup
                return distutils.core.setup(**attrs)
              File "/tmp/pip-build-env-oc0spltf/overlay/lib/python3.9/site-packages/setuptools/_distutils/core.py", line 148, in setup
                dist.run_commands()
              File "/tmp/pip-build-env-oc0spltf/overlay/lib/python3.9/site-packages/setuptools/_distutils/dist.py", line 967, in run_commands
                self.run_command(cmd)
              File "/tmp/pip-build-env-oc0spltf/overlay/lib/python3.9/site-packages/setuptools/_distutils/dist.py", line 986, in run_command
                cmd_obj.run()
              File "/tmp/pip-build-env-oc0spltf/overlay/lib/python3.9/site-packages/wheel/bdist_wheel.py", line 335, in run
                self.run_command('install')
              File "/tmp/pip-build-env-oc0spltf/overlay/lib/python3.9/site-packages/setuptools/_distutils/cmd.py", line 313, in run_command
                self.distribution.run_command(command)
              File "/tmp/pip-build-env-oc0spltf/overlay/lib/python3.9/site-packages/setuptools/_distutils/dist.py", line 986, in run_command
                cmd_obj.run()
              File "setup.py", line 58, in run
                raise RuntimeError(
            RuntimeError: cargo not found in PATH. Please install rust (https://www.rust-lang.org/tools/install) and try again
            [end of output]

        note: This error originates from a subprocess, and is likely not a problem with pip.
        ERROR: Failed building wheel for maturin
      Failed to build maturin
      ERROR: Could not build wheels for maturin, which is required to install pyproject.toml-based projects
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error

× pip subprocess to install build dependencies did not run successfully.
│ exit code: 1
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.

bdraco commented 2 years ago

This will likely help https://github.com/home-assistant/core/pull/69628

bdraco commented 2 years ago

my HA runs on synology VM, and i got an error trying to install py-spy

https://github.com/benfred/py-spy/releases/tag/v0.3.11

You can download one of the .whl files from here and unzip it to get the binary

egranto commented 2 years ago

I don't know how; is there a guide I could follow?

bdraco commented 2 years ago

https://community.home-assistant.io/t/help-with-installing-py-spy-to-identify-process-pushing-the-cpu-to-100-24-7/248054/5

In your synology you can get into the docker instance with docker exec -it homeassistant /bin/bash

wget https://github.com/benfred/py-spy/releases/download/v0.3.11/py_spy-0.3.11-py2.py3-none-manylinux_2_5_x86_64.manylinux1_x86_64.whl

Example

root@mauinasa:/volume1/docker/89-homeassistant# docker exec -it 89-homeassistant /bin/bash 
bash-5.1# wget https://github.com/benfred/py-spy/releases/download/v0.3.11/py_spy-0.3.11-py2.py3-none-manylinux_2_5_x86_64.manylinux1_x86_64.whl
Connecting to github.com (192.30.255.112:443)
Connecting to objects.githubusercontent.com (185.199.108.133:443)
saving to 'py_spy-0.3.11-py2.py3-none-manylinux_2_5_x86_64.manylinux1_x86_64.whl'
py_spy-0.3.11-py2.py 100% |*****************************************************************************************************************************************************************************************************************************************************************************************************| 2913k  0:00:00 ETA
'py_spy-0.3.11-py2.py3-none-manylinux_2_5_x86_64.manylinux1_x86_64.whl' saved
bash-5.1# unzip py_spy-0.3.11-py2.py3-none-manylinux_2_5_x86_64.manylinux1_x86_64.whl 
Archive:  py_spy-0.3.11-py2.py3-none-manylinux_2_5_x86_64.manylinux1_x86_64.whl
  inflating: py_spy-0.3.11.dist-info/METADATA
  inflating: py_spy-0.3.11.dist-info/WHEEL
  inflating: py_spy-0.3.11.data/scripts/py-spy
  inflating: py_spy-0.3.11.dist-info/RECORD
bash-5.1# ./py_spy-0.3.11
py_spy-0.3.11-py2.py3-none-manylinux_2_5_x86_64.manylinux1_x86_64.whl  py_spy-0.3.11.data/                                                    py_spy-0.3.11.dist-info/                                               
bash-5.1# ./py_spy-0.3.11.data/scripts/py-spy 
py-spy 0.3.11
Sampling profiler for Python programs 

USAGE:
    py-spy <SUBCOMMAND>

OPTIONS:
    -h, --help       Prints help information
    -V, --version    Prints version information

SUBCOMMANDS:
    record    Records stack trace information to a flamegraph, speedscope or raw file
    top       Displays a top like view of functions consuming CPU
    dump      Dumps stack traces for a target program to stdout
    help      Prints this message or the help of the given subcommand(s)

egranto commented 2 years ago

Thanks, that was very helpful. Here is my output of py-spy: https://1drv.ms/u/s!AhgrhZEh5e5oiLQCvekKoSXXc0mRAA

I do get this; not sure if it is important: (screenshot)

bdraco commented 2 years ago

Looks like the top usage is

  1. Stream worker (may include idle time, so could actually be lower)
  2. Compiling statistics
  3. Speedtest.net
  4. History stats sensors

bdraco commented 2 years ago

If we can get rid of this flush, it looks like it will help quite a bit: https://github.com/home-assistant/core/blob/dev/homeassistant/components/recorder/statistics.py#L246

We might be able to do something like this to avoid the flush: https://github.com/home-assistant/core/blob/dev/homeassistant/components/recorder/__init__.py#L1210 via https://github.com/home-assistant/core/blob/dev/homeassistant/components/recorder/models.py#L175

cc @emontnemery

bdraco commented 2 years ago

I might be reading the line numbers wrong as I was checking against dev and things have changed a bit in the last 48 hours

Which version are you running?

egranto commented 2 years ago

I have 5 Foscams using the Home Assistant integration, if that helps with the stream worker load.

I could remove Speedtest because I'm also getting that value from UniFi.

Well, I was going to say latest, but an update just appeared: (screenshot)

bdraco commented 2 years ago

Core CPU being 79.5% is abnormally high. Can you try temporarily disabling the cameras and see if the CPU goes down to something < 10%?

If you can get rid of Speedtest, I'd definitely recommend that, as it slows down the whole install every time it runs.

egranto commented 2 years ago

Speedtest and cameras disabled, and Core updated: (screenshot)

Should we open an issue for the Foscam integration? I can live without Speedtest, but my doorbells are set up with AI in Home Assistant and Deepstack.

Update: it is even getting lower. (screenshot)

egranto commented 2 years ago

I run System Monitor; this is what I could get from the last 30 days: (screenshot)

Feature request: we need Supervisor/Core sensors for CPU and RAM usage.

egranto commented 2 years ago

Logger: homeassistant.components.recorder
Source: components/recorder/__init__.py:676
Integration: Recorder (documentation, issues)
First occurred: 5:21:04 PM (1 occurrences)
Last logged: 5:21:04 PM

The recorder queue reached the maximum size of 30000; Events are no longer being recorded

(screenshot) Disabling the cameras and Speedtest didn't do the trick.

bdraco commented 2 years ago

Can you get a dump of which queries are running on your mysql server when the recorder memory buffer overflows?
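For the dump requested above, MariaDB/MySQL can list the currently executing queries with standard server commands (not Home Assistant specific):

```sql
-- Show every client connection and the query it is currently running.
SHOW FULL PROCESSLIST;

-- Or, via information_schema, filtered to active queries and sorted by runtime:
SELECT id, time, state, info
FROM information_schema.processlist
WHERE command != 'Sleep'
ORDER BY time DESC;
```

Running either repeatedly while the queue is growing shows which statements are holding the recorder up.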

bdraco commented 2 years ago

(screenshot: Screen_Shot_2022-04-08_at_21_15_15)

It looks like these two queries are run in a loop and take up the most time.

We might be able to optimize it in these two places by combining each loop into a single query for all the entity_ids and then extracting the results:

  1. https://github.com/home-assistant/core/blob/dev/homeassistant/components/sensor/recorder.py#L514

            last_stats = statistics.get_last_short_term_statistics(
                hass, 1, entity_id, False
            )
  2. https://github.com/home-assistant/core/blob/dev/homeassistant/components/recorder/statistics.py#L240

    old_metadata_dict = get_metadata_with_session(
        hass, session, statistic_ids=[statistic_id]
    )
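The optimization suggested above, replacing a per-entity query loop with one query for all entities, can be sketched against a toy version of the states table. This is an illustrative sketch only (the table here is simplified and the entity names are taken from this thread; it is not the recorder's actual code):

```python
import sqlite3

# Toy states table mirroring the columns referenced in this thread.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE states (entity_id TEXT, state TEXT, last_updated TEXT)")
conn.executemany(
    "INSERT INTO states VALUES (?, ?, ?)",
    [
        ("sensor.alita_ph", "7.1", "2022-04-14 10:00:00"),
        ("sensor.alita_ph", "7.2", "2022-04-14 11:00:00"),
        ("sensor.alita_tds", "410", "2022-04-14 10:30:00"),
        ("sensor.alita_tds", "415", "2022-04-14 11:30:00"),
    ],
)

# One round trip fetching the latest state for every entity of interest,
# instead of issuing one query per entity_id in a loop.
latest = conn.execute(
    """
    SELECT entity_id, state, last_updated FROM (
        SELECT entity_id, state, last_updated,
               ROW_NUMBER() OVER (
                   PARTITION BY entity_id ORDER BY last_updated DESC
               ) AS rn
        FROM states
        WHERE entity_id IN ('sensor.alita_ph', 'sensor.alita_tds')
    ) WHERE rn = 1
    """
).fetchall()
print(sorted(latest))
# [('sensor.alita_ph', '7.2', '2022-04-14 11:00:00'), ('sensor.alita_tds', '415', '2022-04-14 11:30:00')]
```

With 1000+ sensors, collapsing the loop like this trades N round trips (and N index lookups' worth of latency) for a single one.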
egranto commented 2 years ago

Can you get a dump of which queries are running on your mysql server when the recorder memory buffer overflows?

I don't have logging enabled in the DB.

egranto commented 2 years ago

Yesterday was worse; first I had a power outage (blue marks), and my InfluxDB wasn't in auto-start mode (it runs in Docker, not HA): (screenshot)

(screenshot) But it seems worse because it has only been collecting for a few hours.

Another py-spy output, HA logs with the recorder debugger on, and MariaDB logs:
https://1drv.ms/u/s!AhgrhZEh5e5oiLQGL4qIP--pfujWLQ?e=ehbxze (I'm sharing the whole folder; let me know if that works)

Today I turned my cameras back on because it seems they weren't the issue.

egranto commented 2 years ago

https://snapshots.raintank.io/dashboard/snapshot/7zHDf7OdQNfio1SGbcMpPRNXCzFT5sdw

I created a new DB, thinking mine was too big or had somehow gotten corrupted, and added the Telegraf plugin for my MariaDB.

(screenshot) I think those non-indexed reads and lots of sorts could be one of the issues.

(screenshot)

In the slow query log, it seems it is trying to execute the same slow query twice:

# Time: 220414  8:30:16
# User@Host: homeassistant[homeassistant] @ homeassistant.grantotarola.ddns.net [192.168.0.113]
# Thread_id: 45895  Schema: HomeAssistant2022_4_2  QC_hit: No
# Query_time: 14.402848  Lock_time: 0.000224  Rows_sent: 20  Rows_examined: 2894
# Rows_affected: 0  Bytes_sent: 8139
SET timestamp=1649946616;
SELECT states.entity_id AS states_entity_id, states.state AS states_state, states.last_changed AS states_last_changed, states.last_updated AS states_last_updated, states.attributes AS states_attributes, state_attributes.shared_attrs AS state_attributes_shared_attrs 
FROM states LEFT OUTER JOIN state_attributes ON states.attributes_id = state_attributes.attributes_id 
WHERE states.last_changed = states.last_updated AND states.last_updated > '2022-04-07 14:30:02.585920' AND states.entity_id = 'sensor.alita_ph' ORDER BY states.entity_id, states.last_updated DESC 
 LIMIT 20;
# User@Host: homeassistant[homeassistant] @ homeassistant.grantotarola.ddns.net [192.168.0.113]
# Thread_id: 45896  Schema: HomeAssistant2022_4_2  QC_hit: No
# Query_time: 14.413257  Lock_time: 0.000221  Rows_sent: 20  Rows_examined: 2894
# Rows_affected: 0  Bytes_sent: 8139
SET timestamp=1649946616;
SELECT states.entity_id AS states_entity_id, states.state AS states_state, states.last_changed AS states_last_changed, states.last_updated AS states_last_updated, states.attributes AS states_attributes, state_attributes.shared_attrs AS state_attributes_shared_attrs 
FROM states LEFT OUTER JOIN state_attributes ON states.attributes_id = state_attributes.attributes_id 
WHERE states.last_changed = states.last_updated AND states.last_updated > '2022-04-07 14:30:02.582386' AND states.entity_id = 'sensor.alita_ph' ORDER BY states.entity_id, states.last_updated DESC 
 LIMIT 20;
# Time: 220414  8:30:17
# User@Host: homeassistant[homeassistant] @ homeassistant.grantotarola.ddns.net [192.168.0.113]
# Thread_id: 45875  Schema: HomeAssistant2022_4_2  QC_hit: No
# Query_time: 14.949043  Lock_time: 0.000336  Rows_sent: 20  Rows_examined: 4415
# Rows_affected: 0  Bytes_sent: 8516
SET timestamp=1649946617;
SELECT states.entity_id AS states_entity_id, states.state AS states_state, states.last_changed AS states_last_changed, states.last_updated AS states_last_updated, states.attributes AS states_attributes, state_attributes.shared_attrs AS state_attributes_shared_attrs 
FROM states LEFT OUTER JOIN state_attributes ON states.attributes_id = state_attributes.attributes_id 
WHERE states.last_changed = states.last_updated AND states.last_updated > '2022-04-07 14:30:02.603907' AND states.entity_id = 'sensor.alita_tds' ORDER BY states.entity_id, states.last_updated DESC 
 LIMIT 20;
# User@Host: homeassistant[homeassistant] @ homeassistant.grantotarola.ddns.net [192.168.0.113]
# Thread_id: 45897  Schema: HomeAssistant2022_4_2  QC_hit: No
# Query_time: 14.894783  Lock_time: 0.001365  Rows_sent: 20  Rows_examined: 4415
# Rows_affected: 0  Bytes_sent: 8516
SET timestamp=1649946617;
SELECT states.entity_id AS states_entity_id, states.state AS states_state, states.last_changed AS states_last_changed, states.last_updated AS states_last_updated, states.attributes AS states_attributes, state_attributes.shared_attrs AS state_attributes_shared_attrs 
FROM states LEFT OUTER JOIN state_attributes ON states.attributes_id = state_attributes.attributes_id 
WHERE states.last_changed = states.last_updated AND states.last_updated > '2022-04-07 14:30:02.666013' AND states.entity_id = 'sensor.alita_tds' ORDER BY states.entity_id, states.last_updated DESC 
 LIMIT 20;
# Time: 220414  8:30:33
# User@Host: homeassistant[homeassistant] @ homeassistant.grantotarola.ddns.net [192.168.0.113]
# Thread_id: 45896  Schema: HomeAssistant2022_4_2  QC_hit: No
# Query_time: 15.984333  Lock_time: 0.000223  Rows_sent: 20  Rows_examined: 27659
# Rows_affected: 0  Bytes_sent: 9526
SET timestamp=1649946633;
SELECT states.entity_id AS states_entity_id, states.state AS states_state, states.last_changed AS states_last_changed, states.last_updated AS states_last_updated, states.attributes AS states_attributes, state_attributes.shared_attrs AS state_attributes_shared_attrs 
FROM states LEFT OUTER JOIN state_attributes ON states.attributes_id = state_attributes.attributes_id 
WHERE states.last_changed = states.last_updated AND states.last_updated > '2022-04-07 14:30:17.043295' AND states.entity_id = 'sensor.alita_temperatura_agua' ORDER BY states.entity_id, states.last_updated DESC 
 LIMIT 20;
# User@Host: homeassistant[homeassistant] @ homeassistant.grantotarola.ddns.net [192.168.0.113]
# Thread_id: 45895  Schema: HomeAssistant2022_4_2  QC_hit: No
# Query_time: 16.085556  Lock_time: 0.000239  Rows_sent: 20  Rows_examined: 27659
# Rows_affected: 0  Bytes_sent: 9526
SET timestamp=1649946633;
SELECT states.entity_id AS states_entity_id, states.state AS states_state, states.last_changed AS states_last_changed, states.last_updated AS states_last_updated, states.attributes AS states_attributes, state_attributes.shared_attrs AS state_attributes_shared_attrs 
FROM states LEFT OUTER JOIN state_attributes ON states.attributes_id = state_attributes.attributes_id 
WHERE states.last_changed = states.last_updated AND states.last_updated > '2022-04-07 14:30:17.032725' AND states.entity_id = 'sensor.alita_temperatura_agua' ORDER BY states.entity_id, states.last_updated DESC 
 LIMIT 20;
# Time: 220414 10:26:02
# User@Host: homeassistant[homeassistant] @ homeassistant.grantotarola.ddns.net [192.168.0.113]
# Thread_id: 47273  Schema: HomeAssistant2022_4_2  QC_hit: No
# Query_time: 10.468085  Lock_time: 0.000167  Rows_sent: 20  Rows_examined: 2965
# Rows_affected: 0  Bytes_sent: 8199
SET timestamp=1649953562;
SELECT states.entity_id AS states_entity_id, states.state AS states_state, states.last_changed AS states_last_changed, states.last_updated AS states_last_updated, states.attributes AS states_attributes, state_attributes.shared_attrs AS state_attributes_shared_attrs 
FROM states LEFT OUTER JOIN state_attributes ON states.attributes_id = state_attributes.attributes_id 
WHERE states.last_changed = states.last_updated AND states.last_updated > '2022-04-07 16:25:51.983359' AND states.entity_id = 'sensor.alita_ph' ORDER BY states.entity_id, states.last_updated DESC 
 LIMIT 20;
# User@Host: homeassistant[homeassistant] @ homeassistant.grantotarola.ddns.net [192.168.0.113]
# Thread_id: 47285  Schema: HomeAssistant2022_4_2  QC_hit: No
# Query_time: 10.456061  Lock_time: 0.000304  Rows_sent: 20  Rows_examined: 2965
# Rows_affected: 0  Bytes_sent: 8199
SET timestamp=1649953562;
SELECT states.entity_id AS states_entity_id, states.state AS states_state, states.last_changed AS states_last_changed, states.last_updated AS states_last_updated, states.attributes AS states_attributes, state_attributes.shared_attrs AS state_attributes_shared_attrs 
FROM states LEFT OUTER JOIN state_attributes ON states.attributes_id = state_attributes.attributes_id 
WHERE states.last_changed = states.last_updated AND states.last_updated > '2022-04-07 16:25:51.997580' AND states.entity_id = 'sensor.alita_ph' ORDER BY states.entity_id, states.last_updated DESC 
 LIMIT 20;
# Time: 220414 10:26:12
# User@Host: homeassistant[homeassistant] @ homeassistant.grantotarola.ddns.net [192.168.0.113]
# Thread_id: 47287  Schema: HomeAssistant2022_4_2  QC_hit: No
# Query_time: 10.270182  Lock_time: 0.000236  Rows_sent: 20  Rows_examined: 28157
# Rows_affected: 0  Bytes_sent: 9765
SET timestamp=1649953572;
SELECT states.entity_id AS states_entity_id, states.state AS states_state, states.last_changed AS states_last_changed, states.last_updated AS states_last_updated, states.attributes AS states_attributes, state_attributes.shared_attrs AS state_attributes_shared_attrs 
FROM states LEFT OUTER JOIN state_attributes ON states.attributes_id = state_attributes.attributes_id 
WHERE states.last_changed = states.last_updated AND states.last_updated > '2022-04-07 16:26:01.979383' AND states.entity_id = 'sensor.alita_temperatura_agua' ORDER BY states.entity_id, states.last_updated DESC 
 LIMIT 20;
# User@Host: homeassistant[homeassistant] @ homeassistant.grantotarola.ddns.net [192.168.0.113]
# Thread_id: 47286  Schema: HomeAssistant2022_4_2  QC_hit: No
# Query_time: 10.299549  Lock_time: 0.000228  Rows_sent: 20  Rows_examined: 28157
# Rows_affected: 0  Bytes_sent: 9765
SET timestamp=1649953572;
SELECT states.entity_id AS states_entity_id, states.state AS states_state, states.last_changed AS states_last_changed, states.last_updated AS states_last_updated, states.attributes AS states_attributes, state_attributes.shared_attrs AS state_attributes_shared_attrs 
FROM states LEFT OUTER JOIN state_attributes ON states.attributes_id = state_attributes.attributes_id 
WHERE states.last_changed = states.last_updated AND states.last_updated > '2022-04-07 16:26:01.964595' AND states.entity_id = 'sensor.alita_temperatura_agua' ORDER BY states.entity_id, states.last_updated DESC 
 LIMIT 20;
 2022-04-14 10:15:28 DEBUG (Recorder) [homeassistant.components.recorder] Connected to recorder database
2022-04-14 10:15:28 DEBUG (Recorder) [homeassistant.components.recorder] Compiling missing statistics for 2022-04-14 16:10:00+00:00-2022-04-14 16:15:00+00:00

....
2022-04-14 10:35:28 DEBUG (DbWorker_0) [homeassistant.components.recorder.util] querying 0 rows took 0.034111s
2022-04-14 10:35:28 DEBUG (DbWorker_0) [homeassistant.components.recorder.history] get_significant_states took 0.035457s
2022-04-14 10:35:28 DEBUG (DbWorker_0) [homeassistant.components.recorder.history] getting 1 first datapoints took 0.000005s
2022-04-14 10:35:28 DEBUG (MainThread) [homeassistant.components.recorder] Recorder queue size is: 14048
2022-04-14 10:35:28 DEBUG (DbWorker_2) [homeassistant.components.recorder.util] querying 0 rows took 0.076766s
2022-04-14 10:35:28 DEBUG (DbWorker_2) [homeassistant.components.recorder.history] get_significant_states took 0.077659s
.....

2022-04-14 11:05:28 DEBUG (DbWorker_2) [homeassistant.components.recorder.util] querying 0 rows took 0.003971s
2022-04-14 11:05:28 DEBUG (DbWorker_2) [homeassistant.components.recorder.history] get_significant_states took 0.004517s
2022-04-14 11:05:28 DEBUG (DbWorker_2) [homeassistant.components.recorder.history] getting 1 first datapoints took 0.000007s
2022-04-14 11:05:28 DEBUG (MainThread) [homeassistant.components.recorder] Recorder queue size is: 29891
2022-04-14 11:05:28 DEBUG (DbWorker_3) [homeassistant.components.recorder.util] querying 0 rows took 0.006454s

....
2022-04-14 11:15:25 DEBUG (DbWorker_0) [homeassistant.components.recorder.history] getting 1 first datapoints took 0.000003s
2022-04-14 11:15:25 DEBUG (DbWorker_2) [homeassistant.components.recorder.util] querying 1 rows took 0.020831s
2022-04-14 11:15:25 DEBUG (DbWorker_2) [homeassistant.components.recorder.history] get_significant_states took 0.021567s
2022-04-14 11:15:25 DEBUG (DbWorker_2) [homeassistant.components.recorder.history] getting 1 first datapoints took 0.000004s
2022-04-14 11:15:28 DEBUG (MainThread) [homeassistant.components.recorder] Recorder queue size is: 34966
2022-04-14 11:15:28 ERROR (MainThread) [homeassistant.components.recorder] The recorder queue reached the maximum size of 30000; Events are no longer being recorded
2022-04-14 11:15:28 DEBUG (DbWorker_0) [homeassistant.components.recorder.util] querying 0 rows took 0.707786s
2022-04-14 11:15:28 DEBUG (DbWorker_0) [homeassistant.components.recorder.history] get_significant_states took 0.708419s
2022-04-14 11:15:28 DEBUG (DbWorker_0) [homeassistant.components.recorder.history] getting 1 first datapoints took 0.000026s
2022-04-14 11:15:28 DEBUG (DbWorker_2) [homeassistant.components.recorder.util] querying 1 rows took 0.558484s

(screenshot)

This last time it only ran for 1 hour before we reached the queue limit.

bdraco commented 2 years ago

We don't do LIMIT 20 anywhere in core code AFAICT. Is this coming from a custom integration or a frontend card?

Edit: it looks like statistics sensors have a DEFAULT_BUFFER_SIZE = 20

bdraco commented 2 years ago

1drv.ms/u/s!AhgrhZEh5e5oiLQGL4qIP--pfujWLQ?e=ehbxze i'm sharing the whole folder let me know if that works

For the py-spy%20(1).svg file, the linked PR (#69731) should make compiling statistics roughly 80-100x faster, which is the primary database usage in that recording. This seems like a major contributing factor, as it blocks state inserts from happening, which means the queue of pending events to process grows.

bdraco commented 2 years ago

It looks like you have history_stats and statistics sensors that are also contributing to the load. I'm not sure if we can do anything to further optimize them as the underlying code is already heavily optimized.

egranto commented 2 years ago

I'm checking, and most of the entities listed are the ones I use in statistics sensors:

(screenshot)

(screenshot)

bdraco commented 2 years ago

Query:

# User@Host: homeassistant[homeassistant] @ homeassistant.grantotarola.ddns.net [192.168.0.113]
# Thread_id: 45896  Schema: HomeAssistant2022_4_2  QC_hit: No
# Query_time: 14.413257  Lock_time: 0.000221  Rows_sent: 20  Rows_examined: 2894
# Rows_affected: 0  Bytes_sent: 8139
SET timestamp=1649946616;
SELECT states.entity_id AS states_entity_id, states.state AS states_state, states.last_changed AS states_last_changed, states.last_updated AS states_last_updated, states.attributes AS states_attributes, state_attributes.shared_attrs AS state_attributes_shared_attrs
FROM states LEFT OUTER JOIN state_attributes ON states.attributes_id = state_attributes.attributes_id
WHERE states.last_changed = states.last_updated AND states.last_updated > '2022-04-07 14:30:02.582386' AND states.entity_id = 'sensor.alita_ph' ORDER BY states.entity_id, states.last_updated DESC
 LIMIT 20;

Something like this should be a VERY fast query, since it is almost fully covered by the index. It definitely shouldn't take 14.413257 s:

sqlite> explain query plan SELECT states.entity_id AS states_entity_id, states.state AS states_state, states.last_changed AS states_last_changed, states.last_updated AS states_last_updated, states.attributes AS states_attributes, state_attributes.shared_attrs AS state_attributes_shared_attrs 
   ...> FROM states LEFT OUTER JOIN state_attributes ON states.attributes_id = state_attributes.attributes_id 
   ...> WHERE states.last_changed = states.last_updated AND states.last_updated > '2022-04-07 14:30:02.582386' AND states.entity_id = 'sensor.memory_free' ORDER BY states.entity_id, states.last_updated DESC 
   ...>  LIMIT 20;
0|0|0|SEARCH TABLE states USING INDEX ix_states_entity_id_last_updated (entity_id=? AND last_updated>?)
0|1|1|SEARCH TABLE state_attributes USING INTEGER PRIMARY KEY (rowid=?)
sqlite> SELECT states.entity_id AS states_entity_id, states.state AS states_state, states.last_changed AS states_last_changed, states.last_updated AS states_last_updated, states.attributes AS states_attributes, state_attributes.shared_attrs AS state_attributes_shared_attrs 
   ...> FROM states LEFT OUTER JOIN state_attributes ON states.attributes_id = state_attributes.attributes_id 
   ...> WHERE states.last_changed = states.last_updated AND states.last_updated > '2022-04-07 14:30:02.582386' AND states.entity_id = 'sensor.memory_free' ORDER BY states.entity_id, states.last_updated DESC 
   ...>  LIMIT 20;
sensor.memory_free|17336.2|2022-04-14 18:42:02.085292|2022-04-14 18:42:02.085292||{"state_class":"measurement","unit_of_measurement":"MiB","icon":"mdi:memory","friendly_name":"Memory free"}
sensor.memory_free|17335.9|2022-04-14 18:41:47.083737|2022-04-14 18:41:47.083737||{"state_class":"measurement","unit_of_measurement":"MiB","icon":"mdi:memory","friendly_name":"Memory free"}
sensor.memory_free|17327.8|2022-04-14 18:41:32.083659|2022-04-14 18:41:32.083659||{"state_class":"measurement","unit_of_measurement":"MiB","icon":"mdi:memory","friendly_name":"Memory free"}
sensor.memory_free|17334.2|2022-04-14 18:41:17.081795|2022-04-14 18:41:17.081795||{"state_class":"measurement","unit_of_measurement":"MiB","icon":"mdi:memory","friendly_name":"Memory free"}
sensor.memory_free|17340.4|2022-04-14 18:41:02.080648|2022-04-14 18:41:02.080648||{"state_class":"measurement","unit_of_measurement":"MiB","icon":"mdi:memory","friendly_name":"Memory free"}
sensor.memory_free|17338.1|2022-04-14 18:40:47.080101|2022-04-14 18:40:47.080101||{"state_class":"measurement","unit_of_measurement":"MiB","icon":"mdi:memory","friendly_name":"Memory free"}
sensor.memory_free|17346.7|2022-04-14 18:40:32.079091|2022-04-14 18:40:32.079091||{"state_class":"measurement","unit_of_measurement":"MiB","icon":"mdi:memory","friendly_name":"Memory free"}
sensor.memory_free|17344.5|2022-04-14 18:40:17.078156|2022-04-14 18:40:17.078156||{"state_class":"measurement","unit_of_measurement":"MiB","icon":"mdi:memory","friendly_name":"Memory free"}
sensor.memory_free|17344.1|2022-04-14 18:40:02.076039|2022-04-14 18:40:02.076039||{"state_class":"measurement","unit_of_measurement":"MiB","icon":"mdi:memory","friendly_name":"Memory free"}
sensor.memory_free|17341.4|2022-04-14 18:39:47.074487|2022-04-14 18:39:47.074487||{"state_class":"measurement","unit_of_measurement":"MiB","icon":"mdi:memory","friendly_name":"Memory free"}
sensor.memory_free|17343.1|2022-04-14 18:39:32.072883|2022-04-14 18:39:32.072883||{"state_class":"measurement","unit_of_measurement":"MiB","icon":"mdi:memory","friendly_name":"Memory free"}
sensor.memory_free|17346.0|2022-04-14 18:39:17.073227|2022-04-14 18:39:17.073227||{"state_class":"measurement","unit_of_measurement":"MiB","icon":"mdi:memory","friendly_name":"Memory free"}
sensor.memory_free|17344.0|2022-04-14 18:39:02.071094|2022-04-14 18:39:02.071094||{"state_class":"measurement","unit_of_measurement":"MiB","icon":"mdi:memory","friendly_name":"Memory free"}
sensor.memory_free|17347.5|2022-04-14 18:38:47.070083|2022-04-14 18:38:47.070083||{"state_class":"measurement","unit_of_measurement":"MiB","icon":"mdi:memory","friendly_name":"Memory free"}
sensor.memory_free|17347.4|2022-04-14 18:38:32.070160|2022-04-14 18:38:32.070160||{"state_class":"measurement","unit_of_measurement":"MiB","icon":"mdi:memory","friendly_name":"Memory free"}
sensor.memory_free|17340.0|2022-04-14 18:38:17.068849|2022-04-14 18:38:17.068849||{"state_class":"measurement","unit_of_measurement":"MiB","icon":"mdi:memory","friendly_name":"Memory free"}
sensor.memory_free|17347.1|2022-04-14 18:38:02.068172|2022-04-14 18:38:02.068172||{"state_class":"measurement","unit_of_measurement":"MiB","icon":"mdi:memory","friendly_name":"Memory free"}
sensor.memory_free|17343.5|2022-04-14 18:37:32.066484|2022-04-14 18:37:32.066484||{"state_class":"measurement","unit_of_measurement":"MiB","icon":"mdi:memory","friendly_name":"Memory free"}
sensor.memory_free|17340.3|2022-04-14 18:37:17.064507|2022-04-14 18:37:17.064507||{"state_class":"measurement","unit_of_measurement":"MiB","icon":"mdi:memory","friendly_name":"Memory free"}
sensor.memory_free|17346.8|2022-04-14 18:37:02.063351|2022-04-14 18:37:02.063351||{"state_class":"measurement","unit_of_measurement":"MiB","icon":"mdi:memory","friendly_name":"Memory free"}
Run Time: real 0.001 user 0.000000 sys 0.000000
sqlite> 

bdraco commented 2 years ago

How many states do you have for sensor.alita_ph?

select count(*) from states where entity_id='sensor.alita_ph' ;

egranto commented 2 years ago

It happened again, within an hour of restart:


2022-04-14 11:45:44 DEBUG (Recorder) [homeassistant.components.recorder] Connected to recorder database
....

2022-04-14 12:35:45 DEBUG (MainThread) [homeassistant.components.recorder] Recorder queue size is: 33331
2022-04-14 12:35:45 ERROR (MainThread) [homeassistant.components.recorder] The recorder queue reached the maximum size of 30000; Events are no longer being recorded

image

bdraco commented 2 years ago

We can likely move it up to 40k in memory before it becomes a problem, but in practice it shouldn't get that high unless disk I/O or CPU is constrained.

egranto commented 2 years ago

My Synology uses an M.2 cache on the disk where I keep the DB; the Synology OS runs on an SSD, and the unit has a total of 16 GB of RAM.

The VM has 3 cores and 6 GB of RAM. I just disabled all my statistics and history sensors; maybe I'll have to rebuild them all with helpers. I've had those sensors for more than a year, around two if I'm not mistaken.

bdraco commented 2 years ago

Are you sure your disk array is healthy? We have seen similar issues when one of the disks is starting to go.

egranto commented 2 years ago

image

2022-04-14 12:50:49 DEBUG (Recorder) [homeassistant.components.recorder] Connected to recorder database
....
2022-04-14 14:00:50 ERROR (MainThread) [homeassistant.components.recorder] The recorder queue reached the maximum size of 30000; Events are no longer being recorded

This is without the statistics or history sensors.

bdraco commented 2 years ago

Let's wait for https://github.com/home-assistant/core/pull/69731 to get published and see if it solves it for you. I'm pretty confident it will make the situation much better.

Once it does, if you can provide dumps like you did in https://github.com/home-assistant/core/issues/56987#issuecomment-1094265142 we can do another optimization pass if there are still areas that need improvement.

egranto commented 2 years ago

OK, as soon as I update I'll pass along all the info I can. Thanks for looking into it.

bdraco commented 2 years ago

Let's keep this open for now so we can look at other optimization opportunities once it's published.

bdraco commented 2 years ago

I think we can reduce the number of queries in history stats as well, if the state-change event is in the window and we already have a previous history list for the window.
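A hedged sketch of what that optimization could look like: when a state-change event falls inside the stats window and a history list for that window is already in hand, append the event locally instead of issuing another DB query; only refetch for events outside the window. All names here (`WindowCache`, `on_state_change`, `fetch_from_db`) are illustrative, not Home Assistant's API:

```python
# Sketch: avoid re-querying the DB for state changes that land inside an
# already-fetched stats window. Hypothetical structure, not HA's code.
from datetime import datetime


class WindowCache:
    def __init__(self, window_start: datetime, window_end: datetime):
        self.window_start = window_start
        self.window_end = window_end
        self.history: list = []  # states already fetched for the window
        self.db_queries = 0      # counter to show queries saved

    def fetch_from_db(self) -> list:
        self.db_queries += 1
        return []  # placeholder for a real recorder query

    def on_state_change(self, when: datetime, state) -> None:
        if self.window_start <= when <= self.window_end:
            # In-window event: extend the cached list, no DB round trip.
            self.history.append((when, state))
        else:
            # Outside the window: fall back to a fresh query.
            self.history = self.fetch_from_db()
```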