Closed ic-dev21 closed 2 months ago
Install the profiler integration and use the profiler.start
service to generate a callgrind
https://www.home-assistant.io/integrations/profiler/#service-profilerstart
Post the callgrind here and we can take a look
# Example configuration.yaml entry
logger:
default: info
logs:
homeassistant.core: debug
You might want to enable debug logging for core as well as it will show you where the massive amount of events you see in your log entry are coming from.
Install the profiler integration and use the
profiler.start
service to generate a callgrindhttps://www.home-assistant.io/integrations/profiler/#service-profilerstart
Post the callgrind here and we can take a look
I will have to try this later today as I'm not home to test. I did notice when updating to 2024.4.1 that a mushroom card (HACS) would constantly timeout or reload. I removed this integration and went back to the 2024.4.1 image. If I start racking up CPU cycles again I'll let you know tonight.
# Example configuration.yaml entry logger: default: info logs: homeassistant.core: debug
You might want to enable debug logging for core as well as it will show you where the massive amount of events you see in your log entry are coming from.
My install locked up again at 100% CPU after an hour.
This release is broken. Reverting fixes it.
Adding debug logging for core and profiler.
I run three instances (2 dev for a custom component and 1 prod). Only my prod experiences this. My main dev is on 2024.4.1 with a fully vanilla instance except for the custom component I maintain and doesn't experience this.
My prod has my whole house with 100% native HA components + my custom component.
@bdraco when do I need to run profiler? When CPU is high or anytime?
Generally you want to do it with both cases so you have a baseline to compare against
Generally you want to do it with both cases so you have a baseline to compare against
Profiler is running now. Baseline below Archive.zip
Baseline log home-assistant.log
Both of those look OK which is a good baseline to compare with when the problem does happen
Both of those look OK which is a good baseline to compare with when the problem does happen
As luck would have it it's running super stable at the moment. I've tried upgrading multiple times over the last few days with the same result every time. Full system lockup within an hour or two with python3 pegging 1 CPU at 100%. Will keep the logs going for a while still.
In between this test and the last one, I did update zigbee2mqtt. Perhaps this may help with troublehsooting.
Woke up this morning to a completely locked up and unresponsive system. Had to hard reset the host as it was unresponsive. No profiler but here's the huge log. home-assistant.log.zip
Reverting back again. This release is definitely broken for me. Woke up to a cold house as no automation for the heat worked.
Woke up this morning to a completely locked up and unresponsive system. Had to hard reset it. No profiler but here's the huge log. home-assistant.log.zip
Reverting back again. This release is definitely broken for me. Woke up to a cold house as no automation for the heat worked.
@bdraco have you guys changed how automations speak to the event bus? Looking at this it seems one of my automations that has worked fine for years is going haywire and triggering every millisecond.
Whatever it is you guys did needs reversing. This is a native automation, using a native HA component.
That automation is due to your native gree component losing connectivity to the AC so I wait for it to become available again and resend the command it missed.
Automation is below:
alias: Raise temperature error check
description: ""
trigger:
- platform: state
entity_id:
- automation.raise_temperature
attribute: last_triggered
id: Automation Start
for:
hours: 0
minutes: 0
seconds: 0
- platform: state
entity_id:
- automation.vero_6_8
attribute: last_triggered
id: 6-8
- platform: state
entity_id:
- climate.climatisation
from: unavailable
to: heat
id: Unavailable
- platform: state
entity_id:
- automation.raise_temperature_end
attribute: last_triggered
id: Automation End
for:
hours: 0
minutes: 0
seconds: 0
condition: []
action:
- choose:
- conditions:
- condition: trigger
id:
- Automation Start
sequence:
- if:
- condition: or
conditions:
- condition: state
entity_id: climate.climatisation
state: unavailable
- condition: numeric_state
entity_id: climate.climatisation
below: 23
attribute: temperature
- condition: state
entity_id: input_boolean.raise_temperature_on
state: "off"
then:
- wait_for_trigger: []
timeout:
hours: 0
minutes: 5
seconds: 0
milliseconds: 0
continue_on_timeout: false
- service: automation.trigger
metadata: {}
data:
skip_condition: false
target:
entity_id: automation.raise_temperature
alias: Raise Temperature Start
- conditions:
- condition: trigger
id:
- 6-8
sequence: []
- conditions:
- condition: trigger
id:
- Automation End
sequence:
- if:
- condition: or
conditions:
- condition: state
entity_id: climate.climatisation
state: unavailable
- condition: numeric_state
entity_id: climate.climatisation
attribute: temperature
above: 22
- condition: state
entity_id: input_boolean.raise_temperature_on
state: "on"
then:
- wait_for_trigger: []
timeout:
hours: 0
minutes: 5
seconds: 0
milliseconds: 0
continue_on_timeout: false
- service: automation.trigger
metadata: {}
data:
skip_condition: true
target:
entity_id: automation.raise_temperature_end
alias: Raise Temperature End
mode: single
Nothing fancy.
@bdraco Would your timer fix do anything for me? I can give it a try in 2024.4.2 I'll be home for a while and can maybe run a profiler if the host goes nuts again.
This one : https://github.com/home-assistant/core/pull/115110
I have yet to fully step through it, though. My day job is hectic right now, so I only have a little time for HA. I'll take another look before the weekend if I get free cycles. From a cursory review, I don't think so since your automation doesn't look like you have any until
or while
conditions.
However, your automation is calling several other automations, so please post those as well.
I have yet to fully step through it, though. My day job is hectic right now, so I only have a little time for HA. I'll take another look before the weekend if I get free cycles. From a cursory review, I don't think so since your automation doesn't look like you have any
until
orwhile
conditions.However, your automation is calling several other automations, so please post those as well.
Here you go. First one:
alias: Raise Temperature
description: ""
trigger:
- platform: numeric_state
entity_id: climate.thermostat_cuisine
for:
hours: 0
minutes: 15
seconds: 0
below: 20.6
attribute: current_temperature
id: LowTemp Thermostat
- platform: state
entity_id:
- binary_sensor.puissance_thermostat_en_haut
to: "on"
id: Puissance bool
- platform: numeric_state
entity_id: sensor.thermostat_cuisine_power
above: 1500
id: Puissance
- platform: numeric_state
entity_id:
- sensor.sensorsaloncuisine_temperature
for:
hours: 0
minutes: 15
seconds: 0
below: 21.8
id: Sensor
- platform: state
entity_id:
- sensor.defi_hilo
id: Défi
to: pre_heat
- platform: state
entity_id:
- input_boolean.heat_boost_start
to: "on"
id: Manuel
condition: []
action:
- choose:
- conditions:
- condition: trigger
id:
- Manuel
- Sensor
- condition: or
conditions:
- condition: time
before: "22:00:00"
after: "05:30:00"
- condition: state
entity_id: sensor.defi_hilo
state: pre_heat
- condition: not
conditions:
- condition: state
entity_id: sensor.season
state: summer
- condition: state
entity_id: input_boolean.raise_temperature_on
state: "on"
- condition: state
entity_id: climate.climatisation
state: cool
- condition: or
conditions:
- condition: state
entity_id: person.vero
state: home
- condition: state
entity_id: person.ian_couture
state: home
- condition: state
entity_id: input_boolean.prechauffe_pm_debutee
state: "on"
- condition: state
entity_id: input_boolean.gardienne_chouchoune
state: "on"
- condition: state
entity_id: input_boolean.prechauffe_pm_debutee_permissive
state: "on"
- condition: or
conditions:
- condition: state
entity_id: input_boolean.chauffage_chambres
state: "on"
- condition: state
entity_id: input_boolean.chauffage
state: "on"
- condition: numeric_state
entity_id: climate.thermostat_cuisine
attribute: temperature
above: 19
sequence:
- service: climate.set_temperature
data:
temperature: >-
{{max(state_attr('climate.climatisation',
'current_temperature'), 22) | int + 2}}
target:
device_id: 64bfa1752510b1baf9d7130e80a621ed
- service: climate.set_preset_mode
data:
preset_mode: boost
target:
device_id: 64bfa1752510b1baf9d7130e80a621ed
- service: input_boolean.turn_on
data: {}
target:
entity_id: input_boolean.raise_temperature_on
- delay:
hours: 0
minutes: 5
seconds: 0
milliseconds: 0
- service: input_boolean.turn_off
data: {}
target:
entity_id: input_boolean.heat_boost_start
- service: climate.set_preset_mode
data:
preset_mode: none
target:
device_id: 64bfa1752510b1baf9d7130e80a621ed
- conditions:
- condition: trigger
id:
- LowTemp Thermostat
- Puissance
- Défi
- Puissance bool
- condition: or
conditions:
- condition: time
before: "22:00:00"
after: "05:30:00"
- condition: state
entity_id: sensor.defi_hilo
state: pre_heat
- condition: not
conditions:
- condition: state
entity_id: sensor.season
state: summer
- condition: state
entity_id: input_boolean.raise_temperature_on
state: "on"
- condition: state
entity_id: climate.climatisation
state: cool
- condition: or
conditions:
- condition: state
entity_id: person.vero
state: home
- condition: state
entity_id: person.ian_couture
state: home
- condition: state
entity_id: input_boolean.prechauffe_pm_debutee
state: "on"
- condition: state
entity_id: input_boolean.gardienne_chouchoune
state: "on"
- condition: state
entity_id: input_boolean.prechauffe_pm_debutee_permissive
state: "on"
- condition: or
conditions:
- condition: state
entity_id: input_boolean.chauffage_chambres
state: "on"
- condition: state
entity_id: input_boolean.chauffage
state: "on"
- condition: numeric_state
entity_id: climate.thermostat_cuisine
attribute: temperature
above: 19
sequence:
- service: climate.set_temperature
data:
temperature: >-
{{max(state_attr('climate.climatisation',
'current_temperature'), 22) | int + 2}}
target:
device_id: 64bfa1752510b1baf9d7130e80a621ed
- service: climate.set_preset_mode
data:
preset_mode: boost
target:
device_id: 64bfa1752510b1baf9d7130e80a621ed
- service: input_boolean.turn_on
data: {}
target:
entity_id: input_boolean.raise_temperature_on
- delay:
hours: 0
minutes: 5
seconds: 0
milliseconds: 0
- service: input_boolean.turn_off
data: {}
target:
entity_id: input_boolean.heat_boost_start
- service: climate.set_preset_mode
data:
preset_mode: none
target:
device_id: 64bfa1752510b1baf9d7130e80a621ed
mode: single
Second one:
alias: Raise temperature end
description: Retour à la normale
trigger:
- platform: state
entity_id:
- binary_sensor.temperature_atteinte
to: "on"
from: "off"
for:
hours: 0
minutes: 5
seconds: 0
- platform: numeric_state
entity_id: sensor.thermostat_cuisine_temperature
above: 21.5
for:
hours: 0
minutes: 5
seconds: 0
- platform: time
at: "21:59:00"
- platform: state
entity_id:
- binary_sensor.puissance_thermostat_en_haut
to: "off"
for:
hours: 0
minutes: 5
seconds: 0
- platform: state
entity_id:
- sensor.defi_hilo
to: reduction
- platform: state
entity_id:
- input_boolean.heat_boost_stop
to: "on"
- platform: numeric_state
entity_id:
- sensor.sensorsaloncuisine_temperature
for:
hours: 0
minutes: 5
seconds: 0
above: 22.5
condition:
- condition: state
entity_id: input_boolean.raise_temperature_on
state: "on"
action:
- service: climate.set_temperature
data:
temperature: 22
target:
device_id: 64bfa1752510b1baf9d7130e80a621ed
- service: input_boolean.turn_off
data: {}
target:
entity_id:
- input_boolean.raise_temperature_on
- input_boolean.prechauffe_pm_debutee
- input_boolean.heat_boost_stop
- service: climate.set_preset_mode
data:
preset_mode: none
target:
entity_id: climate.climatisation
mode: single
I have yet to fully step through it, though. My day job is hectic right now, so I only have a little time for HA. I'll take another look before the weekend if I get free cycles. From a cursory review, I don't think so since your automation doesn't look like you have any
until
orwhile
conditions.However, your automation is calling several other automations, so please post those as well.
Updated to 2024.4.2 and will be actively monitoring it. Will try and run profiler before it locks up if it does.
You have a loop in your first automation; once automation.raise_temperature
is triggered by your automation, it will continue triggering as fast as the system can loop.
You have two of these blocks in your first automation:
- wait_for_trigger: [] timeout: hours: 0 minutes: 5 seconds: 0 milliseconds: 0 continue_on_timeout: false
This block does nothing since you don't specify a trigger. I just tested this, and HA will interpret a null
trigger as one that triggered and so the automation will continue immediately without delay.
Were there formerly triggers specified here?
It should be the unavailable trigger id in there.
Le lun. 8 avr. 2024 à 19:02, Rick Auch @.***> a écrit :
You have a loop in your first automation; once automation.raise_temperature is triggered by your automation, it will continue triggering as fast as the system can loop.
You have two of these blocks in your first automation:
- wait_for_trigger: [] timeout: hours: 0 minutes: 5 seconds: 0 milliseconds: 0 continue_on_timeout: false
This block does nothing since you don't specify a trigger. I just tested this, and HA will interpret a null trigger as one that triggered and so the automation will continue immediately without delay.
Were there formerly triggers specified here?
— Reply to this email directly, view it on GitHub https://github.com/home-assistant/core/issues/115072#issuecomment-2043775925, or unsubscribe https://github.com/notifications/unsubscribe-auth/AZZGCFP6NDYO5YMP22OGHVTY4MOX7AVCNFSM6AAAAABF25RMP2VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDANBTG43TKOJSGU . You are receiving this because you authored the thread.Message ID: @.***>
I upgraded to 2024.4 and had to restore from a backup as my mobile device UI is unusable. Going back to the previous release all worked for a bit and then I decided to try an upgrade again. I'm currently running 2024.4.2.
I think I've found why my system was slow. but I have no idea what caused it. My fix was adding more RAM. With no changes other than an upgrade to 2024.4.0 I've had to bump up the RAM allocated to my VM from 8GB to 16GB. When I have it set to 8GB I now have around 5GB of swap space being used!
My issues have all been caused by a bug in the timer logic in 2024.4.x releases. The issue for me is due to having a timer as a trigger in an automation. When the timer changes from active to idle the automation is triggered. The trouble is, I need to add a delay of 1 second after the timer went idle before triggering otherwise HA keeps triggering the automation.
trigger:
- platform: state
entity_id:
- timer.office_lights
for:
hours: 0
minutes: 0
seconds: 1
to: idle
from: active
You can see in this pic, the drop on CPU demand at around 11:41 when I made the change.
I've still yet to figure out why it now has a 90% CPU demand around every 15min tho :-(
I need to add a delay of 1 second after the timer went idle before triggering otherwise HA keeps triggering the automation.
trigger: - platform: state entity_id: - timer.office_lights for: hours: 0 minutes: 0 seconds: 1 to: idle from: active
Sounds like you also have a loop. Can you share the full automation?
2024.4 has included some significant improvements to speed, and therefore is uncovering some issues with users' incorrectly programmed automations & scripts where loops without interrupts exist.
I think I found it. In one part of my routine, I call timer.cancel on the office_lights timer. This seems to send the timer to active and back to idle again even when the timer was already idle. That was triggering the automation again.
alias: Automation - Office Occupancy
description: ""
trigger:
- platform: state
entity_id:
- sensor.office_motion_sensor
to: "1"
- platform: state
entity_id:
- timer.office_lights
for:
hours: 0
minutes: 0
seconds: 0
to: idle
from: active
condition: []
action:
- if:
- condition: state
entity_id: input_boolean.office_occupied
state: "off"
then:
- service: input_boolean.turn_on
metadata: {}
data: {}
target:
entity_id:
- input_boolean.office_occupied
else:
- if:
- condition: state
entity_id: timer.office_lights
state: idle
then:
- service: input_boolean.turn_off
metadata: {}
data: {}
target:
entity_id: input_boolean.office_occupied
- if:
- condition: state
entity_id: input_boolean.office_occupied
state: "on"
then:
- if:
- condition: state
entity_id: timer.livingroom_occupied
state: active
then:
- service: timer.cancel
target:
entity_id: timer.livingroom_occupied
data: {}
- if:
- condition: time
after: "00:00:00"
before: "06:00:00"
weekday:
- sun
- sat
- fri
- thu
- wed
- tue
- mon
then:
- service: timer.start
metadata: {}
data:
duration: "0:1:0"
target:
entity_id: timer.office_lights
else:
- service: timer.start
metadata: {}
data: {}
target:
entity_id: timer.office_lights
else:
- service: timer.finish
metadata: {}
data: {}
target:
entity_id: timer.office_lights
mode: single
@bdraco I think the leading theory here is that the way timers are handled change and would cause loops.
I corrected my automation by creating another automation to act as a trigger in my wait statement. This seems to have fixed the issue.
Would you like me to close the issue or would you like to use it to improve documentation? I can try my hand at better documenting this and making a PR for it if you want, just point me in the general direction of where.
Hmmm interesting, can you guess when I updated to the latest HASSOS v12.2
There hasn't been any activity on this issue recently. Due to the high number of incoming GitHub notifications, we have to clean some of the old issues, as many of them have already been resolved with the latest updates. Please make sure to update to the latest Home Assistant version and check if that solves the issue. Let us know if that works for you by adding a comment 👍 This issue has now been marked as stale and will be closed if no further activity occurs. Thank you for your contributions.
The problem
Updating my docker container to 2024.4.1 causes CPU to climb to 100% over the span of a few hours. It never goes down. The only fix is to revert back to 2024.3.3.
Something is wrong with this release.
What version of Home Assistant Core has the issue?
2024.4.1
What was the last working version of Home Assistant Core?
2024.3.3
What type of installation are you running?
Home Assistant Container
Integration causing the issue
recorder
Link to integration documentation on our website
No response
Diagnostics information
No response
Example YAML snippet
No response
Anything in the logs that might be useful for us?
Additional information
No response