PimDoos / onesmartcontrolha

Home Assistant integration for One Smart Control server
Apache License 2.0

High CPU after socket reconnect #12

Closed PimDoos closed 1 year ago

PimDoos commented 1 year ago

After the warnings below, processor load increased. After the first increase I restarted, but as you can see, the high load came back after some time.

Logger: root
Source: custom_components/onesmartcontrol/onesmartwrapper.py:177
Integration: One Smart Control (documentation, issues)
First occurred: 13:01:01 (12 occurrences)
Last logged: 15:44:43

Could not update '['boiler_flow_temperature', 'water_tank_temperature', 'return_temperature', 'flow_temperature']' for 'HEATPUMP': {'error': 3, 'description': 'timeout'}
Could not update '['heat_source_status', 'seven_segment_display_error_code', 'refrigerant_error_info', 'defrost']' for 'HEATPUMP': {'error': 3, 'description': 'timeout'}
Could not update '['outlet_air_percentage', 'inlet_air_percentage']' for 'COMFOAIR': {'error': 3, 'description': 'timeout'}
Could not update '['error_a', 'co2_level_first_floor', 'co2_level_ground_floor', 'exit_air_temperature']' for 'COMFOAIR': {'error': 3, 'description': 'timeout'}
Command timed out after 60 seconds: apparatus
Logger: root
Source: custom_components/onesmartcontrol/onesmartwrapper.py:189
Integration: One Smart Control (documentation, issues)
First occurred: 15:44:43 (3 occurrences)
Last logged: 15:44:46

Command timed out after 60 seconds: ping
Ping to server timed out. Reconnecting.
Connection error on socket push: '[Errno 113] Host is unreachable' Reconnecting in 60 seconds.

image

Originally posted by @dutch-erik in https://github.com/PimDoos/onesmartcontrolha/issues/8#issuecomment-1230433634

PimDoos commented 1 year ago

I am able to reproduce this issue in my production instance. After the reconnect, all sensors stop updating and flatline, except for the push sensors (P1/phase power sensors).

image image

Strangely enough, my development instance does not have this issue. I suspect the polling socket does not reconnect correctly, crashing the wrapper loop.
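
To illustrate the suspected failure mode: a minimal sketch, assuming an asyncio-based polling loop. The names `run_polling_loop`, `poll_once` and `reconnect` are hypothetical and are not taken from `onesmartwrapper.py`. It shows a loop that catches connection errors from both the poll and the reconnect, so that a failed reconnect is logged instead of ending the loop.

```python
import asyncio
import logging

_LOGGER = logging.getLogger(__name__)


async def run_polling_loop(poll_once, reconnect, interval=1.0, max_iterations=None):
    """Call poll_once() repeatedly and reconnect on connection errors.

    poll_once and reconnect are hypothetical coroutine functions standing in
    for whatever the integration uses to poll and to re-open its socket.
    max_iterations exists only so the sketch can be exercised in a test.
    """
    iterations = 0
    while max_iterations is None or iterations < max_iterations:
        iterations += 1
        try:
            await poll_once()
        except (OSError, asyncio.TimeoutError) as err:
            _LOGGER.warning("Polling failed: %s. Reconnecting.", err)
            try:
                await reconnect()
            except (OSError, asyncio.TimeoutError) as reconnect_err:
                # A failed reconnect must not escape, or the loop would stop
                # and the polled sensors would flatline.
                _LOGGER.warning("Reconnect failed: %s", reconnect_err)
        # Always yield between iterations so a persistent failure cannot spin.
        await asyncio.sleep(interval)
```

Errors such as `[Errno 113] Host is unreachable` and `[Errno 111] Connection refused` surface in Python as `OSError` subclasses, which is why the sketch catches `OSError` rather than a single specific exception.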

PimDoos commented 1 year ago

Might be fixed in #15, which will be in the v0.1.3 release.

dutch-erik commented 1 year ago

Hi Pim, I'm afraid it's not solved. After 12:41, processor load and temperature went up.

Logger: root
Source: custom_components/onesmartcontrol/onesmartwrapper.py:159
Integration: One Smart Control ([documentation](https://github.com/PimDoos/onesmartcontrolha), [issues](https://github.com/PimDoos/onesmartcontrolha/issues))
First occurred: 11:55:22 (1778 occurrences)
Last logged: 16:38:32

    Connection error: [Errno 113] Host is unreachable
    Connection error: [Errno 111] Connection refused
Logger: root
Source: custom_components/onesmartcontrol/onesmartwrapper.py:243
Integration: One Smart Control ([documentation](https://github.com/PimDoos/onesmartcontrolha), [issues](https://github.com/PimDoos/onesmartcontrolha/issues))
First occurred: 11:55:23 (355 occurrences)
Last logged: 16:38:32
Error in push gateway wrapper: [Errno 107] Socket not connected
Logger: root
Source: custom_components/onesmartcontrol/onesmartwrapper.py:213
Integration: One Smart Control ([documentation](https://github.com/PimDoos/onesmartcontrolha), [issues](https://github.com/PimDoos/onesmartcontrolha/issues))
First occurred: 11:55:15 (361 occurrences)
Last logged: 16:38:32

    Ping to server timed out. Reconnecting.
    Reconnect failed after 4 attempts.
Logger: root
Source: custom_components/onesmartcontrol/onesmartwrapper.py:167
Integration: One Smart Control ([documentation](https://github.com/PimDoos/onesmartcontrolha), [issues](https://github.com/PimDoos/onesmartcontrolha/issues))
First occurred: 11:55:18 (5 occurrences)
Last logged: 16:38:14

    Connection error: [Errno 113] Host is unreachable
    Connection error: EOF occurred in violation of protocol (_ssl.c:997)
    Connection error: [Errno 111] Connection refused
    Connection timeout out after 10 seconds
Logger: root
Source: custom_components/onesmartcontrol/onesmartwrapper.py:163
Integration: One Smart Control ([documentation](https://github.com/PimDoos/onesmartcontrolha), [issues](https://github.com/PimDoos/onesmartcontrolha/issues))
First occurred: 11:55:15 (7 occurrences)
Last logged: 16:38:04
Command timed out after 60 seconds: ping
Logger: root
Source: custom_components/onesmartcontrol/onesmartwrapper.py:199
Integration: One Smart Control ([documentation](https://github.com/PimDoos/onesmartcontrolha), [issues](https://github.com/PimDoos/onesmartcontrolha/issues))
First occurred: 10:31:02 (12 occurrences)
Last logged: 12:41:54

    Could not update '['pv2_voltage', 'pv1_voltage', 'alarm_2', 'alarm_1']' for 'HUAWEI SUN2000': Timed out
    Could not update '['heat_source_status', 'seven_segment_display_error_code', 'refrigerant_error_info', 'defrost']' for 'HEATPUMP': {'error': 3, 'description': 'timeout'}
    Could not update '['outlet_air_percentage', 'inlet_air_percentage']' for 'COMFOAIR': {'error': 3, 'description': 'timeout'}
    Could not update '['error_a', 'co2_level_first_floor', 'co2_level_ground_floor', 'exit_air_temperature']' for 'COMFOAIR': {'error': 3, 'description': 'timeout'}
    Could not update '['operating_status', 'standalone_teleindication', 'model_id']' for 'HUAWEI SUN2000': Timed out
Logger: root
Source: custom_components/onesmartcontrol/onesmartwrapper.py:396
Integration: One Smart Control ([documentation](https://github.com/PimDoos/onesmartcontrolha), [issues](https://github.com/PimDoos/onesmartcontrolha/issues))
First occurred: 11:55:15 (3 occurrences)
Last logged: 12:41:54
Command timed out after 60 seconds: apparatus
Logger: root
Source: custom_components/onesmartcontrol/onesmartwrapper.py:187
Integration: One Smart Control ([documentation](https://github.com/PimDoos/onesmartcontrolha), [issues](https://github.com/PimDoos/onesmartcontrolha/issues))
First occurred: 11:56:15 (1 occurrences)
Last logged: 11:56:15
Ping to server timed out. Reconnecting.

PimDoos commented 1 year ago

Looks like the error catching works now :) Looking at your first log, this seems to be the loop causing the high CPU usage:

Logger: root
Source: custom_components/onesmartcontrol/onesmartwrapper.py:159
Integration: One Smart Control ([documentation](https://github.com/PimDoos/onesmartcontrolha), [issues](https://github.com/PimDoos/onesmartcontrolha/issues))
First occurred: 11:55:22 (1778 occurrences)
Last logged: 16:38:32

    Connection error: [Errno 113] Host is unreachable
    Connection error: [Errno 111] Connection refused

I got a similar result on my production instance today, along with an SSL error. The 'host is unreachable' and 'connection refused' errors in particular seem peculiar to me. It looks like reconnection fails rapidly, causing an infinite loop somewhere. Maybe the integration is causing the One Server to do some sort of throttling. I'll keep digging.

This error originated from a custom integration.

Logger: root
Source: custom_components/onesmartcontrol/onesmartwrapper.py:159
Integration: One Smart Control (documentation, issues)
First occurred: 18:27:27 (3253 occurrences)
Last logged: 18:27:41

Connection error: [Errno 111] Connection refused
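
The log above shows 3253 occurrences within about 14 seconds, which is consistent with a retry loop that has no delay between attempts. A common way to avoid that is exponential backoff. The following is a minimal sketch under that assumption, not the fix that went into the integration; `backoff_delays` and `connect_with_backoff` are hypothetical names, and the 60-second cap is chosen only to mirror the "Reconnecting in 60 seconds" message from the first report.

```python
import asyncio
import logging

_LOGGER = logging.getLogger(__name__)


def backoff_delays(base=1.0, factor=2.0, maximum=60.0):
    """Yield an endless sequence of delays: base, base*factor, ... capped at maximum."""
    delay = base
    while True:
        yield min(delay, maximum)
        delay = min(delay * factor, maximum)


async def connect_with_backoff(connect, max_attempts=None, sleep=asyncio.sleep, **backoff):
    """Await connect() until it succeeds, sleeping longer after each failure.

    connect is a hypothetical coroutine function that opens the socket.
    sleep is injectable so the sketch can be tested without real waiting.
    """
    attempt = 0
    for delay in backoff_delays(**backoff):
        attempt += 1
        try:
            return await connect()
        except OSError as err:
            _LOGGER.warning("Connection error: %s. Retrying in %.0f seconds.", err, delay)
        if max_attempts is not None and attempt >= max_attempts:
            raise ConnectionError(f"Reconnect failed after {attempt} attempts.")
        # Waiting here is what keeps a refused connection from pinning the CPU.
        await sleep(delay)
```

With the defaults, the waits grow as 1, 2, 4, 8, 16, 32 and then stay at 60 seconds, so an unreachable server results in roughly one attempt per minute rather than hundreds per second.
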
PimDoos commented 1 year ago

Still working on this. I'm pushing a couple of patches in #17 that might solve it. So far my main suspect is a call causing an infinite loop in onesmartsocket.py. I hope to release the patches this week.

PimDoos commented 1 year ago

My dev and production instances have been stable for 24 hours now, so you will find these new changes in v0.2.0. The integration seems a lot more stable now (there are still some timeout errors, but those are mostly due to the One Smart server timing out the request). Feel free to (re)open an issue if new problems arise.