krahabb / meross_lan

Home Assistant integration for Meross devices
MIT License
415 stars 45 forks source link

MTS200 regularly becomes unavailable in HA #343

Closed andreweacott closed 3 months ago

andreweacott commented 9 months ago

I've recently got a Meross MTS200 thermostat which was set up to work with HA via this integration (v4.4.1). It mostly works but regularly becomes 'unavailable' for hours at a time.

I've tested and have wifi connectivity to the thermostat (can ping it and control it via the Meross app) while HA is reporting unavailable, and restarting HA without touching the thermostat itself also resolves the issue.

Logs show,

2023-12-01 22:09:28.811 DEBUG (MainThread) [custom_components.meross_lan] MerossDevice(conservatory_floor_thermostat): polling start
2023-12-01 22:09:31.914 DEBUG (MainThread) [custom_components.meross_lan] MerossDevice(conservatory_floor_thermostat): ClientConnectorError(Cannot connect to host 192.168.1.181:80 ssl:default [Connect call failed ('192.168.1.181', 80)]) in async_http_request GET Appliance.System.All attempt(0)
2023-12-01 22:09:31.914 DEBUG (MainThread) [custom_components.meross_lan] MerossDevice(conservatory_floor_thermostat): polling end

But again, the IP is correct and the device is responding normally.

krahabb commented 9 months ago

Hello @andreweacott, these HTTP connection errors raised by the Python module are almost always 'cryptic' about their origin. In this particular case it could be that your host (the HA machine) cannot reach the provided address, either because the device is not there or because of some other internal error in the system. Among these errors there is also the possibility that the environment is running 'out of resources', e.g. it has too many open connections, or something else is preventing the OS from reaching the remote address.
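As a rough way to check the 'out of resources' hypothesis on the HA host, you can compare the number of file descriptors the process has open against its soft limit. This is just an illustrative helper (not part of meross_lan), and it assumes a Linux host with `/proc` available, which is the case on HassOS:

```python
import os
import resource

def fd_pressure():
    """Compare this process's open file descriptors against its soft limit.

    Linux-only (relies on /proc/self/fd); on HassOS you would run
    something similar inside the HA container.
    """
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    open_fds = len(os.listdir("/proc/self/fd"))
    return open_fds, soft

fds, limit = fd_pressure()
print(f"{fds} open file descriptors of {limit} allowed")
```

If the first number is close to the second, connection attempts can start failing with errors much like the `ClientConnectorError` in the logs above, even though the device itself is reachable.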

Something like this happened to me too, and it only resolved when I rebooted HA. In my case I was also seeing a lot of connection attempts/failures from other HA components.

andreweacott commented 9 months ago

The IP address was ping-able from the HA host (HassOS) and I didn't see any other errors being reported from other integrations. I was trying to make an HTTP API request to the device to check it was actually responding but I don't know what URLs it responds to.

Since restarting HA a few days ago it's had a stable connection, so I've set up an alert to tell me if it happens again. Assuming it does, can you suggest any diagnostics to perform before I restart HA?

Also, the docs say that the integration should fall back to MQTT if the HTTP requests don't succeed, but I don't see any mention of that happening in the logs. The integration is configured in 'auto' mode?

krahabb commented 9 months ago

The URL to send queries to is http://{hostaddress}/config, but you have to send an application/json content type in order to get a response; I'm not sure whether it would reply with an HTTP error otherwise... (here a quick tutorial on the HTTP api)
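To sketch what such a query looks like: the commonly documented Meross LAN protocol wraps every request in a JSON envelope whose header carries an MD5 signature computed over messageId + device key + timestamp. The field names and signing scheme below are my understanding of that protocol, not something lifted from meross_lan itself, so treat them as assumptions if your firmware behaves differently:

```python
import hashlib
import json
import time
import uuid

def build_meross_request(namespace="Appliance.System.All", method="GET", key=""):
    """Build a signed message envelope for the Meross local HTTP API.

    ASSUMPTION: field names and the md5(messageId + key + timestamp)
    signing scheme follow the commonly documented Meross LAN protocol.
    `key` is the device key ("" for devices paired with an empty key).
    """
    message_id = uuid.uuid4().hex
    timestamp = int(time.time())
    sign = hashlib.md5(f"{message_id}{key}{timestamp}".encode()).hexdigest()
    return {
        "header": {
            "messageId": message_id,
            "namespace": namespace,
            "method": method,
            "payloadVersion": 1,
            "from": "http://homeassistant/config",
            "timestamp": timestamp,
            "sign": sign,
        },
        "payload": {},
    }

req = build_meross_request()
print(json.dumps(req, indent=2))
# You would then POST this body to the device, e.g.:
#   curl -X POST http://192.168.1.181/config \
#        -H "Content-Type: application/json" -d '<the JSON above>'
```

If the device answers this with a valid JSON body while meross_lan still logs `ClientConnectorError`, that would point at the HA host/environment rather than the thermostat.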

For the MQTT failover some configuration is needed, and there are 2 scenarios:

- the device is paired to a local MQTT broker which HA/meross_lan is connected to, or
- the device is still bound to the official Meross cloud broker.

For the latter to work (in meross_lan) you have to set up your Meross cloud profile in the integration UI by entering your Meross credentials and configuring it to allow MQTT publishing. Keep in mind that, due to traffic restrictions, meross_lan paces messages over the cloud MQTT rather slowly (at most 1 message every 12 seconds) in order not to incur a possible ban from Meross for high traffic.

You can monitor the actual state of the protocol machine (i.e. which one is active/available) per device by enabling the 'sensor_protocol' entity under the diagnostic group of the relevant device UI. Also, check that the device configuration itself has the protocol option set to 'Auto' (or nothing selected, which is the default).