sermatec-opensource / homeassistant-sermatec-inverter

Home Assistant custom component for the Sermatec solar inverter.
MIT License

Add fault tolerance #14

Closed andreondra closed 5 months ago

andreondra commented 1 year ago

Do not report Unavailable if other values could be retrieved. Currently the integration gives up on any failure and waits for the next refresh. This should not happen.

This typically happens when there is more than one connection to the UART-TCP module (e.g. simultaneous connection from the official app).

mathieupotier commented 1 year ago

Hi, I don't know if it's related to this issue, but I got:

2023-01-06 22:01:01.593 ERROR (MainThread) [custom_components.sermatec_inverter.sensor] Error fetching Sermatec data: Can't retrieve working parameters.

It seems to occur after 5 minutes of activity. I get a similar issue in the official app: it stops polling the data (or gets an error)...

andreondra commented 1 year ago

It is related. The script tries three times and, if unsuccessful, reports all sensors as unavailable and retries during the next refresh (after 30 seconds).

There are two alternatives to consider:

a) We could show the cached last values and report Unavailable only after e.g. 5 minutes, not after 30 seconds as it is now. But this would cause inaccurate readings.

b) Better alternative: report Unavailable immediately, but only for the part that failed (e.g. battery values), not for all values. This is how the official app works: it tries to read all values and shows the ones it managed to read, whereas this integration gives up on any failure. But I don't know how to report Unavailable only for specific sensors.

However, the readings are always successful if the integration is the only client connected to the inverter. So it can be solved by connecting other devices only when needed, or by using your proxy @mathieupotier, as you mentioned in another issue (this would be the most elegant solution).

To sum up, I suggest implementing b), and if you publish your proxy we can mention it here as an option for connecting multiple devices reliably.
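As a rough idea of how b) could work: Home Assistant entities expose an `available` property, so each sensor could decide its own availability from whatever the last poll managed to read. The sketch below is plain Python, not the integration's actual code; `poll_groups`, `GroupSensor` and the group/key names are hypothetical and only illustrate the pattern.

```python
from typing import Callable, Dict, Optional

# Hypothetical reader type: each callable queries one group of values
# (e.g. battery, grid) from the inverter and raises on failure.
Reader = Callable[[], Dict[str, float]]
PollResult = Dict[str, Optional[Dict[str, float]]]


def poll_groups(readers: Dict[str, Reader]) -> PollResult:
    """Query every group independently; a failed group maps to None."""
    results: PollResult = {}
    for group, read in readers.items():
        try:
            results[group] = read()
        except Exception:  # a real client would catch narrower errors
            results[group] = None
    return results


class GroupSensor:
    """Stand-in for a sensor entity bound to one value of one group."""

    def __init__(self, group: str, key: str) -> None:
        self.group = group
        self.key = key
        self.data: PollResult = {}  # result of the most recent poll

    @property
    def available(self) -> bool:
        # Unavailable only if this sensor's own group failed or lacks the key.
        values = self.data.get(self.group)
        return values is not None and self.key in values

    @property
    def native_value(self) -> Optional[float]:
        return self.data[self.group][self.key] if self.available else None
```

With this shape, a failed battery query would mark only the battery sensors as unavailable, while sensors from groups that were read successfully keep reporting values.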

mathieupotier commented 1 year ago

My proxy will surely help the dev process, but I'm not sure it's a suitable solution: it needs to run permanently somewhere in your network or in the Home Assistant instance... It will certainly be designed to handle that kind of issue, but it makes things a little more complex... IMO we should discuss using my proxy later ... ^_^

I think your client also lacks error reporting: there is no difference between "cannot connect", "connection lost", or "checksum error". Related issue: https://github.com/andreondra/sermatec-inverter/issues/12 I suggest strengthening the client side so that it reports errors that can be handled in the sensor retrieval process. That could make this issue easier to fix. When the problem occurs, I can often fix it on my HA instance by reloading the integration itself in the UI... it's berserk, but it works ... ^_^

For me, the solution will come from clear errors raised by the client and smart error handling in the custom_component.
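To make the idea concrete, here is a minimal sketch of distinct error types and a handler that reacts differently to each. All class and function names are hypothetical and not taken from the existing client; the retry/reconnect policy is only an assumption about what "smart" handling might mean.

```python
class InverterError(Exception):
    """Hypothetical base class for all client errors."""


class ConnectFailed(InverterError):
    """The TCP connection to the inverter could not be opened."""


class ConnectionLost(InverterError):
    """The connection dropped while waiting for a response."""


class ChecksumError(InverterError):
    """A response arrived but failed checksum validation."""


def next_action(err: InverterError) -> str:
    """Map an error to what the component could do next (assumed policy)."""
    if isinstance(err, ChecksumError):
        return "retry"      # link is up, the frame was just corrupted
    if isinstance(err, (ConnectFailed, ConnectionLost)):
        return "reconnect"  # open a fresh connection before querying again
    return "give_up"        # unknown failure: wait for the next refresh
```

The component could then catch `InverterError` around each query and choose between retrying, reconnecting, or marking only the affected sensors as unavailable.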

I started working on the first part (a couple of days ago) but had issues connecting to my inverter ... and I'm not very efficient with Python yet ... ^_^

andreondra commented 1 year ago

I agree with you, let's first fix the error reporting in the client. :) Thanks for your suggestions.

And one more question: if I understand correctly, after the error you mentioned above is shown, the integration stops showing current data until it is reloaded? This issue deals only with the situation where the component fails to get data once or twice but then recovers by itself and works correctly. The problem with the integration getting stuck and not updating data until reloaded is documented in issue #13, because my instance got stuck even when there was no error.

mathieupotier commented 1 year ago

My integration regularly shows the sensors as unavailable, and they stay that way until I reload the integration. It seems that it no longer refreshes once it gets into that state. To me it looks like the integration has lost the connection to the inverter and is not able to reconnect by itself... maybe not related, but ... it seems quite similar in fact ... ^_^

andreondra commented 1 year ago

@mathieupotier Well, you may be right! I'll have a look at it today.

andreondra commented 5 months ago

Fault tolerance was added in the latest version.