torbennehmer / hacs-e3dc

Homeassistant E3DC Integration
GNU Affero General Public License v3.0
53 stars 8 forks source link

Communication error with E3DC: Max retries reached #124

Closed Phoenix-DH closed 3 months ago

Phoenix-DH commented 3 months ago

System Health details

System Information

version core-2024.3.1
installation_type Home Assistant OS
dev false
hassio true
docker true
user root
virtualenv false
python_version 3.12.2
os_name Linux
os_version 6.1.73-haos-raspi
arch aarch64
timezone Europe/Berlin
config_dir /config
Home Assistant Community Store GitHub API | ok -- | -- GitHub Content | ok GitHub Web | ok GitHub API Calls Remaining | 5000 Installed Version | 1.34.0 Stage | running Available Repositories | 1409 Downloaded Repositories | 35
Home Assistant Cloud logged_in | false -- | -- can_reach_cert_server | ok can_reach_cloud_auth | ok can_reach_cloud | ok
Home Assistant Supervisor host_os | Home Assistant OS 12.1 -- | -- update_channel | stable supervisor_version | supervisor-2024.03.0 agent_version | 1.6.0 docker_version | 24.0.7 disk_total | 439.4 GB disk_used | 45.8 GB healthy | true supported | true board | rpi4-64 supervisor_api | ok version_api | ok installed_addons | deCONZ (6.23.0), UniFi Network Application (3.0.3), Terminal & SSH (9.10.0), Studio Code Server (5.15.0), Samba share (12.3.1), InfluxDB (5.0.0), Grafana (9.2.0), Glances (0.21.0), OneDrive Backup (2.3.1), AdGuard Home (5.0.4), Let's Encrypt (5.0.15)
Dashboards dashboards | 4 -- | -- resources | 23 views | 14 mode | storage
Recorder oldest_recorder_run | March 13, 2024 at 5:59 AM -- | -- current_recorder_run | March 18, 2024 at 1:13 PM estimated_db_size | 1099.65 MiB database_engine | sqlite database_version | 3.44.2
Sonoff version | 3.6.0 (8dd8af9) -- | -- cloud_online | 2 / 2 local_online | 2 / 2

Checklist

Describe the issue

I have againt the following message on my system:

This error originated from a custom integration.

Logger: custom_components.e3dc_rscp.e3dc_proxy Source: custom_components/e3dc_rscp/e3dc_proxy.py:29 integration: E3DC Remote Storage Control Protocol (Git) (documentation, issues) First occurred: 12:00:49 AM (20 occurrences) Last logged: 12:04:00 AM

Communication error with E3DC: Max retries reached Traceback (most recent call last): File "/usr/local/lib/python3.12/site-packages/e3dc/_e3dc.py", line 226, in sendRequest result = self.rscp.sendRequest(request) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.12/site-packages/e3dc/_e3dc_rscp_local.py", line 117, in sendRequest raise CommunicationError(receive[2]) e3dc._e3dc_rscp_local.CommunicationError: UNKNOWN

During handling of the above exception, another exception occurred:

Traceback (most recent call last): File "/config/custom_components/e3dc_rscp/e3dc_proxy.py", line 29, in wrapper_handle_e3dc_ex return func(*args, **kwargs) ^^^^^^^^^^^^^^^^^^^^^ File "/config/custom_components/e3dc_rscp/e3dc_proxy.py", line 116, in get_db_data return self.e3dc.get_db_data_timestamp(timestamp, timespan_seconds, True) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.12/site-packages/e3dc/_e3dc.py", line 710, in get_db_data_timestamp response = self.sendRequest( ^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.12/site-packages/e3dc/_e3dc.py", line 237, in sendRequest raise SendError("Max retries reached") e3dc._e3dc.SendError: Max retries reached

and

This error originated from a custom integration.

Logger: custom_components.e3dc_rscp.coordinator Source: helpers/update_coordinator.py:318 integration: E3DC Remote Storage Control Protocol (Git) (documentation, issues) First occurred: 12:00:49 AM (20 occurrences) Last logged: 12:04:00 AM

Unexpected error fetching e3dc_rscp data: Communication Failure: Failed to send data Traceback (most recent call last): File "/usr/local/lib/python3.12/site-packages/e3dc/_e3dc.py", line 226, in sendRequest result = self.rscp.sendRequest(request) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.12/site-packages/e3dc/_e3dc_rscp_local.py", line 117, in sendRequest raise CommunicationError(receive[2]) e3dc._e3dc_rscp_local.CommunicationError: UNKNOWN

During handling of the above exception, another exception occurred:

Traceback (most recent call last): File "/config/custom_components/e3dc_rscp/e3dc_proxy.py", line 29, in wrapper_handle_e3dc_ex return func(*args, **kwargs) ^^^^^^^^^^^^^^^^^^^^^ File "/config/custom_components/e3dc_rscp/e3dc_proxy.py", line 116, in get_db_data return self.e3dc.get_db_data_timestamp(timestamp, timespan_seconds, True) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.12/site-packages/e3dc/_e3dc.py", line 710, in get_db_data_timestamp response = self.sendRequest( ^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.12/site-packages/e3dc/_e3dc.py", line 237, in sendRequest raise SendError("Max retries reached") e3dc._e3dc.SendError: Max retries reached

The above exception was the direct cause of the following exception:

Traceback (most recent call last): File "/usr/src/homeassistant/homeassistant/helpers/update_coordinator.py", line 318, in _async_refresh self.data = await self._async_update_data() ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/config/custom_components/e3dc_rscp/coordinator.py", line 169, in _async_update_data db_data_today: dict[str, Any] = await self.hass.async_add_executor_job( ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.12/concurrent/futures/thread.py", line 58, in run result = self.fn(*self.args, **self.kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/config/custom_components/e3dc_rscp/e3dc_proxy.py", line 37, in wrapper_handle_e3dc_ex raise HomeAssistantError( homeassistant.exceptions.HomeAssistantError: Communication Failure: Failed to send data

I am not pretty sure if we are talking about the same than earlier in https://github.com/torbennehmer/hacs-e3dc/issues/75

Wan't to double check if this is a new one. To be honest, debugging is currently still off.

Reproduction steps

Having a look to the logs

Debug logs

Currently still off

Diagnostics dump

No response

torbennehmer commented 3 months ago

I'll have to forward this to https://github.com/fsantini/python-e3dc, unfortunately, I can't say how long a resolution will take as this seems rather strange. I'll cc you there.

Waiting for fsantini/python-e3dc#115

Phoenix-DH commented 3 months ago

@torbennehmer Great!

torbennehmer commented 3 months ago

@Phoenix-DH How often do you see this error? What side-effects does it have? Can you download the diagnostics dump from e3dc? Can you please enable debug logging and send me the next error that occures along with the information immediately before it? hacs-e3dc did pull the daily stats at that point and I don't understand what's going wrong.

Phoenix-DH commented 3 months ago

Bad thing is that my HA has just been restarted. Log gone. But if I got it right in my mind, both only one or twice a day. As it seems all 12 hours short past 12, always the same.

Will try to gather the debug logs. Enabling only for this integration right?

And the data directly from my e3dc? Is that save? Had to take a look how.

thijsdewitt commented 3 months ago

I'm seeing the same error on my system, about once every minute. I'm not sure what the side effects are. Here is my log:

home-assistant_e3dc_rscp_2024-03-17T20-50-27.085Z.log

Pe2023Ro commented 3 months ago

Good Morning, Same at my System. But only at S10x (Master of farm). The additional Inverter S10SE works well. I‘m not in the latest state of HA Core (2023.11.1) and also E3DC Integration is not the latest (3.4.3) Could that be a reason? No Side effects so far.

EDIT:

Now it works again

Phoenix-DH commented 3 months ago

I still have the issue. This tine 00:04 with no other issues or side effects. During the week I will enable debugging if still needed with that already uploaded obe.

tschrenk commented 3 months ago

I am having the same issue. The E3dc device was working since nearly a year. The effect is that the parameters maximum charge+discharche, Autarky.today, Battery charge - today, Battery discharge -today, Manual charge and Solar -production today are not available in HA. At HA going to settings -> devices and services shows the service red saying "Einrichtung fehlgeschlagen, wird erneut versucht" and when you hover over it configuration not yet ready. Maxtries reached. Reloading does not help, also restored earlier backups, and deleted and reconfigured E3DC device, also rebooted E3DC S10X and shut down and physically switched offf batteries and everything.

In the logs I see Unexpected error fetching e3dc_rscp data: Max retries reached Traceback (most recent call last): File "/usr/local/lib/python3.12/site-packages/e3dc/_e3dc_rscp_local.py", line 102, in sendRequest receive = self._receive() ^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.12/site-packages/e3dc/_e3dc_rscp_local.py", line 76, in _receive decData = rscpDecode(self.encdec.decrypt(data))[0] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.12/site-packages/e3dc/_rscpLib.py", line 259, in rscpDecode return rscpDecode(rscpFrameDecode(data)[0]) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.12/site-packages/e3dc/_rscpLib.py", line 275, in rscpDecode innerData, usedLength = rscpDecode(data[curByte:]) ^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.12/site-packages/e3dc/_rscpLib.py", line 266, in rscpDecode strTag = getStrRscpTag(hexTag) ^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.12/site-packages/e3dc/_rscpTags.py", line 3692, in getStrRscpTag tag = RscpTag(tag) ^^^^^^^^^^^^ File "/usr/local/lib/python3.12/enum.py", line 744, in call return cls.new(cls, value) ^^^^^^^^^^^^^^^^^^^^^^^

File "/usr/local/lib/python3.12/enum.py", line 1158, in new raise ve_exc ValueError: 16777278 is not a valid RscpTag

Side effect is that the battery control automation is not working anymore as the maximum charge+discharche entities are not available anymore.

Die Automatisierung „SOC 25-80 L12000/E12000“ („automation.soc_0_80_clear_power_limit“) hat eine Aktion, die einen unbekannten Dienst aufruft: „e3dc_rscp.set_power_limits“.

Dieser Fehler verhindert, dass die Automatisierung ordnungsgemäß ausgeführt wird. Vielleicht ist dieser Dienst nicht mehr verfügbar oder vielleicht hat ein Tippfehler ihn verursacht.

Another side effect is that the the SOC history statistic chart is not visible anymore but says no statistics found

It uses Statistic: State of charge and sensor.e3dc_solar_power

Also tried to reconnect the device and specified IP, local.user, password and RSVP. Can configure it again but it also gets red and in the display the values that were unavailable before show that the entity is not found.

Happy to provide additional data to solve the issue.

home-assistant_e3dc_rscp_2024-03-29T01-02-40.537Z.log

tschrenk commented 3 months ago

After additional searching I found that this issue is released with hacs-e3dc V3.6.1

This solves the issue:

  1. in HA go to HACS tab
  2. update to HACS 3.6.1
  3. restart HA

All is good again :-)

torbennehmer commented 3 months ago

Hi @tschrenk,

File "/usr/local/lib/python3.12/enum.py", line 1158, in new raise ve_exc ValueError: 16777278 is not a valid RscpTag

This is the behaviour, we've seen in #114, please follow it there, 3.6.1 only has a workaround for it. I'll hide the comments from this issue, as this is another root cause.

Phoenix-DH commented 3 months ago

Just installed, thanks. Will provide feedback.

torbennehmer commented 3 months ago

@thijsdewitt

I'm seeing the same error on my system, about once every minute. I'm not sure what the side effects are. Here is my log:

home-assistant_e3dc_rscp_2024-03-17T20-50-27.085Z.log

As far as I understand the log, the E3DC was temporarily unreachable. I too see this from time to time, probably when E3DC is updating (or whatever). Also, there seem to be problems directly at midnight with DB Data Polling (see also #113). 3.6.3 beta-1 introduces a better error handling for testing (if you're willing to). It should no longer propagate communication error exceptions out of the update loop, instead you'll see simpler warnings stating that we can't talk to E3DC at the moment.

Be aware, that this is related to E3DC being unreachable, not the integration misbehaving. At least as far as I understand it at the moment.

The Tag Errors we've seen (unknown tag in manual charging data polling) something else, which I track on #114.

Please check if 3.6.3 solves your problem given these provisions - either use the beta or wait for stable.

@Pe2023Ro

  • E3DC integration updated to 3.6.2

You've probably been affected by #114 as well, 3.6.1 handles this gracefully until we get the communication working again.

@Phoenix-DH

This tine 00:04 with no other issues or side effects.

I believe you're seeing the same nightly outage in DB Data telemetry, so 3.6.3 should take care of the hard errors as well.

For all of you, this currently means, that on midnight the statistics are a few minutes off, you will still see warnings, but that is to be expected on an E3DC outage. If you still experience funny things beyond that on midnight, please follow up on #130. Right now, We can't do much about it, as E3DC isn't answering our queries at that time for whatever reason. Nothing I can fix, just work around.

Thus, I'll close this issue for now. Please reopen if the graceful error handling doesn't work for you.

ubarni commented 2 months ago

I just got the same error and found this thread. In my case (after checking the last answer) i found out that my Unifi Flex Switch went offline and thats why the network connection was broken to my E3DC. So if you get this error? Check if the world symbol on the E3DC display is GREEN and not RED.

Phoenix-DH commented 2 months ago

That would mean your switch will loose connection each night? Can‘t believe that. But I have also a Unify switch.

Will see if there is something visible on the controller.

edit: Just had a look tonthe audit log where I see the reconnects. Nothing …

ubarni commented 2 months ago

That would mean your switch will loose connection each night? Can‘t believe that. But I have also a Unify switch.

Will see if there is something visible on the controller.

edit: Just had a look tonthe audit log where I see the reconnects. Nothing …

No @Phoenix-DH, in my case it was a random outage but with the same log entries. And @torbennehmer is right saying: "the E3DC was temporarily unreachable"

When is saw Communication error with E3DC: Max retries reached i first thought i had send to many requests to my E3DC? But glad to see that it is a "simple" connection problem.