home-assistant / core

:house_with_garden: Open source home automation that puts local control and privacy first.
https://www.home-assistant.io
Apache License 2.0

Enphase Envoy Stops Communicating With Home Assistant #121473

Closed rljk99 closed 2 months ago

rljk99 commented 3 months ago

The problem

My Envoy has been online with Home Assistant for 7 days and has stopped communicating with HA twice. When this happens, all of the entities associated with it become "unavailable" and recording of energy production and consumption stops.

I have logged on to Enphase Enlighten and found that my system remained online with the "cloud" and continued to report data to "Enlighten"

I have TP-Link and EcoNet integrations also running on my system, they appear to be unaffected and continue running without issues.

My envoy (gateway) is running software version D8.2.127.231214

Restarting Home Assistant is the only way to get communications with the Envoy started again.

I have configured a custom polling interval of 10 seconds using the instructions located at https://www.home-assistant.io/common-tasks/general/#defining-a-custom-polling-interval

What version of Home Assistant Core has the issue?

2024.6.1

What was the last working version of Home Assistant Core?

No response

What type of installation are you running?

Home Assistant OS

Integration causing the issue

Enphase

Link to integration documentation on our website

https://www.home-assistant.io/integrations/enphase_envoy

Diagnostics information

home-assistant.log home-assistant.1.log

Example YAML snippet

No response

Anything in the logs that might be useful for us?

In the log home-assistant.1.log there is an entry at 2024-07-06 23:02:55.931 that says "Unexpected error fetching Envoy 202349008148 data". This is when the most recent communication failure started. The number "202349008148" is the serial number of my Envoy.

Additional information

No response

home-assistant[bot] commented 3 months ago

Hey there @bdraco, @cgarwood, @dgomes, @joostlek, @catsmanac, mind taking a look at this issue as it has been labeled with an integration (enphase_envoy) you are listed as a code owner for? Thanks!

Code owner commands

Code owners of `enphase_envoy` can trigger bot actions by commenting:

- `@home-assistant close` Closes the issue.
- `@home-assistant rename Awesome new title` Renames the issue.
- `@home-assistant reopen` Reopen the issue.
- `@home-assistant unassign enphase_envoy` Removes the current integration label and assignees on the issue, add the integration domain after the command.
- `@home-assistant add-label needs-more-information` Add a label (needs-more-information, problem in dependency, problem in custom component) to the issue.
- `@home-assistant remove-label needs-more-information` Remove a label (needs-more-information, problem in dependency, problem in custom component) on the issue.

(message by CodeOwnersMention)


enphase_envoy documentation enphase_envoy source (message by IssueLinks)

catsmanac commented 3 months ago

Hi @rljk99, the log reveals a connection timeout when HA tries to collect the data.

The time when this one happens is noteworthy: shortly after 11 PM. This 'shortly after 11 PM' outage is a known issue that has been reported on multiple occasions. It is believed to be an internal reset/reload of one, some, or all of the Envoy's internal applications.

When the production data is requested at that moment, the request will time out or the Envoy will not accept connections for a brief period. Communication resumes after the outage.

The 10-second scan cycle increases the chance of hitting this 'black hole in time', which may be the reason for two outages in 7 days. I wouldn't be surprised if the other outage was around the same time. There are mixed reports on whether the Envoy can keep up with a 10-second scan; it might get bogged down every now and then at that rate. That might result in similar 'unavailable' issues.

Can't make it any better. As for options, a slower scanning rate reduces the chance of hitting the issue but does not eliminate it. Maybe add some smarts to the scan automation, like skipping the first 2 minutes after 11 PM. (Just thinking out loud here.)

rljk99 commented 2 months ago

Thank you for the information.

If I did it right, my scanning automation will not trigger between 22:59 and 23:05 each day. I have another automation involving enphase entities that runs at 23:00 and 23:05 each day, this one will no longer trigger between 22:59 and 23:05 each day as well.
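For readers who want to try the same workaround, the quiet window described above can be sketched as a Home Assistant automation. This is a minimal sketch, not the exact automation used in this thread: the alias and the entity ID are hypothetical placeholders, and it assumes the integration's built-in polling has been turned off as described in the custom polling interval instructions linked in the issue.

```yaml
# Sketch only: the alias and entity ID below are hypothetical placeholders.
automation:
  - alias: "Poll Envoy outside the nightly quiet window"
    trigger:
      # Fire every 10 seconds, matching the polling interval used in this issue.
      - platform: time_pattern
        seconds: "/10"
    condition:
      # "after" is later than "before", so the allowed period wraps past midnight:
      # the condition passes from 23:05 until 22:59 the next day
      # and blocks the update between 22:59 and 23:05.
      - condition: time
        after: "23:05:00"
        before: "22:59:00"
    action:
      # Refresh one entity of the integration to trigger a data fetch.
      - service: homeassistant.update_entity
        target:
          entity_id: sensor.envoy_current_power_production  # hypothetical entity ID
```

The same `condition` block could be added to any other automation that touches Enphase entities around 23:00.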

In addition (though probably not necessary) I set all utility meter entities that use enphase entities as inputs to “sensor always available”.
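For meters defined in YAML rather than through the UI, a setting like this might look roughly as follows. This sketch assumes the `always_available` option of the `utility_meter` integration is the YAML counterpart of the "sensor always available" toggle; the meter name and source entity ID are hypothetical.

```yaml
# Sketch only: the meter name and source entity ID are hypothetical placeholders.
utility_meter:
  daily_solar_production:
    source: sensor.envoy_lifetime_energy_production  # hypothetical entity ID
    cycle: daily
    # Assumed equivalent of the UI toggle: keep the meter available with its
    # last totalized value while the source entity is unavailable or unknown.
    always_available: true
```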

Hopefully these steps will allow HA to go “radio silent” while the Envoy does its thing.

As Enphase systems go, mine is pretty small: only 12 modules/microinverters and 1 Envoy, no system controller, no batteries. I’m hoping that because the system is so small, the 10-second scan is not overwhelming for the Envoy.

Now I sit back and wait to see if the problem goes away.

Thanks again


catsmanac commented 2 months ago

Sounds good. If it works fine, you can close the case, or let me know and I'll close it.

As for the scan interval, the Envoy only collects the inverter data every 5 min, spread over that 5 minutes. So you will see the data change in that time span, but when you inspect data from an individual inverter you will see values 5 minutes apart.

This is not the case for the current transformers, if you have these.

rljk99 commented 2 months ago

So far things are working well, the automations that generate enphase traffic don’t fire between 22:59 and 23:05 and there haven’t been any issues. I’m going to wait a few more days just to make sure before I close the case.

I had noticed the inverter behavior earlier (I track the individual inverter output as well as the “last reported” time). The current transformers appear to update at about 10-second intervals, and this is the data that is most important to me.

Thanks for the information on when the Envoy collects inverter data.

rljk99 commented 2 months ago

Things continue to work well as long as enphase traffic is kept to a minimum between 22:59 and 23:05. I think it's safe to close the case.