briancmpbll / home_assistant_custom_envoy


All sensors unavailable for one minute, same time each day. #126

Open Steve2017 opened 1 year ago

Steve2017 commented 1 year ago

I mentioned this in another issue and it has been suggested it should be separated.

The issue is that all sensors created by the integration become unavailable for a short period - usually a minute - at the same time each day. In my case, it is 23:00 local time (13:00 GMT). On some days I also experience a dropout at approximately 06:35, also for a very short period.

I am using v0.2.10 of Enphase Envoy Envoy_2_Core_CustomTest, which I gather is being folded into this repo. (It happened with earlier versions too.) This is on an Envoy S Metered with software version D7.6.175 (f79c8d), running HA 2023.8.xx (it was happening in July too).

Although the times might vary a little, I am not the only one reporting this behaviour: https://github.com/briancmpbll/home_assistant_custom_envoy/issues/122#issuecomment-1668339027 https://github.com/briancmpbll/home_assistant_custom_envoy/issues/122#issuecomment-1667567962 https://github.com/briancmpbll/home_assistant_custom_envoy/issues/122#issuecomment-1667803418

My dropouts are similar to the one in the first link above (magnified view: production/consumption dropout). All sensors, including the micro-inverters, demonstrate the same behaviour.

This is the log from an occurrence a few days ago, showing the system attempts a GET at 23:00:28 and times out about 30 seconds later. A new attempt then works:

> 2023-08-01 22:59:28.218 DEBUG (MainThread) [custom_components.enphase_envoy] Finished fetching envoy Envoy 1221xxxxx data in 1.092 seconds (success: True)
> 2023-08-01 23:00:28.127 DEBUG (MainThread) [custom_components.enphase_envoy.envoy_reader] Checking Token value: ******REDACTED TOKEN DATA*****
> 2023-08-01 23:00:28.128 DEBUG (MainThread) [custom_components.enphase_envoy.envoy_reader] Token is populated: ******REDACTED TOKEN DATA*****
> 2023-08-01 23:00:28.128 DEBUG (MainThread) [custom_components.enphase_envoy.envoy_reader] Token expires at: 2024-07-30 14:21:48
> 2023-08-01 23:00:28.128 DEBUG (MainThread) [custom_components.enphase_envoy.envoy_reader] HTTP GET Attempt #1: https://192.168.1.144/production.json?details=1: Header:{'Authorization': 'Bearer ******REDACTED TOKEN DATA*****'}
> 2023-08-01 23:00:58.130 ERROR (MainThread) [custom_components.enphase_envoy] Timeout fetching envoy Envoy 1221xxxxxx data
> 2023-08-01 23:00:58.135 DEBUG (MainThread) [custom_components.enphase_envoy] Finished fetching envoy Envoy 1221xxxxxxx data in 30.008 seconds (success: False)
> 2023-08-01 23:01:58.126 DEBUG (MainThread) [custom_components.enphase_envoy.envoy_reader] Checking Token value: 

From those production and consumption sensors in the graph, I use a template sensor to calculate excess production (i.e. power available for export). That too drops out (with a slight lag) for a short period.

The issue for me is not so much the original sensor values and the template sensor, because they come back after a minute. The issue is that the template sensor is the source for a Riemann sum sensor calculating export energy. It also goes off-line, but does not return until solar production resumes at about 06:40. (Bear in mind this is at 11pm, so zero solar production.)
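(For illustration only: a minimal sketch of what a Riemann sum / integration sensor conceptually does with the excess-production samples. This is not Home Assistant's actual `integration` platform code, and the sample values are made up; it just shows why the sum cannot advance while its source reports no usable values.)

```python
from datetime import datetime, timedelta

def riemann_energy_kwh(samples):
    """Trapezoidal Riemann sum: turn (timestamp, watts) samples into kWh.

    Conceptual sketch only; HA's integration (Riemann sum) sensor does the
    real work. When a sample is missing because the source was unavailable,
    no new area is added, so the total stays frozen until fresh values arrive.
    """
    total_wh = 0.0
    for (t0, w0), (t1, w1) in zip(samples, samples[1:]):
        if w0 is None or w1 is None:        # source sensor was unavailable
            continue
        hours = (t1 - t0).total_seconds() / 3600
        total_wh += (w0 + w1) / 2 * hours   # trapezoid area in Wh
    return total_wh / 1000

# Hypothetical excess-production samples, one minute apart, with one
# unavailable reading in the middle.
t = datetime(2023, 8, 1, 12, 0)
samples = [(t, 850.0),
           (t + timedelta(minutes=1), 900.0),
           (t + timedelta(minutes=2), None),
           (t + timedelta(minutes=3), 820.0)]
print(round(riemann_energy_kwh(samples), 6))  # only the first interval counts
```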

Downstream from there I have a number of child sensors to give me data on daily energy exports, export earnings, net energy costs, etc. All of those become unavailable after the Riemann sum sensor is unavailable for one minute. They return when solar production resumes.

The dropouts at 06:35 are happening prior to solar production resuming, so I see nothing different there.

I have a similar set-up for measuring grid import power, just with the sensors reversed in the formula. It experiences a similar template sensor and Riemann sum dropout, but that Riemann sum sensor comes back on; I am assuming because the template sensor is returning new and changing values.

I have tried a number of suggested solutions for the downstream issue, such as removing state_class and device_class entries from the Template Sensor YAML or using the "available" function in the sensor YAML. None of those things worked.

To try to isolate something, I set up the REST/RESTful sensor integration for the Envoy. It works without dropout. It uses the token rather than the Enphase username-and-password method to gain access to the Envoy.
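(For reference, a minimal sketch of the kind of token-authenticated request both this integration and a REST sensor make against the local Envoy. The URL and header mirror the logs above; the address, token and timeout are placeholders, and this is not the integration's actual code.)

```python
import asyncio
import httpx

ENVOY_URL = "https://192.168.1.144/production.json?details=1"  # endpoint seen in the logs
TOKEN = "eyJraWQi..."  # placeholder for the long-lived Enphase owner token

async def fetch_production() -> dict:
    # The Envoy serves a self-signed certificate, so verification is disabled.
    async with httpx.AsyncClient(verify=False, timeout=30) as client:
        resp = await client.get(
            ENVOY_URL, headers={"Authorization": f"Bearer {TOKEN}"}
        )
        resp.raise_for_status()
        return resp.json()

if __name__ == "__main__":
    print(asyncio.run(fetch_production()))
```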

Steve2017 commented 1 year ago

so the code change didn't do much then. It's probably your change to the Riemann logic

I haven't changed anything on the Riemann Sum entry. The only difference between the two is your code. However, if I look at a blow-up of one of the integration's sensors, the time-out is actually very slightly longer - running from 23:00:23 to 23:01:53


Previously the dropout ran closer to 60 seconds. On the REST sensors there is still no dropout that I can see.

catsmanac commented 1 year ago

the time-out is actually very slightly longer - running from 23:00:23 to 23:01:53

Yes, that is caused by the 3 attempts of 30 seconds each before it would give up. Without the code change it would give up after 30 seconds.

So it sounds like the code change made it worse rather than improving it.

cddu33 commented 1 year ago

In another script that I had coded for Jeedom, I used a very short timeout, because either the gateway responded within 10 s or it did not respond at all, so the long timeouts were useless. On the other hand, when that happened I put a 30 s pause between the two requests to avoid messages in the logs and disconnections.

Steve2017 commented 1 year ago

So it sounds like the code change made it worse rather than improving it.

It didn’t fix the cause (the dropout), but it did fix a symptom - the Riemann Sum sensors now restart.

If @cddu33 can bring in the other readings, it would enable Riemann Sum sensors to be avoided for all child sensors. That would allow you to revert the code change and the dropout could be ignored as a minor irritant. If the cause is in the Envoy firmware, we would just have to live with it.

catsmanac commented 1 year ago

Can't say I fully understand why the code change has this effect. @cddu33's experience gives an indication and it makes sense to reduce the timeout, but there's no real wait currently. Any chance you can get me another debug view of the event?

The 0.0.14-beta1 pre-release on @briancmpbll's site currently does not include the retries, as I was afraid of unexpected impact in other spots. For example, if you have a typo during configuration it will take 1.5 minutes before the configuration screen comes back with an error.

Steve2017 commented 1 year ago

That’s a debug for v0.0.13+12-beta-01 at 23:00 local?

That’s what I am currently running.

cddu33 commented 1 year ago

The bug is in the firmware, not the script.

catsmanac commented 1 year ago

That’s a debug for v0.0.13+12-beta-01 at 23:00 local?

Yes, that's the one with the retry

catsmanac commented 1 year ago

The bug is in the firmware, not the script.

That's correct, we're just trying to minimize the effect of the firmware bug by retrying in the script.

cddu33 commented 1 year ago

For example, my other script for Enphase v7: https://github.com/cddu33/enphase_secure/blob/beta/resources/enphasesecurd/enphasesecurd.py

It should be ensured that when the gateway has already sent data, the sensors are not put into an error state if it then fails to respond, unlike the case where it has never provided any information and the data is simply not available on this model. During the error, what data did you have in the logs for these sensors?
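(A minimal sketch of that suggestion, with hypothetical names; this is not how the integration currently behaves. The idea is to keep the last successful payload and reuse it when a poll fails, only treating the sensors as unavailable if no data has ever been fetched.)

```python
from typing import Any, Optional

import httpx

class CachedEnvoyFetcher:
    """Sketch: fall back to the last good payload when a poll fails."""

    def __init__(self, url: str, token: str) -> None:
        self._url = url
        self._token = token
        self._last_good: Optional[dict[str, Any]] = None

    async def poll(self) -> dict[str, Any]:
        try:
            async with httpx.AsyncClient(verify=False, timeout=15) as client:
                resp = await client.get(
                    self._url, headers={"Authorization": f"Bearer {self._token}"}
                )
                resp.raise_for_status()
                self._last_good = resp.json()
        except httpx.HTTPError:
            # Only raise (-> sensors unavailable) if we never had data at all;
            # otherwise silently reuse the previous reading for this cycle.
            if self._last_good is None:
                raise
        return self._last_good
```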

Steve2017 commented 1 year ago

Debug log for v0.0.13+12-beta-01 at 23:00. Riemann sum sensors have dropped out and not come back this time, so all dependent sensors are also unavailable.

2023-08-18 22:59:05.154 DEBUG (MainThread) [custom_components.enphase_envoy] Finished fetching envoy Envoy ---SERIAL NUMBER REDACTED--- data in 0.882 seconds (success: True)
2023-08-18 23:00:05.272 DEBUG (MainThread) [custom_components.enphase_envoy.envoy_reader] Checking Token value: yJraWQiOi (Only first 10 characters shown)
2023-08-18 23:00:05.272 DEBUG (MainThread) [custom_components.enphase_envoy.envoy_reader] Token is populated: yJraWQiOi (Only first 10 characters shown)
2023-08-18 23:00:05.272 DEBUG (MainThread) [custom_components.enphase_envoy.envoy_reader] Token expires at: 2024-08-10 16:20:04
2023-08-18 23:00:05.272 DEBUG (MainThread) [custom_components.enphase_envoy.envoy_reader] HTTP GET Attempt #1: https://192.168.1.144/production.json?details=1: use owner token: True: Header: <Authorization header with Token hidden> 
2023-08-18 23:00:30.756 INFO (MainThread) [homeassistant.components.mqtt.discovery] Component has already been discovered: sensor Acurite-986-2F-43031 Acurite-986-2F-43031-B, sending update
2023-08-18 23:00:30.763 INFO (MainThread) [homeassistant.components.mqtt.discovery] Component has already been discovered: sensor Acurite-986-2F-43031 Acurite-986-2F-43031-F, sending update
2023-08-18 23:00:35.275 ERROR (MainThread) [custom_components.enphase_envoy] Timeout fetching envoy Envoy ---SERIAL NUMBER REDACTED--- data
2023-08-18 23:00:35.278 DEBUG (MainThread) [custom_components.enphase_envoy] Finished fetching envoy Envoy ---SERIAL NUMBER REDACTED--- data in 30.006 seconds (success: False)
2023-08-18 23:01:35.272 DEBUG (MainThread) [custom_components.enphase_envoy.envoy_reader] Checking Token value: yJraWQiOi (Only first 10 characters shown)
2023-08-18 23:01:35.272 DEBUG (MainThread) [custom_components.enphase_envoy.envoy_reader] Token is populated: yJraWQiOi (Only first 10 characters shown)
2023-08-18 23:01:35.272 DEBUG (MainThread) [custom_components.enphase_envoy.envoy_reader] Token expires at: 2024-08-10 16:20:04
2023-08-18 23:01:35.272 DEBUG (MainThread) [custom_components.enphase_envoy.envoy_reader] HTTP GET Attempt #1: https://192.168.1.144/production.json?details=1: use owner token: True: Header: <Authorization header with Token hidden> 
2023-08-18 23:01:51.474 DEBUG (MainThread) [custom_components.enphase_envoy.envoy_reader] Fetched (1) from https://192.168.1.144/production.json?details=1: <Response [200 OK]>: {"production":[{"type":"inverters","activeCount":21,"readingTime":0,"wNow":0,"whLifetime":836008},{"type":"eim","activeCount":1,"measurementType":"production","readingTime":1692363742,"wNow":1.389,"whLifetime":14198145.035,"varhLeadLifetime":0.063,"varhLagLifetime":1965319.385,"vahLifetime":14883748.819,"rmsCurrent":0.211,"rmsVoltage":235.412,"reactPwr":49.645,"apprntPwr":49.645,"pwrFactor":0.11,"whToday":18309.035,"whLastSevenDays":153044.035,"vahToday":19459.819,"varhLeadToday":0.063,"varhLagToday":2996.385,"lines":[{"wNow":1.389,"whLifetime":14198145.035,"varhLeadLifetime":0.063,"varhLagLifetime":1965319.385,"vahLifetime":14883748.819,"rmsCurrent":0.211,"rmsVoltage":235.412,"reactPwr":49.645,"apprntPwr":49.645,"pwrFactor":0.11,"whToday":18309.035,"whLastSevenDays":153044.035,"vahToday":19459.819,"varhLeadToday":0.063,"varhLagToday":2996.385}]}],"consumption":[{"type":"eim","activeCount":1,"measurementType":"total-consumption","readingTime":1692363742,"wNow":685.196,"whLifetime":14812861.573,"varhLeadLifetime":6386563.595,"varhLagLifetime":-1692492.338,"vahLifetime":17374174.599,"rmsCurrent":3.993,"rmsVoltage":235.586,"reactPwr":-402.102,"apprntPwr":940.599,"pwrFactor":0.73,"whToday":23086.573,"whLastSevenDays":161917.573,"vahToday":22954.599,"varhLeadToday":10457.595,"varhLagToday":0.0,"lines":[{"wNow":685.196,"whLifetime":14812861.573,"varhLeadLifetime":6386563.595,"varhLagLifetime":-1692492.338,"vahLifetime":17374174.599,"rmsCurrent":3.993,"rmsVoltage":235.586,"reactPwr":-402.102,"apprntPwr":940.599,"pwrFactor":0.73,"whToday":23086.573,"whLastSevenDays":161917.573,"vahToday":22954.599,"varhLeadToday":10457.595,"varhLagToday":0.0}]},{"type":"eim","activeCount":1,"measurementType":"net-consumption","readingTime":1692363742,"wNow":683.807,"whLifetime":615524.33,"varhLeadLifetime":6386563.658,"varhLagLifetime":272827.046,"vahLifetime":17374174.599,"rmsCurrent":3.782,"rmsVoltage":235.586,"reactPwr":-451.748,"apprntPwr":890.992,"pwrFactor":0.77,"whToday":0,"whLastSevenDays":0,"vahToday":0,"varhLeadToday":0,"varhLagToday":0,"lines":[{"wNow":683.807,"whLifetime":615524.33,"varhLeadLifetime":6386563.658,"varhLagLifetime":272827.046,"vahLifetime":17374174.599,"rmsCurrent":3.782,"rmsVoltage":235.586,"reactPwr":-451.748,"apprntPwr":890.992,"pwrFactor":0.77,"whToday":0,"whLastSevenDays":0,"vahToday":0,"varhLeadToday":0,"varhLagToday":0}]}],"storage":[{"type":"acb","activeCount":0,"readingTime":0,"wNow":0,"whNow":0,"state":"idle"}]}
2023-08-18 23:01:51.475 DEBUG (MainThread) [custom_components.enphase_envoy.envoy_reader] HTTP GET Attempt #1: https://192.168.1.144/ivp/ensemble/inventory: use owner token: True: Header: <Authorization header with Token hidden> 
2023-08-18 23:01:51.733 DEBUG (MainThread) [custom_components.enphase_envoy.envoy_reader] Fetched (1) from https://192.168.1.144/ivp/ensemble/inventory: <Response [200 OK]>: []
catsmanac commented 1 year ago

The timeout was never caught inside the collection function as an exception. I've seen more reports of this occurring with httpx. So it's still the same issue: a timeout after 30 seconds, and then 1 minute later the next collection cycle runs and gets the data. By then 90 seconds have passed.

The Riemann sum may show again tomorrow after getting non-zero data again. Not sure if there are more options.

catsmanac commented 1 year ago

It should be ensured that when the gateway has already sent data, the sensors are not put into an error state if it then fails to respond.

Our code is called by HA every 60 seconds (or whatever frequency is configured). If an error is raised, HA sets the data to unavailable and tries again the next cycle.

There are various try/except blocks, but this timeout does not trigger them and the code exits the collection cycle.
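(To illustrate the pattern being described: a rough sketch of how a Home Assistant DataUpdateCoordinator drives the polling. Names such as `envoy_reader.getData()` are placeholders, not the integration's actual code; the point is that any exception escaping the update method marks all of the coordinator's entities unavailable until the next successful cycle.)

```python
import asyncio
import logging
from datetime import timedelta

import async_timeout
from homeassistant.helpers.update_coordinator import DataUpdateCoordinator, UpdateFailed

_LOGGER = logging.getLogger(__name__)

async def async_setup_coordinator(hass, envoy_reader):
    async def async_update_data():
        try:
            # Overall guard around one full collection cycle (all pages).
            async with async_timeout.timeout(30):
                return await envoy_reader.getData()  # placeholder for the real calls
        except asyncio.TimeoutError as err:
            # Raising here is what flips every entity to "unavailable"
            # until the next successful cycle.
            raise UpdateFailed(f"Timeout communicating with Envoy: {err}") from err

    coordinator = DataUpdateCoordinator(
        hass,
        _LOGGER,
        name="envoy",
        update_method=async_update_data,
        update_interval=timedelta(seconds=60),  # "called every 60 seconds"
    )
    await coordinator.async_config_entry_first_refresh()
    return coordinator
```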

catsmanac commented 1 year ago

Oh, I think I may have found the underlying issue (again; I thought I had found the final one before :-( ).

The overall data collection loop is called within a 30-second timer. If the whole data collection runs longer than 30 seconds, it times out. We have a 30-second timeout on the GET calls as well, but the overall timer just ends the whole collection.

As @cddu33 mentions, we can probably use a shorter timeout on the GETs, let a retry happen, and increase the overall timer a bit.

catsmanac commented 1 year ago

I've updated the timing around data collection, retries and get timeouts. Update from https://github.com/catsmanac/home_assistant_custom_envoy using HACS.

The overall timeout is now 40 seconds, the timeout on each GET is 15 seconds, and there is 1 retry if a timeout or error occurs. So for the 23:00 issue we should at least see it trying again when debug is on.
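(A simplified sketch of that timing scheme, assuming an outer timeout wrapped around per-request timeouts with a single retry. It is not the actual envoy_reader implementation; the constants are the values described above, and it shows why the outer timer still wins if retries push the cycle past it.)

```python
import async_timeout
import httpx

OVERALL_TIMEOUT = 40  # seconds for the whole collection cycle
GET_TIMEOUT = 15      # seconds for each individual GET
RETRIES = 1           # one extra attempt on timeout or error

async def fetch_with_retry(client: httpx.AsyncClient, url: str) -> httpx.Response:
    # Simplified stand-in for the integration's fetch_with_retry.
    for attempt in range(1 + RETRIES):
        try:
            resp = await client.get(url, timeout=GET_TIMEOUT)
            resp.raise_for_status()
            return resp
        except httpx.HTTPError:
            if attempt == RETRIES:
                raise  # second failure: give up for this cycle

async def collect(urls: list[str]) -> list[httpx.Response]:
    async with httpx.AsyncClient(verify=False) as client:
        # If retries push the total past OVERALL_TIMEOUT, this outer
        # timer cancels the whole collection anyway.
        async with async_timeout.timeout(OVERALL_TIMEOUT):
            return [await fetch_with_retry(client, url) for url in urls]
```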


Steve2017 commented 1 year ago

Update installed. Hopefully I’ll remember debug tonight.

catsmanac commented 1 year ago

Should be no problem if you forget, the error itself is written as a warning which will always be in the log. And that will tell us what happened as well.

Steve2017 commented 12 months ago

OK - good to know.

Not related to this issue, but... I have been doing more comparisons between the Enphase integration and the RESTful data. The integration sensors for daily data (consumption and production) click over at 00:14, not right on midnight. (I was checking to see if it happened at 23:00.) The utility meter sensors (daily cycle) I am using for RESTful click over very close to midnight.

The daily energy data for consumption is slightly higher for RESTful, about 0.4 kWh per day. That is based on only 3 days, but it should account for the different reset times. I can only assume the 15-second RESTful scan makes a difference.

The data for daily solar production is either identical or within 0.05 kWh, with the Enphase integration showing the higher reading where there is a difference. The difference might be in the RESTful sensor, where it eliminates the phantom production that is actually the inverters' power draw after sundown.

Steve2017 commented 12 months ago
2023-08-19 22:59:46.118 DEBUG (MainThread) [custom_components.enphase_envoy] Finished fetching envoy Envoy 122135007569 data in 4.871 seconds (success: True)
2023-08-19 23:00:29.988 INFO (MainThread) [homeassistant.components.mqtt.discovery] Component has already been discovered: sensor Acurite-986-2F-43031 Acurite-986-2F-43031-B, sending update
2023-08-19 23:00:29.994 INFO (MainThread) [homeassistant.components.mqtt.discovery] Component has already been discovered: sensor Acurite-986-2F-43031 Acurite-986-2F-43031-F, sending update
2023-08-19 23:00:46.248 DEBUG (MainThread) [custom_components.enphase_envoy.envoy_reader] Checking Token value: yJraWQiOi (Only first 10 characters shown)
2023-08-19 23:00:46.248 DEBUG (MainThread) [custom_components.enphase_envoy.envoy_reader] Token is populated: yJraWQiOi (Only first 10 characters shown)
2023-08-19 23:00:46.248 DEBUG (MainThread) [custom_components.enphase_envoy.envoy_reader] Token expires at: 2024-08-10 16:20:04
2023-08-19 23:00:46.248 DEBUG (MainThread) [custom_components.enphase_envoy.envoy_reader] HTTP GET Attempt #1: https://192.168.1.144/production.json?details=1: use owner token: True: Header: <Authorization header with Token hidden> 
2023-08-19 23:01:01.708 WARNING (MainThread) [custom_components.enphase_envoy.envoy_reader] Error in fetch_with_retry, try closing connection: 
2023-08-19 23:01:01.713 DEBUG (MainThread) [custom_components.enphase_envoy.envoy_reader] HTTP GET Attempt #2: https://192.168.1.144/production.json?details=1: use owner token: True: Header: <Authorization header with Token hidden> 
2023-08-19 23:01:17.325 WARNING (MainThread) [custom_components.enphase_envoy.envoy_reader] Error in fetch_with_retry, raising: 
2023-08-19 23:01:17.327 ERROR (MainThread) [custom_components.enphase_envoy] Error fetching envoy Envoy 122135007569 data: Error communicating with API: 
2023-08-19 23:01:17.327 DEBUG (MainThread) [custom_components.enphase_envoy] Finished fetching envoy Envoy 122135007569 data in 31.079 seconds (success: False)
2023-08-19 23:02:17.248 DEBUG (MainThread) [custom_components.enphase_envoy.envoy_reader] Checking Token value: yJraWQiOi (Only first 10 characters shown)
2023-08-19 23:02:17.248 DEBUG (MainThread) [custom_components.enphase_envoy.envoy_reader] Token is populated: yJraWQiOi (Only first 10 characters shown)
2023-08-19 23:02:17.248 DEBUG (MainThread) [custom_components.enphase_envoy.envoy_reader] Token expires at: 2024-08-10 16:20:04
2023-08-19 23:02:17.248 DEBUG (MainThread) [custom_components.enphase_envoy.envoy_reader] HTTP GET Attempt #1: https://192.168.1.144/production.json?details=1: use owner token: True: Header: <Authorization header with Token hidden> 
catsmanac commented 12 months ago

Thanks @Steve2017, well, it confirms the change in timing will allow for an actual retry. Too bad the retry failed as well. It was done with the 2 attempts within 31 seconds, but neither the first attempt nor the retry spat out the actual error. Both took the allotted 15 seconds. In between I tried closing the connection, but that clearly did not fix it.

Next try will be @cddu33's recommended 30-second wait between the 2 attempts.

catsmanac commented 12 months ago

Just released 0.0.14+2 in another attempt. Maybe I went a little overboard by making all the timing configurable...

By default this version will use a timeout of 12 seconds, then wait 30 seconds and try once more.

To fine-tune what may work, use the settings option for the Envoy. Be aware that the bottom item, Overall timeout, is the time after which the whole data collection is cancelled. I.e., if you increase the retry count or the wait time between tries and exceed the overall timeout, all data will be considered timed out by this timer.
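(Extending the earlier retry sketch with that holdoff: a hedged illustration of waiting between the two attempts, not the actual implementation. The 12-second timeout and 30-second wait are the defaults described above.)

```python
import asyncio
import httpx

GET_TIMEOUT = 12   # default per-request timeout [s]
HOLDOFF = 30       # default wait between the two attempts [s]
RETRIES = 1

async def fetch_with_holdoff(client: httpx.AsyncClient, url: str) -> httpx.Response:
    for attempt in range(1 + RETRIES):
        try:
            resp = await client.get(url, timeout=GET_TIMEOUT)
            resp.raise_for_status()
            return resp
        except httpx.HTTPError:
            if attempt == RETRIES:
                raise
            await asyncio.sleep(HOLDOFF)  # give the Envoy time to recover
```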


I simulated a timeout by unplugging my Envoy from the network, using an 8-second timeout and a 30-second wait time. It's not an exact test of the problem, but it shows what the debug logs will look like:

2023-08-20 11:55:25.097 DEBUG (MainThread) [custom_components.enphase_envoy.envoy_reader] HTTP GET Attempt #1: https://192.168.3.112/api/v1/production: use owner token: True: Header: <Authorization header with Token hidden> 
2023-08-20 11:55:33.104 WARNING (MainThread) [custom_components.enphase_envoy.envoy_reader] HTTP Timeout in fetch_with_retry, waiting a bit: 
2023-08-20 11:56:03.136 DEBUG (MainThread) [custom_components.enphase_envoy.envoy_reader] HTTP GET Attempt #2: https://192.168.3.112/api/v1/production: use owner token: True: Header: <Authorization header with Token hidden> 
2023-08-20 11:56:11.144 WARNING (MainThread) [custom_components.enphase_envoy.envoy_reader] HTTP Timeout in fetch_with_retry, raising: 
2023-08-20 11:56:11.149 ERROR (MainThread) [custom_components.enphase_envoy] Error fetching envoy Envoy 122302045041 data: Error communicating with API: 
2023-08-20 11:56:11.150 DEBUG (MainThread) [custom_components.enphase_envoy] Finished fetching envoy Envoy 122302045041 data in 46.054 seconds (success: False)
Steve2017 commented 12 months ago

Installed - will leave the config at the default:

Time between entity updates [s]: 60
Timeout for getting single Envoy data page [s], minimum 5: 10
How many retries in getting single Envoy data page, minimum 1: 1
Time between 2 retries to get single Envoy data page [s], minimum 0: 30
Overall timeout on getting all Envoy data pages [s], minimum 30: 55

Steve2017 commented 12 months ago

Still off-line for 60 seconds. Riemann Sum sensors now unavailable. Lost cause???

2023-08-20 22:58:55.402 DEBUG (MainThread) [custom_components.enphase_envoy] Finished fetching envoy Envoy 122135007569 data in 1.216 seconds (success: True)
2023-08-20 22:59:55.186 DEBUG (MainThread) [custom_components.enphase_envoy.envoy_reader] Checking Token value: yJraWQiOi (Only first 10 characters shown)
2023-08-20 22:59:55.187 DEBUG (MainThread) [custom_components.enphase_envoy.envoy_reader] Token is populated: yJraWQiOi (Only first 10 characters shown)
2023-08-20 22:59:55.187 DEBUG (MainThread) [custom_components.enphase_envoy.envoy_reader] Token expires at: 2024-08-10 16:20:04
2023-08-20 22:59:55.187 DEBUG (MainThread) [custom_components.enphase_envoy.envoy_reader] HTTP GET Attempt #1: https://192.168.1.144/production.json?details=1: use owner token: True: Header: <Authorization header with Token hidden> 
2023-08-20 23:00:07.479 WARNING (MainThread) [custom_components.enphase_envoy.envoy_reader] HTTP Timeout in fetch_with_retry, waiting a bit: 
2023-08-20 23:00:37.488 DEBUG (MainThread) [custom_components.enphase_envoy.envoy_reader] HTTP GET Attempt #2: https://192.168.1.144/production.json?details=1: use owner token: True: Header: <Authorization header with Token hidden> 
2023-08-20 23:00:50.180 WARNING (MainThread) [custom_components.enphase_envoy.envoy_reader] HTTP Timeout in fetch_with_retry, raising: 
2023-08-20 23:00:50.181 ERROR (MainThread) [custom_components.enphase_envoy] Error fetching envoy Envoy REDACTED data: Error communicating with API: 
2023-08-20 23:00:50.181 DEBUG (MainThread) [custom_components.enphase_envoy] Finished fetching envoy Envoy REDACTED data in 54.995 seconds (success: False)
2023-08-20 23:01:50.186 DEBUG (MainThread) [custom_components.enphase_envoy.envoy_reader] Checking Token value: yJraWQiOi (Only first 10 characters shown)
2023-08-20 23:01:50.186 DEBUG (MainThread) [custom_components.enphase_envoy.envoy_reader] Token is populated: yJraWQiOi (Only first 10 characters shown)
2023-08-20 23:01:50.187 DEBUG (MainThread) [custom_components.enphase_envoy.envoy_reader] Token expires at: 2024-08-10 16:20:04
2023-08-20 23:01:50.187 DEBUG (MainThread) [custom_components.enphase_envoy.envoy_reader] HTTP GET Attempt #1: https://192.168.1.144/production.json?details=1: use owner token: True: Header: <Authorization header with Token hidden> 
2023-08-20 23:01:58.320 DEBUG (MainThread) [custom_components.enphase_envoy.envoy_reader] Fetched (1 of 2) from https://192.168.1.144/production.json?details=1: <Response [200 OK]>: 
catsmanac commented 12 months ago

Lost cause???

Yes, I'm out of ideas now.

Whatever we try within a single collection cycle seems fruitless. Since it will get data on the next cycle, the effect is limited, but ugly with the Riemann sum until new values come in. I'm open to any good suggestion, but I feel we have exhausted our options.

Steve2017 commented 12 months ago

Close it as unresolved.

catsmanac commented 12 months ago

Any new ideas to try @Steve2017? What you might try is increasing the wait time between tries. Make sure to increase the overall timeout accordingly.

Steve2017 commented 12 months ago

I can try, but I keep wondering if we are trying to solve a problem that is programmed into the Envoy. (When I say we I really mean you and others because my programming skills = 0.0)

Until we see whether it is possible for this integration to take data from "ivp/meters/readings" that thought will remain. I just wonder whether Enphase resets the processing unit compiling the "/production" data each night. It might have always done this, but this only became apparent when the HA developers changed something in the Riemann Sum code a few months ago?

catsmanac commented 12 months ago

I think you are right with your analysis. I was just checking whether you had an aha moment, since you re-opened the issue. For me there's no need to retry, but if you had the courage to try again I wanted to suggest some options.

I hope to add the ivp/meters/readings soon. I plan to release 0.0.15 today or tomorrow with a fix for the battery issue in 0.0.14 and configurable timing (needed for slow Envoys as well; I'll probably change the defaults to something more generic), and probably one other item. Then on to 0.0.16 with the ivp meters by @cddu33; I hope to have test versions in a few days.

Steve2017 commented 12 months ago

No ah-ha. Just wanted to make sure people could see it is not a new problem, and that it is not critical. Their system will still work, but Riemann Sum sensors (& the child sensors) will go off-line until fresh data arrives.

catsmanac commented 12 months ago

@Steve2017, meters/readings is out for test. In HACS install from https://github.com/catsmanac/home_assistant_custom_envoy, enable beta versions and select V0.0.14-ImportExportMeters.

The readme there is updated with the new entities.

This version also has configurable timeouts set to more relaxed defaults, but if you specified these it will use the values found in your config.

Steve2017 commented 11 months ago

Hi @catsmanac I'm assuming from one of the other posts I should switch over to: https://github.com/briancmpbll/home_assistant_custom_envoy/releases/tag/v0.0.17

I might have a workaround for the 23:00 dropouts. Two days ago, I changed the scan period from 60 seconds to 150 seconds. I have just checked the history for the past two nights and the longer scan period seems to overcome the dropouts on all Envoy-supplied sensors. Instead of the dropout on production (which caused the Riemann sum issue), it reverted to showing phantom production (see the solar production chart). This is from two nights ago, where I got two blips. Last night there was one. Every other sensor I checked shows no dropouts at 23:00.

What I don't know is whether this avoids the Riemann sum sensors issue, because I have disabled them. I have used the REST sensors with a utility meter sensor. I have just re-enabled a Riemann sum test sensor, so I will see tonight.

Those two peaks last for about 2 minutes so a shorter scan period might also work.

Of course, the downside of the 150 second interval is you don't get as up-to-date net energy data.

Steve2017 commented 11 months ago

I might have spoken too soon. Although I checked all the Envoy-sourced sensors, the child sensors of those entities are still showing a dropout. The source sensors look okay, but not the ones downstream.

EDIT: Update at 23:25. The Riemann sum sensor relying on solar power data did not become unavailable, as it would have in the past. So a longer scan period overcomes the problem of dropouts and the Riemann sum. I'll try with a 120-second scan interval and see how that goes tomorrow.

catsmanac commented 11 months ago

Hi @Steve2017, it remains a puzzle what is really going on. The blips make no sense either. The 150-second interval could avoid that danger minute completely, but then why the blips? And the first blip seems to be at 23:04; why later than the usual danger minute? Are those values actually coming from the Envoy? That would require a debug log to inspect. And if the scan interval is 150 seconds, why are the blips 2 minutes wide with 1 minute in between?

The v0.0.17 version has the import/export added from /ivp/meters/readings. But as far as timing is concerned it has the same options as v0.0.15 and v0.0.16.

Be aware that HA Core will be releasing their updated version in September, and it uses a rewritten communication layer that may or may not help with this too. The v0.0.17 import/export is not part of HA Core yet, and when it gets added in a future version it may differ from this one and may cause the import/export history to start fresh on new entities/variables. So depending on how important that continuity is, you may want to hold back on installing. On the other hand, getting some data to see what it can bring can be considered better than no data at all. Just so you are aware of this.

Steve2017 commented 11 months ago

Are those values actually coming from the Envoy?

I’m not sure if your question is rhetorical, but that is one of the native sensors in the integration. I’ll run debug tomorrow. I was surprised at the two.

I got a single blip from the reading half an hour ago. The 1-watt equates with the normal power drain I see on the Envoy home page after sunset (hence phantom production)

Steve2017 commented 11 months ago

Back to the two-minute scan interval, and I can see a dropout and the Riemann sum sensor is unavailable.

2023-08-31 22:58:29.334 DEBUG (MainThread) [custom_components.enphase_envoy] Finished fetching envoy Envoy  -redacted-  data in 1.965 seconds (success: True)

2023-08-31 23:00:29.370 DEBUG (MainThread) [custom_components.enphase_envoy.envoy_reader] Checking Token value: yJraWQiOi (Only first 10 characters shown)
2023-08-31 23:00:29.370 DEBUG (MainThread) [custom_components.enphase_envoy.envoy_reader] Token is populated: yJraWQiOi (Only first 10 characters shown)
2023-08-31 23:00:29.370 DEBUG (MainThread) [custom_components.enphase_envoy.envoy_reader] Token expires at: 2024-08-10 16:20:04
2023-08-31 23:00:29.371 DEBUG (MainThread) [custom_components.enphase_envoy.envoy_reader] HTTP GET Attempt #1 of 2: https://192.168.1.144/production.json?details=1: use token: True: Header: <Token hidden>  Timeout: 30 Holdoff: 0
2023-08-31 23:00:59.660 WARNING (MainThread) [custom_components.enphase_envoy.envoy_reader] HTTP Timeout in fetch_with_retry, waiting 0 sec: 
2023-08-31 23:00:59.665 DEBUG (MainThread) [custom_components.enphase_envoy.envoy_reader] HTTP GET Attempt #2 of 2: https://192.168.1.144/production.json?details=1: use token: True: Header: <Token hidden>  Timeout: 30 Holdoff: 0
2023-08-31 23:01:24.372 ERROR (MainThread) [custom_components.enphase_envoy] Timeout fetching envoy Envoy  -redacted-  data
2023-08-31 23:01:24.372 DEBUG (MainThread) [custom_components.enphase_envoy] Finished fetching envoy Envoy  -redacted-  data in 55.002 seconds (success: False)

2023-08-31 23:03:24.369 DEBUG (MainThread) [custom_components.enphase_envoy.envoy_reader] Checking Token value: yJraWQiOi (Only first 10 characters shown)
2023-08-31 23:03:24.370 DEBUG (MainThread) [custom_components.enphase_envoy.envoy_reader] Token is populated: yJraWQiOi (Only first 10 characters shown)
2023-08-31 23:03:24.370 DEBUG (MainThread) [custom_components.enphase_envoy.envoy_reader] Token expires at: 2024-08-10 16:20:04
2023-08-31 23:03:24.370 DEBUG (MainThread) [custom_components.enphase_envoy.envoy_reader] HTTP GET Attempt #1 of 2: https://192.168.1.144/production.json?details=1: use token: True: Header: <Token hidden>  Timeout: 30 Holdoff: 0
2023-08-31 23:03:24.839 DEBUG (MainThread) [custom_components.enphase_envoy.envoy_reader] Fetched (1 of 2) in 0.5 sec from https://192.168.1.144/production.json?details=1: <Response [200 OK]>: {"production":

I'm giving up on this one.

catsmanac commented 11 months ago

Fully understand, this remains mind-blowing...

The only thing I can currently think of to determine whether it's the overall Envoy or just the production page is to swap the order of collection in envoy_reader.py

from:

    async def _update_from_pc_endpoint(self):
        """Update from PC endpoint."""
        await self._update_endpoint(
            "endpoint_production_json_results", ENDPOINT_URL_PRODUCTION_JSON
        )
        await self._update_endpoint(
            "endpoint_ensemble_json_results", ENDPOINT_URL_ENSEMBLE_INVENTORY
        )

to:

    async def _update_from_pc_endpoint(self):
        """Update from PC endpoint."""
        await self._update_endpoint(
            "endpoint_ensemble_json_results", ENDPOINT_URL_ENSEMBLE_INVENTORY
        )
        await self._update_endpoint(
            "endpoint_production_json_results", ENDPOINT_URL_PRODUCTION_JSON
        )

and see if it bumps on ensemble inventory or still on production.

But I understand if you abandon all attempts for now; I have given up before too.

catsmanac commented 11 months ago

@Steve2017, with v0.0.18 installed you can now opt not to use the production page anymore, using the configure option on the Envoy integration page, then reload the Envoy. It will make the today's and last-7-days numbers unavailable, but it might solve the daily 11 PM dropout by avoiding the production page.


Steve2017 commented 11 months ago

Tks. I have v 0.0.18 installed using default settings - and I still got the 23:00 dropout. I’ve ticked the check-box and reloaded so I’ll see how it goes tonight.

If I recall, v0.0.17 had the daily export and import data? (Called “index”.) I liked that.

catsmanac commented 11 months ago

In v0.0.18 the import and export is renamed to 'Lifetime Net Energy Consumption' and 'Lifetime Net Energy Production'. You can use these in the energy dashboard for grid import and export.

Steve2017 commented 11 months ago

Tks

They are not the most intuitive labels (at least for me). Lifetime Net Energy Production is not producing any data for me - it is flat-lined at zero. I am getting data for the Lifetime Net Energy Consumption so it will work with a utility meter.

I’ve tried with the “do Not Use Production” checkbox ticked, but I found the loss of today’s consumption and production is inconvenient, so I’ll revert to using “Production”. That means the above sensors will not be available, which is not an issue for me.

catsmanac commented 11 months ago

Label names are indeed not intuitive in the same way for everyone, as I've learned with this one. HA allows you to assign your own labels that will show instead.

The lifetime net energy production and consumption come from the consumption CT in solar-included mode. If you don't have it installed it will be flat, I guess, or even bogus data. If you do have a consumption CT installed then we may have an issue, unless all solar production is consumed directly by the house.

The net production and consumption numbers are also available when "do not use production" is unchecked, but in your case with the same result: flat-lined at zero. The only data used from the production page is Today's and Last 7 Days; all other data now comes from the ivp/meters/reports page, regardless of the checkbox.

catsmanac commented 11 months ago

There are more reports of net production being zero; it seems newer firmware behaves differently from the firmware one of the original authors had.

Steve2017 commented 10 months ago

It might depend on how the CT clamps are configured. I have a CT clamp on the production side and that is straightforward (aside from the phantom production after sundown).

The consumption CT measures load with solar production included (see the attached wiring diagram), so anything else requires some calculation by the Envoy. The alternative is load-only (second diagram).

catsmanac commented 10 months ago

Right, that's correct. The issue was reported for a metered Envoy with a production CT and a consumption CT with solar included. The ivp/meters/readings page reported as before, but the ivp/meters/reports page changed its behavior over a firmware upgrade.

catsmanac commented 10 months ago

I've updated the code to get the net-production and net-consumption data from ivp/meters/readings so that both show as positive numbers for use with the energy dashboard, rather than as a negative net consumption. See #164. There is no need to use the "do Not Use Production" option for this.
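(A minimal sketch of the sign handling described here. The function name and sign convention are illustrative assumptions, not the actual ivp/meters/readings fields: a single signed net reading is split into two non-negative series, since the energy dashboard expects separate grid import and export values.)

```python
def split_net_reading(net_watts: float) -> tuple[float, float]:
    """Split a signed net-consumption value into non-negative
    (grid_import_w, grid_export_w), assuming positive means drawing
    from the grid. Sketch only; the real endpoint fields may differ."""
    grid_import = max(net_watts, 0.0)
    grid_export = max(-net_watts, 0.0)
    return grid_import, grid_export

# Examples: 683.8 W net draw is all import; -1200 W is all export.
print(split_net_reading(683.8))    # (683.8, 0.0)
print(split_net_reading(-1200.0))  # (0.0, 1200.0)
```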