ksheumaker / homeassistant-apsystems_ecur

Home Assistant custom component for local querying of APSystems ECU-R Solar System
Apache License 2.0

1.2.30b breaks integration #213

Closed marcusds closed 6 months ago

marcusds commented 7 months ago

After updating to 1.2.30 today, I am now getting the error: Using cached data from last successful communication from ECU. Invalid data error: ECU returned 0 for lifetime energy, this is either a glitch from the ECU or a brand new installed ECU. Raw Data=b''

Downgrading to 1.2.29 resolved the issue, though it took more restarts than I expected; that may just be a HACS quirk.

There is nothing else interesting in the logs, so I am not sure what other info to give.

Firmware: ECU_R_PRO_2.1.15

coldenvy commented 7 months ago

Same here, I ran into the same issue. Thanks for confirming.

HAEdwin commented 7 months ago

I removed version 1.2.30 right now but kept the beta version. If you see an error in the log, please put the integration into debug mode. That way I'm better able to see what's going wrong. This issue, and also the firmware issue with firmware 1.3.6C, will keep me busy for a while :( As always, things are working fine for my good old ECU-R.

Robsonator commented 7 months ago

Same problem for me. In debug mode I got the following message:

2023-11-25 16:33:11.263 WARNING (SyncWorker_1) [custom_components.apsystems_ecur] Using cached data from last successful communication from ECU. Invalid data error: ECU returned 0 for lifetime energy, this is either a glitch from the ECU or a brand new installed ECU. Raw Data=b''

niouniou49 commented 7 months ago

Same here with 1.2.30: the communication to the ECU-R breaks after a few minutes. I rolled back to .29 and will see... Any ideas?

HAEdwin commented 7 months ago

Yesterday I released an updated version of 1.2.30 with fewer changes left in it, such as the keepalive. Would you like to give this version a try? Please provide enough info (debug log, firmware version). Thank you for testing!

niouniou49 commented 7 months ago

The same problem continues... After a few hours with the latest release it breaks, and I need to power-cycle the ECU :-) I rolled back to .29 for a few hours and will let you know. My ECU-R is from 2020 and its number starts with 2160xxxxx.

CarlosGS commented 7 months ago

Same problem, here's some debugging info:

However, it did stay connected to the internet, and it still sent data to the official APsystems server/app. This points to the ECU entering some odd state where it just can't accept connections on port 8899.

Maybe the new code is leaving TCP connections open, or doing something different that could be messing with the ECU's TCP server? :thinking:

Here's the diff in case someone can give it a try: https://github.com/ksheumaker/homeassistant-apsystems_ecur/compare/v1.2.29...v1.2.30#files_bucket

At first glance I only see a new keepalive flag for the socket; maybe that could fill up the ECU's TCP client queue.
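To check the "can't accept connections on port 8899" theory independently of the integration, a minimal sketch like the following could classify a single TCP connect attempt. The function name `probe_tcp_port` and the outcome labels are made up for illustration; only the port number 8899 comes from this thread.

```python
import socket


def probe_tcp_port(host: str, port: int = 8899, timeout: float = 5.0) -> str:
    """Try one TCP connect and describe the outcome as a short label.

    Hypothetical helper, not part of the integration.
    """
    try:
        # create_connection opens the socket; the with-block closes it again,
        # so the probe never leaves a connection hanging on the ECU.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The host answered, but nothing is listening on that port.
        return "refused"
    except socket.timeout:
        # No answer at all within `timeout` seconds.
        return "timeout"
    except OSError as err:
        # Anything else, e.g. "[Errno 113] Host is unreachable".
        return f"error: {err}"
```

Calling `probe_tcp_port("<ECU IP>")` while the integration shows cached data would distinguish a refused connection from a timeout or an unreachable host, which are different failure modes.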

HAEdwin commented 7 months ago

Thanks @CarlosGS, you might be on the right track. If you are able to modify the file like so, you might be able to rule out the error. I'm not able to reproduce the error; every ECU model/firmware seems to act differently.

[image]

Ultimately, the intention was to use keepalive to eliminate opening and closing between queries. If you are able to troubleshoot, you could try the combination of using keepalive and a single open and close command, if you understand what I mean.
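The "keepalive plus a single open and close" idea could look roughly like the sketch below. This is not the integration's code, and the change shown in the image is not reproduced here: the class name `PersistentQuerySocket`, its parameters, and the one-send-one-receive exchange are assumptions for illustration. The only standard piece is the `SO_KEEPALIVE` socket option.

```python
import socket


class PersistentQuerySocket:
    """Hypothetical helper: one TCP connection reused for several queries."""

    def __init__(self, host: str, port: int = 8899, timeout: float = 10.0,
                 keepalive: bool = True):
        self.host = host
        self.port = port
        self.timeout = timeout
        self.keepalive = keepalive
        self.sock = None

    def __enter__(self):
        # Single open: the connection is established exactly once.
        self.sock = socket.create_connection((self.host, self.port),
                                             timeout=self.timeout)
        if self.keepalive:
            # Ask the OS to send TCP keepalive probes on an idle connection.
            self.sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
        return self

    def query(self, payload: bytes, bufsize: int = 1024) -> bytes:
        # Assumes one send is answered by one recv-sized reply.
        self.sock.sendall(payload)
        return self.sock.recv(bufsize)

    def __exit__(self, exc_type, exc, tb):
        # Single close: runs even if a query raised, so no socket is leaked.
        self.sock.close()
        self.sock = None
```

Using it as `with PersistentQuerySocket(ip) as ecu:` and sending all queries inside the block keeps the open and close paired, which also makes it easy to test with `keepalive=False` to see whether the flag itself matters.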

niouniou49 commented 7 months ago

I commented out the line with the keepalive, as in your example, and will let you know whether it works correctly after that.

TheRealCryss commented 7 months ago

Commenting out that line didn't work, for me at least.

I get a 'host unreachable' message when trying the terminal test connection with 'nc'. The IP is correct and I see it in my router. The router was working for an hour earlier in the day and it received some power on the panels. It is also not sending any data to the cloud service anymore. Somehow it got stuck.

HAEdwin commented 7 months ago

@TheRealCryss Have you just installed the integration (new user), or are you already a user? Please add the type of ECU and the firmware version.

TheRealCryss commented 7 months ago

ECU-B (2163000010433 Storage No). Can't provide the firmware version right now as I'm not at home until Sunday evening.

What do you mean by new user or already a user? I've reinstalled the integration several times today after it stopped working early this morning.

I already had the issue yesterday, when I set it up for the first time and it stopped working after some hours. I fixed it temporarily (until today, it seems) by a mixture of reinstalling the integration and rebooting the ECU. But I don't have physical access right now. Can I provide you with any log files? (I guess they may be lost due to the reinstallation.)

HAEdwin commented 7 months ago

@TheRealCryss If you've reinstalled the software version 1.2.30 and have applied the possible patch at line 88 you will have to restart HA. Not sure if you've done that.

TheRealCryss commented 7 months ago

Hm. I did. I'll try again. Still no connection.

[Screenshot, 2023-12-01 16:15:13]

`Logger: custom_components.apsystems_ecur.config_flow
Source: custom_components/apsystems_ecur/config_flow.py:30
Integration: APSystems PV solar ECU (documentation, issues)
First occurred: 16:06:17 (8 occurrences)
Last logged: 16:10:47

APSystemsInvalidData exception: timed out
APSystemsInvalidData exception: [Errno 113] Host is unreachable

Traceback (most recent call last):
  File "/config/custom_components/apsystems_ecur/APSystemsSocket.py", line 90, in open_socket
    self.sock.connect((self.ipaddr, self.port))
TimeoutError: timed out

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/config/custom_components/apsystems_ecur/config_flow.py", line 30, in async_step_user
    test_query = await self.hass.async_add_executor_job(ap_ecu.query_ecu)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/config/custom_components/apsystems_ecur/APSystemsSocket.py", line 97, in query_ecu
    self.open_socket()
  File "/config/custom_components/apsystems_ecur/APSystemsSocket.py", line 93, in open_socket
    raise APSystemsInvalidData(err)
custom_components.apsystems_ecur.APSystemsSocket.APSystemsInvalidData: timed out`
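For readers wondering why the log shows two tracebacks joined by "During handling of the above exception, another exception occurred": that is what Python prints when a new exception is raised inside an `except` block. The sketch below reproduces the pattern using the names visible in the traceback (`open_socket`, `APSystemsInvalidData`); it is a standalone reconstruction for illustration, not the integration's actual source.

```python
import socket


class APSystemsInvalidData(Exception):
    """Stand-in for the exception class named in the traceback."""


def open_socket(ipaddr: str, port: int = 8899, timeout: float = 10.0) -> socket.socket:
    """Connect to the ECU, converting any socket error into APSystemsInvalidData."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect((ipaddr, port))
    except OSError as err:
        # Close the half-opened socket so a failed attempt leaks nothing.
        sock.close()
        # Raising here keeps the original error as __context__, which is why
        # the log contains both "timed out" tracebacks.
        raise APSystemsInvalidData(err)
    return sock
```

So the `APSystemsInvalidData: timed out` in the log is a plain network timeout reaching the ECU, wrapped in the integration's own exception type; both "timed out" and "[Errno 113] Host is unreachable" are `OSError` cases that would be caught by a handler of this shape.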

TheRealCryss commented 7 months ago

Update: I got the integration up again (without switching the ECU on and off again).

The reason seems to have been poor Wi-Fi connectivity. I checked my Fritzbox router and saw that 3 of the 4 Tasmota Wi-Fi power plugs also weren't connected to the network, and that the 2.4 GHz spectrum was between 90 and 100% busy! The ECU had a good downstream of about 36 Mbit but only 1 Mbit upstream. So I guess that was the reason why I couldn't test the connection with 'nc' even while the ECU was connected to my Wi-Fi.

Now I must see if the connection remains stable.

Firmware for my ECU-B is: ECU_B_1.2.27B

niouniou49 commented 7 months ago

Same with 1.2.30, even with the patch at line 88... I fell back to 1.2.29 for today. ECU-R-EU 216xxxx, version ECU_R_1.2.25B. Can I update the ECU? Is there a new firmware version?

CarlosGS commented 7 months ago

Thanks for giving it a try! Unfortunately, rolling our setup back to version 1.2.29 worked yesterday but didn't work today. It has been showing cached data since 2 AM.

Testing the nc command again shows Connection refused. I've physically restarted the ECU and it hasn't changed. Still, the ECU is connected to the internet (both lights ON) and the APSystem app works. Strange.

HAEdwin commented 7 months ago

@CarlosGS Just to make sure, did you assign a fixed IP-address to the ECU? Could it be that the DHCP server assigned a new IP-address making it unreachable on the old address? Are you able to ping the ECU?

niouniou49 commented 7 months ago

After I rolled back to 1.2.29, everything is working again. After 26 h with no Wi-Fi network interruption, all the data resumed when the sun came back. I can ping the fixed IP, and the result of the command 'nc -v 10.xxxxxx 8899' is 'Connection to 10.xxxxxx port 8899 [tcp/*] succeeded!'. I made a script to switch off the ECU queries at 6 pm and switch them on at 8:30 am, and then the data came back. There is something to investigate in the 1.2.30 release that broke the integration with the ECU-R-EU 216xxxx version.
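The script itself is not shown in this thread. As a rough illustration of the idea only, the daytime window could be expressed as a small check like the one below; the names `QUERY_START`, `QUERY_STOP`, and `queries_enabled` are hypothetical, and only the two times come from the comment above.

```python
from datetime import time

QUERY_START = time(8, 30)  # from the comment: switch queries on at 8:30 am
QUERY_STOP = time(18, 0)   # from the comment: switch queries off at 6 pm


def queries_enabled(now: time, start: time = QUERY_START,
                    stop: time = QUERY_STOP) -> bool:
    """Return True while `now` falls inside the daytime polling window.

    The start is inclusive and the stop is exclusive, so polling begins
    at 08:30:00 and the 18:00:00 tick is already outside the window.
    """
    return start <= now < stop
```

A scheduler or automation would call this with the current local time, for example `queries_enabled(datetime.now().time())`, and skip polling the ECU whenever it returns False.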

CarlosGS commented 7 months ago

@HAEdwin Yep, same IP. Ping works, but it refuses connections on port 8899. I've rebooted HASS and the ECU and it still does it.

@niouniou49 In case wifi affects this, is your ECU-R connected by wifi, or by cable? Mine is ECU-B (wifi).

niouniou49 commented 7 months ago

Mine is also an ECU_B via Wi-Fi, and it has been working very well for 48 h since I rolled back to 1.2.29. I have not tried cable, but I think cable doesn't work with the ECU integration. I just installed 1.2.30b and will let you know.

[Screenshot, 2023-12-03 15:09:46]

niouniou49 commented 7 months ago

The latest version, 1.2.30b, still works fine after 36 hours... connected via Wi-Fi to the ECU-R-EU 216xxxx version.

CarlosGS commented 6 months ago

This is our picture so far with 1.2.29. Maybe it was a coincidence when we downgraded around Nov 26th :confused:

[image] (Gray = Working well, Yellow = Cached, Dark = No data)

@niouniou49 If it says ECU-R-EU then it would be ECU-R? Our label shows ECU-B-EU 216xxxx, also connected via wifi.

niouflex49 commented 6 months ago

Hi, some news: 1.2.30b broke the connection again. Rolled back to .29.

HAEdwin commented 6 months ago

@niouflex49 @TheRealCryss @Robsonator @coldenvy @marcusds I updated the release, hope things work fine now - if not please let me know with a log extract while debug is on. Thank you all for testing 🙏

HAEdwin commented 6 months ago

In v1.2.30b I was a bit conservative with the changes by rolling them back. It teaches me to release and test carefully, and to allow sufficient time for feedback from the user group. I am closing this issue for now. Feel free to start a new issue based on your experiences with the last 1.2.30 release. (v1.2.30 is the official release of v1.2.30b for the non-beta testers among the users.)

marcusds commented 6 months ago

So far it seems to be working fine.