ksheumaker / homeassistant-apsystems_ecur

Home Assistant custom component for local querying of APSystems ECU-R Solar System
Apache License 2.0

Issue: ECU Query' data=b #42

Closed andrebbruno closed 2 years ago

andrebbruno commented 2 years ago

Hello - any idea what could be wrong? The integration has been working for a few days just fine after installation.

thanks!

This error originated from a custom integration.

Logger: custom_components.apsystems_ecur
Source: custom_components/apsystems_ecur/__init__.py:53
Integration: APSystems PV solar ECU-R (documentation)
First occurred: February 17, 2022, 11:45:09 PM (125 occurrences)
Last logged: 10:05:10 AM

Unexpected error fetching apsystems_ecur data: Error using cached data for more than 5 times.
Traceback (most recent call last):
  File "/config/custom_components/apsystems_ecur/APSystemsECUR.py", line 228, in check_ecu_checksum
    checksum = int(data[5:9])
ValueError: invalid literal for int() with base 10: b''

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/config/custom_components/apsystems_ecur/__init__.py", line 77, in update
    data = await self.ecu.async_query_ecu()
  File "/config/custom_components/apsystems_ecur/APSystemsECUR.py", line 112, in async_query_ecu
    self.process_ecu_data()
  File "/config/custom_components/apsystems_ecur/APSystemsECUR.py", line 254, in process_ecu_data
    self.check_ecu_checksum(data, "ECU Query")
  File "/config/custom_components/apsystems_ecur/APSystemsECUR.py", line 231, in check_ecu_checksum
    raise APSystemsInvalidData(f"Error getting checksum int from '{cmd}' data={debugdata}")
custom_components.apsystems_ecur.APSystemsECUR.APSystemsInvalidData: Error getting checksum int from 'ECU Query' data=b''

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/helpers/update_coordinator.py", line 187, in _async_refresh
    self.data = await self._async_update_data()
  File "/usr/src/homeassistant/homeassistant/helpers/update_coordinator.py", line 147, in _async_update_data
    return await self.update_method()
  File "/config/custom_components/apsystems_ecur/__init__.py", line 96, in update
    data = self.use_cached_data(msg)
  File "/config/custom_components/apsystems_ecur/__init__.py", line 53, in use_cached_data
    raise Exception(f"Error using cached data for more than {self.cache_max} times.")
Exception: Error using cached data for more than 5 times.

HAEdwin commented 2 years ago
File "/config/custom_components/apsystems_ecur/APSystemsECUR.py", line 228, in check_ecu_checksum
checksum = int(data[5:9])
ValueError: invalid literal for int() with base 10: b''

This is almost certainly a message from the ECU that was not expected and has a different format. There is nothing wrong with the received message itself, but it should be ignored because it cannot be interpreted. There must be a way to prevent this from happening. I'd suggest dropping this reading and using the cache instead. That needs some adjustments in the code.

The rest of the errors come from the cache code. I'd suggest dropping the cache count and just using the cache when needed.

So in this case it is bad luck that a message arrived from the ECU that should not have been interpreted. It probably does not even contain END\n.

Did the ECU hang after this? What firmware do you have?
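
The guard suggested above, rejecting a reply that cannot be interpreted before `int(data[5:9])` is reached, could look roughly like this. It is a minimal sketch, assuming that a valid reply ends with `END\n` and carries four ASCII digits in bytes 5 to 8 (as the slice in the traceback suggests). `InvalidReply` and `parse_checksum` are hypothetical names, not the integration's actual API.

```python
class InvalidReply(Exception):
    """Raised when an ECU reply cannot be interpreted (hypothetical name)."""


def parse_checksum(data: bytes) -> int:
    """Return the integer stored in bytes 5..8 of a reply, or raise InvalidReply."""
    # Assumption: every valid reply is terminated by END\n.
    if not data.endswith(b"END\n"):
        raise InvalidReply(f"reply does not end with END\\n: {data!r}")
    field = data[5:9]
    # An empty reply (b'') or a reply in another format has no 4-digit field here.
    if len(field) != 4 or not field.isdigit():
        raise InvalidReply(f"no numeric checksum field in reply: {data!r}")
    return int(field)
```

A caller can catch `InvalidReply` and fall back to the cached reading instead of letting a `ValueError` escape.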

andrebbruno commented 2 years ago

The system updated itself today and now HA cannot read the data; everything shows as unavailable.

image

ksheumaker commented 2 years ago

A bit of background. Once we get bad data from the ECU, which is what the data=b'' errors indicate, we pull from a cache of previous reads. We pull and report this cache 5 times in case the device is doing something and can't respond for a few seconds. At the default polling interval of 300 seconds that would give the ECU 25 minutes to start working again. If it doesn't then the sensors will all go unavailable since it can't get any data.

It will continue to try every 300 seconds, but if the ECU doesn't recover and start sending good data, it will stay unavailable. I don't think just pulling from cache forever is a correct answer, as it could go on for days if the ECU is in a bad state.

My suggestion is to try power cycling the ECU and see if it fixes anything. The 1.2.10 branch has different socket code that might be more stable, but we are still trying to figure that out. If you enable the option to show beta versions in HACS, you can try that one.
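
The fallback described above can be sketched roughly as follows. This is an illustration only, not the integration's code: `CachedPoller` and its attribute names are hypothetical, modelled loosely on the `cache_max` and `use_cached_data` names visible in the traceback.

```python
class CachedPoller:
    """Serve the last good reading for a limited number of failed polls."""

    def __init__(self, query, cache_max=5):
        self.query = query          # callable that returns data or raises
        self.cache_max = cache_max  # failed polls tolerated before giving up
        self.cache_count = 0
        self.cached_data = None

    def update(self):
        try:
            data = self.query()
        except Exception as err:
            return self._use_cached_data(err)
        # A good read refreshes the cache and resets the failure counter.
        self.cached_data = data
        self.cache_count = 0
        return data

    def _use_cached_data(self, err):
        self.cache_count += 1
        if self.cached_data is None:
            raise RuntimeError("No cached data available.") from err
        if self.cache_count > self.cache_max:
            raise RuntimeError(
                f"Error using cached data for more than {self.cache_max} times."
            ) from err
        return self.cached_data


# Grace period at the default settings: 5 polls x 300 s = 1500 s = 25 minutes.
GRACE_MINUTES = 5 * 300 / 60
```

Once the limit is exceeded, every further failed poll raises, which matches the behaviour described: the sensors stay unavailable until the ECU returns good data again.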

andrebbruno commented 2 years ago

Power cycling did the trick. I was afraid today's firmware upgrade was the main issue.

thanks!

ksheumaker commented 2 years ago

For my reference, what firmware does the ECU show in the integration?

andrebbruno commented 2 years ago

image

tv3 commented 2 years ago

@andrebbruno Maybe I can help. I have adapted Kyle's integration to work nicely with the ECU_R_PRO 2.0x firmware. It is still a beta version, and I need to clean it up and work with Kyle to merge it. But you can try my version at

https://github.com/tv3/homeassistant-apsystems_ecur

Copy the APSystemECU.py into your HA setup. Edit __init__.py and replace the async_query_ecu() call with http_query_ecu(). Restart HA, and reboot the ECU itself to give it a clean state.

ksheumaker commented 2 years ago

@tv3 This is great. I'll look at changing the configuration step, so that when you add the integration you can choose TCP or HTTP. Can you post the output of these URLs from your ECU somewhere so I can take a look?

        self.url_old_power_graph = "http://" + ipaddr + "/index.php/realtimedata/old_power_graph"
        self.url_realtimedata = "http://" + ipaddr + "/index.php/realtimedata"

I'd like to use it for testing, since my ECU doesn't have this web interface to test against. Can anyone confirm whether the web UI is the same for the ECU-C? I know it has a web interface as well.
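
For anyone who wants to capture that output, a small standalone script along these lines might do. It is a sketch under assumptions: it uses only the two URL paths quoted above, assumes both answer a plain HTTP GET (which is not confirmed in this thread), and `ecu_urls` and `dump_ecu_pages` are hypothetical helper names.

```python
from urllib.request import urlopen


def ecu_urls(ipaddr: str) -> dict:
    """Build the two web-interface URLs quoted above for a given ECU address."""
    base = "http://" + ipaddr
    return {
        "old_power_graph": base + "/index.php/realtimedata/old_power_graph",
        "realtimedata": base + "/index.php/realtimedata",
    }


def dump_ecu_pages(ipaddr: str, timeout: float = 10.0) -> dict:
    """Fetch each page with a GET request and return the raw body as text."""
    pages = {}
    for name, url in ecu_urls(ipaddr).items():
        with urlopen(url, timeout=timeout) as resp:
            pages[name] = resp.read().decode("utf-8", errors="replace")
    return pages
```

Calling `dump_ecu_pages("192.168.1.50")` (a placeholder address) returns a dictionary whose values can be saved to files and shared.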

andrebbruno commented 2 years ago

image

This is the interface for mine.

andrebbruno commented 2 years ago

Thank you ... it's a bit challenging for me, to be honest. Things are working now after the power cycle of the ECU ... but I promise to dig a bit more in case of issues.

tv3 commented 2 years ago

@andrebbruno No worries. We'll get a proper version available at some point. @ksheumaker I'll grab some of the raw HTML and API responses sometime this weekend for you to look at.

andrebbruno commented 2 years ago

Not sure what happened with today's update, but my energy dashboard now shows a negative value for my farm's entire lifetime production :( Is there any way to remove this data? It screwed up my entire database :(

image

andrebbruno commented 2 years ago

So the integration stopped working with yesterday's release. I tried copying the .py file into the current integration and editing the __init__.py file, but I could not find the line "async_query_ecu()" anywhere ... should I look for exactly this line or something like it?

Here's the issue that I have now ... the power cycle didn't do the trick this time:

image

tv3 commented 2 years ago

image

andrebbruno commented 2 years ago

Mine does not show "async def" in line 60 for some reason. Should I change it to match yours exactly? Should I remove the integration and add yours instead of simply copying the APSystemECU.py file into the integration folder? Also, I don't see "await self.ecu" etc.; I only see "data = self.ecu.query_ecu()".

image

tv3 commented 2 years ago

Replace self.ecu.query_ecu() with self.ecu.http_query_ecu().

Forget the line numbers.

andrebbruno commented 2 years ago

Sorry, that didn't work :(

image

Anything else I could try? Thanks for your patience :(

HAEdwin commented 2 years ago

Line 7 imports functions from the APSystemsSocket part. The same should be done for the http_query_ecu part, otherwise you can't call the function. http_query_ecu is now expected in APSystemsSocket, but it's not there. Keep line 7 in place and add a line to import the functions from the HTTP part, like so: from .APSystemsHTTP import http_query_ecu, etcetera

tv3 commented 2 years ago

@andrebbruno I am not sure which version of the integration you're using. Seems it's different from mine.

Copy all code and other files from my repo (like __init__.py etc.) to your HA folder. Then make the change replacing async_query_ecu and restart.

andrebbruno commented 2 years ago

Simply copying the code from your repo worked ... go figure.

dclobato commented 2 years ago

I'd like to use it for testing, since my ECU doesn't have this web interface to test against. Can anyone confirm if the web UI is the same for the ECU-C? I know it has a web-interface as well.

Yes. My ECU has the HTTP interface, but it is an ECU-R, not an ECU-C. It is running ECU_R_PRO_2.0.5017.

tv3 commented 2 years ago

@dclobato Then you could implement my solution. Copy the files in my repository to your HA aps installation.

https://github.com/tv3/homeassistant-apsystems_ecur/tree/main/custom_components/apsystems_ecur

Change __init__.py: on line 78, replace

        data = await self.ecu.async_query_ecu()

with

        data = await self.ecu.http_query_ecu()

dclobato commented 2 years ago

I did. It's "half working".

The ECU data is OK, but the individual inverters are all zero :-/ The log shows several messages like this one:

2022-02-24 16:29:26 ERROR (MainThread) [homeassistant] Error doing job: Task exception was never retrieved
Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/helpers/update_coordinator.py", line 134, in _handle_refresh_interval
    await self._async_refresh(log_failures=True, scheduled=True)
  File "/usr/src/homeassistant/homeassistant/helpers/update_coordinator.py", line 265, in _async_refresh
    update_callback()
  File "/usr/src/homeassistant/homeassistant/helpers/update_coordinator.py", line 325, in _handle_coordinator_update
    self.async_write_ha_state()
  File "/usr/src/homeassistant/homeassistant/helpers/entity.py", line 539, in async_write_ha_state
    self._async_write_ha_state()
  File "/usr/src/homeassistant/homeassistant/helpers/entity.py", line 572, in _async_write_ha_state
    state = self._stringify_state()
  File "/usr/src/homeassistant/homeassistant/helpers/entity.py", line 545, in _stringify_state
    if (state := self.state) is None:
  File "/config/custom_components/apsystems_ecur/sensor.py", line 131, in state
    return self.coordinator.data.get("inverters", {}).get(self._uid, {}).get("voltage", [])[0]
IndexError: list index out of range
2022-02-24 16:34:28 ERROR (MainThread) [homeassistant] Error doing job: Task exception was never retrieved
Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/helpers/update_coordinator.py", line 134, in _handle_refresh_interval
    await self._async_refresh(log_failures=True, scheduled=True)
  File "/usr/src/homeassistant/homeassistant/helpers/update_coordinator.py", line 265, in _async_refresh
    update_callback()
  File "/usr/src/homeassistant/homeassistant/helpers/update_coordinator.py", line 325, in _handle_coordinator_update
    self.async_write_ha_state()
  File "/usr/src/homeassistant/homeassistant/helpers/entity.py", line 539, in async_write_ha_state
    self._async_write_ha_state()
  File "/usr/src/homeassistant/homeassistant/helpers/entity.py", line 572, in _async_write_ha_state
    state = self._stringify_state()
  File "/usr/src/homeassistant/homeassistant/helpers/entity.py", line 545, in _stringify_state
    if (state := self.state) is None:
  File "/config/custom_components/apsystems_ecur/sensor.py", line 131, in state
    return self.coordinator.data.get("inverters", {}).get(self._uid, {}).get("voltage", [])[0]
IndexError: list index out of range
tv3 commented 2 years ago

Those errors are in code I did not change, as far as I can see at first look. Can you supply some info on the inverters (model, number) and a screenshot of the inverter webpage?

http://ecu_ip/index.php/realtimedata
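
The IndexError in the log above comes from indexing `[0]` into a list that is empty when no voltage value has been collected for an inverter. Independent of why the list is empty, the lookup itself can be made tolerant. A minimal sketch, assuming the same nested dictionary layout as the line in the traceback; `first_or_none` and `inverter_voltage` are hypothetical helpers, not part of the integration.

```python
def first_or_none(values):
    """Return the first element of a list, or None when it is empty or missing."""
    return values[0] if values else None


def inverter_voltage(data: dict, uid: str):
    """Look up an inverter's first voltage reading without raising IndexError."""
    inverter = data.get("inverters", {}).get(uid, {})
    return first_or_none(inverter.get("voltage", []))
```

Returning `None` lets the sensor report an unknown state instead of raising inside the state update.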

dclobato commented 2 years ago

There are three QS1-A. Here is the screenshot

Web capture_24-2-2022_17297_10 0 1 28

HAEdwin commented 2 years ago

@andrebbruno Does the 1.2.12 release on HACS solve your ECU Query' data=b issue? If so this issue can be closed.

andrebbruno commented 2 years ago

Things are looking good, thanks for checking! The only thing that still happens is that the ECU status becomes unavailable, even with the automation that switches querying on and off. So I'm manually shutting down the ECU and restarting HA for things to come back. I'm not sure why I need to force a restart of HA; it would be great if it could catch up without that.

HAEdwin commented 2 years ago

It is known that the ECU does some maintenance to check whether EMA holds all the data and to stay up to date with firmware updates. In my case it does this early in the morning, around 3 o'clock. The query fails five times, but it recovers after the maintenance and carries on for the rest of the 24 hours until the next maintenance. The ECU might sometimes become unavailable for responses, but it will respond again in the next interval.

These issues can be ignored and a reset should not be needed, so I let the integration query 24/7 without any problem.

image

So when does the ECU become unavailable, and how long does it run before it becomes unavailable and the only solution is a reset? By the way, you can add a smart plug to automate the hard reset of the ECU.
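
The decision logic behind such a smart-plug reset can be sketched as below. This is only an illustration: `EcuWatchdog` is a hypothetical name, and `power_cycle` stands for whatever actually switches the plug off and on (in Home Assistant that would be a service call, which is not shown here). The threshold is set above five so that the five failed queries during nightly maintenance mentioned above do not trigger a reset.

```python
class EcuWatchdog:
    """Trigger a power cycle after too many consecutive failed polls."""

    def __init__(self, power_cycle, max_failures=5):
        self.power_cycle = power_cycle    # callable that toggles the smart plug
        self.max_failures = max_failures  # failures tolerated without a reset
        self.failures = 0

    def record(self, ok: bool) -> bool:
        """Record one poll result; return True if a power cycle was triggered."""
        if ok:
            self.failures = 0
            return False
        self.failures += 1
        if self.failures > self.max_failures:
            self.power_cycle()
            self.failures = 0
            return True
        return False
```

With the default of five, the sixth consecutive failure is the first one that triggers the reset, and any successful poll clears the counter.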

andrebbruno commented 2 years ago

Today mine became unavailable after 8am for some reason, and the only solution is to power cycle the ECU and restart Home Assistant. I will get a smart plug for that, but it would be really nice if HA didn't need to be restarted.

thanks :)

HAEdwin commented 2 years ago

OK, let's see if there's a pattern and wait to see whether others with ECU-R pro firmware have the same problem. I think in the meantime this issue can be closed, since it was solved with the 1.2.12 release. You can open a new issue; you might as well include the tv3 option, but "one fits all" would be nicer.

HAEdwin commented 2 years ago

@andrebbruno Please try whether #71 solves your problem and let me know if it does, so I can fix it and open a pull request.

andrebbruno commented 2 years ago

Hi! The ECU became unavailable yesterday, so after the latest update I rebooted it, and HA picked it up again without a restart ... that's very good! I hope it doesn't become unavailable too frequently, but not having to reboot HA is a great thing!

HAEdwin commented 2 years ago

Thanks for the comment and feedback. Actually, a complete and time-consuming reboot of the host is not necessary; a restart will do just fine (Configuration>Settings>Restart), in case you didn't already use this method.

HAEdwin commented 2 years ago

@andrebbruno I hope the 1.2.13 release will fix the issue where the ECU-R becomes unavailable after a while. This release includes #71. Many thanks to @ksheumaker for updating.

HAEdwin commented 2 years ago

@andrebbruno I'd recommend my repository at https://github.com/HAEdwin/homeassistant-apsystems_ecur and use that version of APSystemsSocket.py to make sure the initial issue is solved.

Unfortunately, the later ECU-R models are less suitable for continuous querying; that also applies to the ECU-C. After many attempts, and without being able to troubleshoot it myself for lack of an ECU-R/ECU-C with the SunSpec logo, I gave up. It is a firmware issue. Please consider closing this issue, as I cannot moderate it.

HAEdwin commented 2 years ago

@andrebbruno The issue ECU Query' data=b should be solved with the latest release. Christiaan is working on a method to software-reset the ECU when it becomes unavailable. If it works, I might add it to the integration. If you have any new problems, please open a new issue.