svalouch / python-rctclient

Python client for RCTs Serial Communication Protocol
https://rctclient.readthedocs.io
GNU General Public License v3.0
46 stars 27 forks source link

rctclient.exceptions.InvalidCommand: 43 is not a valid Command #16

Open stephanritscher opened 2 years ago

stephanritscher commented 2 years ago

Hi,

I exported some data from my RCT power using timeseries2csv.py and encountered the following error sporadically:

Traceback (most recent call last):
  File "/home/stephan/python-rctclient/tools/./timeseries2csv.py", line 333, in <module>
    timeseries2csv()
  File "/usr/lib/python3.10/site-packages/click/core.py", line 1137, in __call__
    return self.main(*args, **kwargs)
  File "/usr/lib/python3.10/site-packages/click/core.py", line 1062, in main
    rv = self.invoke(ctx)
  File "/usr/lib/python3.10/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/lib/python3.10/site-packages/click/core.py", line 763, in invoke
    return __callback(*args, **kwargs)
  File "/home/stephan/python-rctclient/tools/./timeseries2csv.py", line 226, in timeseries2csv
    rframe.consume(buf)
  File "/home/stephan/.local/lib/python3.10/site-packages/rctclient/frame.py", line 427, in consume
    raise InvalidCommand(str(exc), cmd, i) from exc
rctclient.exceptions.InvalidCommand: 43 is not a valid Command

I noticed that I received Got unexpected frame oid ... much more often when on I open the RCT app so it might be that handling of multiple connections is limited on the RCT power generator side. In parallel, also an instance of https://github.com/lauff/ioBroker.rct is running, so the invalid command might also be some kind of interaction. The RCT power has the newest firmware installed.

Best regards, Stephan

weltenwort commented 2 years ago

The multiplexing capabilities of the inverter firmware seems very limited indeed. It seems every client is receiving the responses to every other connected client. While writing https://github.com/weltenwort/home-assistant-rct-power-integration based on this awesome library I had to work around that as well.

svalouch commented 1 year ago

Yea, make sure to have the latest firmware. Older ones (especially around 2020) were really broken and could send you one client's request in a response to your (unrelated) query even for differing IDs. Depending on the settings, the inverters also send unsolicited frames that can trigger bugs. I guess I have to fix up the timeseries-script, but not sure when I'll get to around to touch that (I really don't want to). PRs welcome as always.

matzeruegge commented 9 months ago

Similar issue with timeseries2csv.py here. Firmware 2.3.5673. The cmd

./timeseries2csv.py --header-format simple --host 192.168.201.128 --resolution minutes 1

mostly (80%) ends up with different exceptions:

BaboshkaGit commented 9 months ago

I am having the same issue as @matzeruegge with Firmware 2.3.5673. I am not able to read anything from the inverter.

42Bastian commented 9 months ago

Not using timeseries2csv.py but rctclient and it works as before with 2.3.5673 here.

amokinside commented 9 months ago

same issue here.....This error occurs every 4-5 calls.......since update to 2.3.5673

openhabian@openhabian:~/.local/bin $ ./rctclient read-value --host 192.168.1.41 --id 0x0x400F015B Frame complete, but buffer still contains 696 bytes Received unexpected frame, ID is 0x5570401B, expected 0x400F015B

matzeruegge commented 9 months ago

I made changes to the exception handling in frame.py without accounting for potential side effects. Specifically, I replaced the original line 427, "raise InvalidCommand(str(exc), cmd, i) from exc," with the line "pass." Consequently, CSV files are now being created.

42Bastian commented 9 months ago

same issue here.....This error occurs every 4-5 calls.......since update to 2.3.5673

openhabian@openhabian:~/.local/bin $ ./rctclient read-value --host 192.168.1.41 --id 0x0x400F015B Frame complete, but buffer still contains 696 bytes Received unexpected frame, ID is 0x5570401B, expected 0x400F015B

I see these messages only if the Android or iOS app is also running. The FW is not good at multiple accesses.

sfudeus commented 9 months ago

2.3.5673 really seems to be a regression here. rctmon is running fine for me, but subjectively, the RCT app has gotten a lot slower showing recent values on startup.

codingandmore commented 7 months ago

The issue here is that the inverter seems to be by default in the "READ PEROIDICALLY" mode. I am not sure if this is new since 2.3.573. It sends every 30s a set of frames for predefined oids, even if you do not send any command at all to the inverter. If you send the command pas.period the inverter answers with 30. The frames arrive in multiple chunks spread around the 30s interval. I did not find an easy way to deal with this. The easiest way might be to just work with the receiving data. But this limits you to 30s intervals and the predefined oids. If you send additional commands this works sometimes but not reliable. After a couple of times it interferes with the periodically sent data and gets a hickup. I also tried to make a select() call before sending commands to see if data are available and send a command only when there are none. But this did not help. The best solution I found is to set a short timeout of e.g. 3s and retry if there is a timeout.

Probably it is better to reset the READ PEROIDICALLY as documented by setting pas.periodto 0. I did not try this approach so far as it might be a bit risky. I am not sure if this has unwanted side effects. Does e.g the app rely on these data?

There are some more strange findings. The set of oids is not equal in each chunk. Every other chunk contains a `net.slave_data'. And there seems to be an end-of-series marker 0x1B 0x1B 0x1B. The list of oids sent periodically is:

'battery.stored_energy',
'battery.used_energy',
'battery.voltage',
'battery.bms_software_version',
'battery.bms_sn',
'g_sync.p_acc_lp',
'battery.soc',
'battery.temperature',
'dc_conv.dc_conv_struct[0].u_sg_lp',
'dc_conv.dc_conv_struct[1].u_sg_lp',
'dc_conv.dc_conv_struct[0].p_dc_lp',
'dc_conv.dc_conv_struct[1].p_dc_lp',
'energy.e_ext_total',
'fault[0].flt',
'fault[1].flt',
'fault[2].flt',
'fault[3].flt',
'buf_v_control.power_reduction_max_solar_grid',
'energy.e_grid_feed_total',
'energy.e_grid_load_total',
'g_sync.p_ac_load_sum_lp',
'energy.e_load_total',
'g_sync.i_dr_eff[0]',
'g_sync.i_dr_eff[1]',
'g_sync.i_dr_eff[2]',
'g_sync.u_l_rms[0]',
'g_sync.u_l_rms[1]',
'g_sync.u_l_rms[2]',
'g_sync.p_ac_grid_sum_lp',
'grid_pll[0].f',
'energy.e_dc_total[0]',
'energy.e_dc_total[1]',
'energy.e_ac_total',
'io_board.s0_external_power',
'energy.e_ac_day',
'battery.stack_software_version[0]',
'battery.stack_software_version[1]',
'battery.stack_software_version[2]',
'battery.stack_software_version[3]',
'battery.stack_software_version[4]',
'battery.stack_software_version[5]',
'battery.stack_software_version[6]',
'energy.e_dc_day[0]',
'energy.e_dc_day[1]',
'energy.e_ext_day',
'energy.e_grid_feed_day',
'energy.e_grid_load_day',
'energy.e_load_day',
'prim_sm.island_flag',
'parameter_file',
'net.slave_data'

I fear there is no easy fix with the existing parser. I started working on a new reader/parser tolerating frames not being requested and filtering out only those requested.

If someone has taken the risk to reset the series by setting pas.periodto 0 please share the results here if this prevents getting unrequested frames and makes the socket "silent".

42Bastian commented 7 months ago

Since the protocol is based on TCP there must be some client in the network which reads these values constantly. Did you try to find out to whom the inverter is sending the data?

codingandmore commented 7 months ago

I did not do any tests so far, but I assume that as soon as any client connects to port 8899 the server will send the frames. I fear that the READ_PERIODICAL command is a global setting per device and not a setting per connection as one would expect. You could try to open two connections in parallel to verify this. I have no idea whats the purpose of this and if there is an intended consumer. It would also be interesting if this was introduced by the latest firmware.

SaxonWood commented 7 months ago

The issue here is that the inverter seems to be by default in the "READ PEROIDICALLY" mode. I am not sure if this is new since 2.3.573. It sends every 30s a set of frames for predefined oids, even if you do not send any command at all to the inverter. If you send the command pas.period the inverter answers with 30. The frames arrive in multiple chunks spread around the 30s interval. I did not find an easy way to deal with this. The easiest way might be to just work with the receiving data. But this limits you to 30s intervals and the predefined oids. If you send additional commands this works sometimes but not reliable. After a couple of times it interferes with the periodically sent data and gets a hickup. I also tried to make a select() call before sending commands to see if data are available and send a command only when there are none. But this did not help. The best solution I found is to set a short timeout of e.g. 3s and retry if there is a timeout.

Probably it is better to reset the READ PEROIDICALLY as documented by setting pas.periodto 0. I did not try this approach so far as it might be a bit risky. I am not sure if this has unwanted side effects. Does e.g the app rely on these data?

There are some more strange findings. The set of oids is not equal in each chunk. Every other chunk contains a `net.slave_data'. And there seems to be an end-of-series marker 0x1B 0x1B 0x1B. The list of oids sent periodically is:

'battery.stored_energy',
'battery.used_energy',
'battery.voltage',
'battery.bms_software_version',
'battery.bms_sn',
'g_sync.p_acc_lp',
'battery.soc',
'battery.temperature',
'dc_conv.dc_conv_struct[0].u_sg_lp',
'dc_conv.dc_conv_struct[1].u_sg_lp',
'dc_conv.dc_conv_struct[0].p_dc_lp',
'dc_conv.dc_conv_struct[1].p_dc_lp',
'energy.e_ext_total',
'fault[0].flt',
'fault[1].flt',
'fault[2].flt',
'fault[3].flt',
'buf_v_control.power_reduction_max_solar_grid',
'energy.e_grid_feed_total',
'energy.e_grid_load_total',
'g_sync.p_ac_load_sum_lp',
'energy.e_load_total',
'g_sync.i_dr_eff[0]',
'g_sync.i_dr_eff[1]',
'g_sync.i_dr_eff[2]',
'g_sync.u_l_rms[0]',
'g_sync.u_l_rms[1]',
'g_sync.u_l_rms[2]',
'g_sync.p_ac_grid_sum_lp',
'grid_pll[0].f',
'energy.e_dc_total[0]',
'energy.e_dc_total[1]',
'energy.e_ac_total',
'io_board.s0_external_power',
'energy.e_ac_day',
'battery.stack_software_version[0]',
'battery.stack_software_version[1]',
'battery.stack_software_version[2]',
'battery.stack_software_version[3]',
'battery.stack_software_version[4]',
'battery.stack_software_version[5]',
'battery.stack_software_version[6]',
'energy.e_dc_day[0]',
'energy.e_dc_day[1]',
'energy.e_ext_day',
'energy.e_grid_feed_day',
'energy.e_grid_load_day',
'energy.e_load_day',
'prim_sm.island_flag',
'parameter_file',
'net.slave_data'

I fear there is no easy fix with the existing parser. I started working on a new reader/parser tolerating frames not being requested and filtering out only those requested.

If someone has taken the risk to reset the series by setting pas.periodto 0 please share the results here if this prevents getting unrequested frames and makes the socket "silent".

I cannot confirm this for my RCT 10.0, same SW version 2.3.5673.

Environment: strictly one client via LAN only; Wifi off; service portal off; RCT portal off. And: 'pas.period' set to zero (0) and then back to default=30, i.e. no periodic reads should be active.

Test 1: When I open a TCP conn to the inverter and do nothing (no read nor write, some regular TCP keep alives only) -> no unsolicited frame is sent from the inverter. Test 2: When I open a TCP conn to the inverter and then read some 9 OIDs within the very same connection, the matching OID responses come and also after that - while keeping the TCP connection established with TCP keep alives - the inverter does not send any unsolicited frame for 2 minutes (then I stopped the test).

But yes, the 'invalid command' and numerous timeouts (no response for read cmds and timeseries) are also present in my regular setup. For the time being, I hide this with read retries.

42Bastian commented 7 months ago

I have 2.3.5674 on the 10.0 inverter and see pas.period == 30. I only see errors when I access the inverter at same time via App. I increased the retry max. to 5 and leave my logging script if any reading timeouts.

codingandmore commented 7 months ago

@SaxonWood Interesting! My app is not active and the portal not open (means no browser has a connection, the inverter still has access to the internet). So may be this is some special state on my side only. Not sure what could have caused this then. May be I should try to set to pas.period to 0.

@42Bastian Yes I also catch timeouts and do a retry as a workaround. This seems to work but is not very nice.

if the app sets pas.period on each start it would be safe to turn this off. Needs more testing.

42Bastian commented 7 months ago

Maybe not related, but since I set the router port the inverter is connected to to 100MB, I see fewer errors.

SaxonWood commented 6 months ago

As we all know, the inverter had changed its behaviour with firmware version 2.3.5673. The "invalid command" comes from the fact that from time to time the inverter either:

......2d2bxx......002b..... -> the first "2b" not a valid frame start because 2d must not preceed 2b to be a START_TOKEN. If decoded with the current rctclient 'consume' fct, this give "invalid command xx" (xx likely in its decimal notation)

......2b2b........002b.... -> the first "2b" is not a valid frame start either. If decoded with the rctclient 'consume' fct, this give "invalid command 43", because 2b (hex) = 43 (dec).

To overcome this, I'm testing a modified version of the consume function with a new optional "strict" option, where the START_TOKEN must be preceeded with "00". Note that the preceeding "00" is not generally there, but normally in responses from the inverter. Decoding of requests to the inverter etc. should not expect the "00" before the "2b". So far, the "invalid command" errors is gone. However, this does not resolve yet the timeouts as such! I may share the new 'consume' function after some more testing.

Other suggestions welcome!

codingandmore commented 6 months ago

@SaxonWood I have encountered similar issues. I have written a new package parser and a reader sample application based on the code of this repo here. It does not address all the issues but I can read data over a couple of weeks now. This code is work-in-progress and under development. Feel free to take a look at this. The code is here. Example usage of the parser is here. See readme for improvements. If this is stabilized we could make PR if there is interest.