Closed Ivan-L closed 2 years ago
Hi @Ivan-L, this still seems related to the USR-W630 (KeyError: 0 from umodbus)
https://github.com/kellerza/sunsynk/issues/18#issuecomment-1047524344
Hi @kellerza
I've been trying to debug this one and it was difficult. You are correct, this may well be related to the USR-W630, but I do not think it's the exact same issue as in #18. I have a suspicion that what is causing it is the number of registers being requested to be read or the response payload size being chunked/truncated somehow.
I can reproduce the issue using the modbus cli:
modbus -s 1 -v 10.0.1.66:8899 250 251 252 253 254 255 256 257 258 259 260 261
A successful request/response will yield:
→ < 56 47 00 00 00 06 01 03 00 fa 00 0c >
← < 56 47 00 00 00 1b 01 03 18 00 64 02 76 03 3e 05 fa 07 08 08 98 13 88 13 88 13 88 13 88 13 88 13 88 > 33 bytes
An unsuccessful request/response (which happens more often than the successful one) yields:
→ < fc de 00 00 00 06 01 03 00 fa 00 0c >
← < fc de 00 00 00 0e 01 03 18 00 64 02 76 03 3e 05 fa 07 08 08 > 20 bytes
Notice that in the unsuccessful case, the response is shorter: 20 bytes instead of 33.
Let's decode and analyse the successful request/response first:
Request:
MBAP (Transaction Id, Protocol, Length, Unit Id): (22087, 0, 6, 1)
PDU (Function code, Start address, Quantity): (3, 250, 12)
Response:
MBAP (Transaction Id, Protocol, Length, Unit Id): (22087, 0, 27, 1)
PDU (Function code, Byte count): (3, 24)
Values: (100, 630, 830, 1530, 1800, 2200, 5000, 5000, 5000, 5000, 5000, 5000)
Now for the unsuccessful request/response:
Request:
MBAP (Transaction Id, Protocol, Length, Unit Id): (64734, 0, 6, 1)
PDU (Function code, Start address, Quantity): (3, 250, 12)
Response:
MBAP (Transaction Id, Protocol, Length, Unit Id): (64734, 0, 14, 1)
PDU (Function code, Byte count): (3, 24)
Values: (100, 630, 830, 1530, 1800, ..truncated!
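The response PDU is truncated, but the response MBAP header correctly reflects the truncated PDU's size: the MBAP length field says 14 bytes, which is exactly what follows it, while the PDU byte count still claims 24 bytes of register data. In other words, the "envelope" is correct, but the "body" (what's inside) is truncated in the response.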
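For anyone who wants to reproduce the decoding above, here is a minimal sketch using only the standard library `struct` module. `decode_response` is a hypothetical helper name, not part of sunsynk or umodbus, and it assumes a function code 3 response captured as a full Modbus TCP frame.

```python
import struct


def decode_response(frame: bytes) -> dict:
    """Decode a Modbus TCP function-3 response and flag a truncated payload."""
    # MBAP header: transaction id, protocol id, length (16-bit big-endian), unit id
    tid, proto, length, unit = struct.unpack(">HHHB", frame[:7])
    # PDU starts with the function code and the declared byte count
    function, byte_count = struct.unpack(">BB", frame[7:9])
    data = frame[9:]
    # Decode only complete 16-bit registers; a trailing odd byte is ignored
    n = len(data) // 2
    values = struct.unpack(f">{n}H", data[: n * 2])
    return {
        "mbap": (tid, proto, length, unit),
        "pdu": (function, byte_count),
        "values": values,
        # The MBAP length counts every byte after the length field itself
        "mbap_length_ok": length == len(frame) - 6,
        # Truncated when fewer data bytes arrived than the byte count declares
        "truncated": len(data) < byte_count,
    }


good = bytes.fromhex(
    "56 47 00 00 00 1b 01 03 18 00 64 02 76 03 3e 05 fa 07 08 08 98"
    " 13 88 13 88 13 88 13 88 13 88 13 88"
)
bad = bytes.fromhex("fc de 00 00 00 0e 01 03 18 00 64 02 76 03 3e 05 fa 07 08 08")

for name, frame in (("good", good), ("bad", bad)):
    print(name, decode_response(frame))
```

For the short frame this reports `mbap_length_ok` as true and `truncated` as true, which is the "correct envelope, truncated body" situation described above.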
If I reduce the number of registers being read from 12 to 8 or fewer, I do not get this error at all and every read succeeds. As soon as I read 9 registers or more, the error occurs randomly, and when it does occur, the response PDU payload appears to be truncated at exactly the same place. The System Mode profile of the add-on does require all 12 registers to be read, though I think they can be read in batches.
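To make the batching idea concrete, here is a rough sketch of splitting one contiguous read into smaller requests, together with the size a complete response should have. `expected_response_len` and `split_read` are hypothetical names used only for illustration; the frame-size arithmetic assumes a function code 3 response over Modbus TCP.

```python
def expected_response_len(quantity: int) -> int:
    """Bytes in a complete Modbus TCP response to a function-3 read."""
    # 7-byte MBAP header + function code + byte count + 2 bytes per register
    return 7 + 1 + 1 + 2 * quantity


def split_read(start: int, quantity: int, max_size: int) -> list[tuple[int, int]]:
    """Split one read into (start, quantity) requests of at most max_size registers."""
    if max_size < 1:
        raise ValueError("max_size must be at least 1")
    end = start + quantity
    return [(adr, min(max_size, end - adr)) for adr in range(start, end, max_size)]


print(expected_response_len(12))  # 33, matching the successful response above
print(split_read(250, 12, 8))     # [(250, 8), (258, 4)]
```

With a limit of 8, the 12 System Mode registers would be fetched as one read of 8 registers followed by one read of 4.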
I have modified the code locally to place a limit on how many registers can be read at a time.
The group_sensors function would gain a max_group_size parameter:
def group_sensors(
    sensors: Sequence[Sensor],
    allow_gap: int = 3,
    max_group_size: int = 60,
) -> Generator[list[int], None, None]:
    """Group sensor registers into blocks for reading."""
    if not sensors:
        return
    # Collect the unique register addresses used by all sensors
    regs = {r for s in sensors for r in s.reg_address}
    group: list[int] = []
    adr0 = 0
    for adr1 in sorted(regs):
        # Start a new group when the gap is too large or the group is full
        if group and (adr1 - adr0 > allow_gap or len(group) >= max_group_size):
            yield group
            group = []
        adr0 = adr1
        group.append(adr1)
    if group:
        yield group
It can then be invoked with that parameter from read_sensors:
for grp in group_sensors(sensors, allow_gap=1, max_group_size=8):
A possible test could be:
def test_group_limit() -> None:
    sen = [
        Sensor(10, "10"),
        Sensor(11, "11"),
        Sensor(12, "12"),
        Sensor(13, "13"),
        Sensor(14, "14"),
        Sensor(15, "15"),
        Sensor(16, "16"),
    ]
    g = list(group_sensors(sen, max_group_size=6))
    assert g == [[10, 11, 12, 13, 14, 15], [16]]
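The test above only exercises the size limit. As a sketch of how the limit interacts with allow_gap, here is the same grouping loop applied to plain register addresses. `group_registers` is a hypothetical stand-in so that the snippet runs without the sunsynk package; it is not a function in the library.

```python
from typing import Iterable, Iterator


def group_registers(
    regs: Iterable[int], allow_gap: int = 3, max_group_size: int = 60
) -> Iterator[list[int]]:
    """Same loop as group_sensors, but taking register addresses directly."""
    group: list[int] = []
    adr0 = 0
    for adr1 in sorted(set(regs)):
        # Split on a gap larger than allow_gap or when the group is full
        if group and (adr1 - adr0 > allow_gap or len(group) >= max_group_size):
            yield group
            group = []
        adr0 = adr1
        group.append(adr1)
    if group:
        yield group


# Size limit and gap both trigger a split
mixed = list(group_registers([10, 11, 12, 13, 20, 21, 22], allow_gap=1, max_group_size=3))
print(mixed)  # [[10, 11, 12], [13], [20, 21, 22]]

# The 12 registers starting at 250, with the limit of 8 used in read_sensors
system_mode = list(group_registers(range(250, 262), allow_gap=1, max_group_size=8))
print([len(g) for g in system_mode])  # [8, 4]
```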
Unfortunately, apart from the above pytest succeeding locally, I cannot seem to build the add-on with my locally modified version of the sunsynk package. The hass-addon-sunsynk-dev Dockerfile has a hardcoded dependency on version 0.1.4 of the sunsynk package. Any idea how I can test my changes locally before doing a PR?
Actually, before I even attempt a PR, would you accept such a PR? Or do you think the max group size should be configurable and be an option in config.yaml?
I figured out how to test locally by manually applying what copy2local.cmd does.
I confirm that limiting the add-on to reading a maximum of 8 registers at a time solves the above problem - the issue has not reoccurred even once and I have restarted the add-on many times.
So my above question still stands - do you think this should be a setting in config.yaml with some high default value?
@Ivan-L thanks for looking into this. Will surely accept a PR.
An option with a high default (50?) would make sense. It should be fairly easy to add once the library supports it.
Sorry for the delay @kellerza, I got sidetracked by a lot of other stuff. The PR is up.
Firstly, thanks again for a great add-on! The profiles feature is extremely useful!
Describe the issue/bug
On startup, the add-on (I'm currently using the dev version) refreshes the system_mode profile that I have opted into. The below error occurs, causing the add-on to crash and restart. This happens several times, perhaps 7 or 8 times, until one time the system_mode update will succeed and then all is well until the next restart. This causes startup to essentially be delayed by several minutes, but eventually the add-on does start up.
I am using a USR-W630 via TCP with the umodbus driver, so that may contribute to the issue.
I will perform more experiments and post additional info as I receive it.
Expected behavior
The add-on should start up the first time without the exception being thrown.
Logs