john30 / ebusd-esp32

Firmware for ESP32-C3 allowing eBUS communication for ebusd (https://github.com/john30/ebusd)
https://adapter.ebusd.eu/v5
23 stars 1 forks source link

Adapter v5 comm errors #51

Closed jabdoa2 closed 3 months ago

jabdoa2 commented 4 months ago

I am a proud owner a new ebus Adapter v5. Setup has been a breeze and everything started working somehow. I connected my Vailant ecoTEC plus heater (WW not connected). Most of the time when I connect ebusd (via Wifi) it works just fine. However, I experience a lot of errors such as: ERR: wrong symbol received, retry, device status: eBUS comm error: overrun or arbitration start error. Sometimes, ebus also fails to discover my heater due to those errors. 90% of the time it works. The heater is connected via a 2m CAT5e cable (green twisted pair) and there is only the adapter and the heater on the bus.

I am not sure how to approach debugging this. I already tried:

Verifying Wifi:

ping -i 0.1 10.10.0.230
# [...]
188 packets transmitted, 188 received, 0% packet loss, time 18856ms
rtt min/avg/max/mdev = 0.984/7.790/255.999/28.702 ms, pipe 3

Some spikes to up to 100ms but that is expected from Wifi. It uses the nearest access point. No loss.

Setting EBUDS_OPTS:

EBUSD_OPTS="--scanconfig -d ens:10.10.0.230:9999 --latency=20000 --receivetimeout=100000 --logareas bus --loglevel info --scanconfig=full"

I also tried less aggressive settings for latency but it does not seem to make any difference.

Checking baud and delay:

execute: ebus -v
init_ebus: already switched to enhanced eBUS mode on TCP port. updating verbosity.
init_ebus: clock frq 80000000 div 33 1/3, src clk 4
init_ebus: detected: 2426 Hz on 14 edges with 1979 +/1979 - edge width, 961 H/1016 L pulse width, i.e. 400 us H/423 us L period
init_ebus: master 8 0x31 delay=0 us [-4-0]

Seems ok to me. This is generally either 2424 Hz or 2426 Hz. Here I reduced the delay in the adapter. Otherwise it will usually show 196us.

Not sure if that matter but when I experience issues and rerun ebus -v it looks quite different:

execute: ebus -v
init_ebus: already switched to enhanced eBUS mode on TCP port.
init_ebus: clock frq 80000000 div 33 1/3, src clk 4
init_ebus: detected: 4761 Hz on 1023 edges with 1088 +/1008 - edge width, 15 H/992 L pulse width, i.e. 6 us H/413 us L period
init_ebus: master 8 0x31 delay=-4 us [-4-4]

Again delay is super low because I experimented with it. Did not change anything. Otherwise it would should something around 160-200.

I enabled raw logging and ebus prints this log:

2024-03-10 20:57:28.011 [bus info] send message: 3108b509030d1800
2024-03-10 20:57:28.134 [bus notice] >3108b5
2024-03-10 20:57:28.134 [bus error] send to 08: ERR: SYN received, retry
2024-03-10 20:57:28.264 [bus error] send to 08: ERR: wrong symbol received, retry
2024-03-10 20:57:28.844 [bus notice] >3108b5<a1
2024-03-10 20:57:29.054 [bus error] device status: eBUS comm error: overrun
2024-03-10 20:57:29.054 [bus notice] >3108b509030d180036<00031a0300b3>00
2024-03-10 20:57:31.060 [bus info] send message: 3108b509030d1800
2024-03-10 20:57:31.164 [bus error] send to 08: ERR: wrong symbol received, retry
2024-03-10 20:57:31.744 [bus notice] >3108b5<11ff
2024-03-10 20:57:31.829 [bus error] send to 08: ERR: wrong symbol received, retry
2024-03-10 20:57:32.414 [bus notice] >3108b5<14
2024-03-10 20:57:32.454 [bus notice] >3108b5
2024-03-10 20:57:32.454 [bus error] send to 08: ERR: SYN received
2024-03-10 20:57:32.454 [bus error] send message part 0: ERR: SYN received
2024-03-10 21:08:23.086 [bus info] send message: 3108b509030d1800
2024-03-10 21:08:23.216 [bus notice] >3108b5
2024-03-10 21:08:23.217 [bus error] send to 08: ERR: SYN received, retry
2024-03-10 21:08:23.346 [bus error] send to 08: ERR: wrong symbol received, retry
2024-03-10 21:08:23.927 [bus notice] >3108b5<35f5
2024-03-10 21:08:24.049 [bus error] device status: eBUS comm error: overrun
2024-03-10 21:08:24.121 [bus notice] >3108b509030d180036<0003e8020067>00
2024-03-10 21:08:26.126 [bus info] send message: 3108b509030d1800
2024-03-10 21:08:26.226 [bus error] send to 08: ERR: wrong symbol received, retry
2024-03-10 21:08:26.806 [bus notice] >3108b5<05fd
2024-03-10 21:08:26.896 [bus error] send to 08: ERR: wrong symbol received, retry
2024-03-10 21:08:27.476 [bus notice] >3108b5<14
2024-03-10 21:08:27.667 [bus notice] >3108b509030d180036<0003e60200a3>00
2024-03-10 21:08:29.672 [bus info] send message: 3108b509030d1800
2024-03-10 21:08:29.947 [bus notice] >3108b509030d180036<0003e60200a3>00
2024-03-10 21:08:31.952 [bus info] send message: 3108b509030d1800
2024-03-10 21:08:32.056 [bus error] send to 08: ERR: wrong symbol received, retry
2024-03-10 21:08:32.637 [bus notice] >3108b5<05fd
2024-03-10 21:08:32.812 [bus info] send/receive symbol latency 8 - 55 ms
2024-03-10 21:08:32.825 [bus notice] >3108b509030d180036<0003e402008f>00

Not sure if that helps.

Probably unrelated: bai Status complains (and never succeeds unlike other commands):

$ ebusctl read -m 1 -c bai Status
ERR: invalid position in decode

Log:

2024-03-10 21:10:57.096 [bus info] send message: 3108b5110103
2024-03-10 21:10:57.197 [bus error] send to 08: ERR: wrong symbol received, retry
2024-03-10 21:10:57.777 [bus notice] >3108b5<35f5
2024-03-10 21:10:57.987 [bus notice] >3108b511010324<000000>00
2024-03-10 21:11:33.592 [bus info] send message: 3108b5110103
2024-03-10 21:11:33.697 [bus error] send to 08: ERR: wrong symbol received, retry
2024-03-10 21:11:34.277 [bus notice] >3108b5<35f5
2024-03-10 21:11:34.369 [bus error] send to 08: ERR: wrong symbol received, retry
2024-03-10 21:11:34.947 [bus notice] >3108b5<05fd
2024-03-10 21:11:38.538 [bus info] send message: 3108b5110103
2024-03-10 21:11:38.657 [bus error] send to 08: ERR: wrong symbol received, retry
2024-03-10 21:11:39.447 [bus notice] >3108b5<35f5
2024-03-10 21:11:39.547 [bus error] send to 08: ERR: wrong symbol received, retry
2024-03-10 21:11:40.127 [bus notice] >3108b5<a5ff
2024-03-10 21:11:40.287 [bus notice] >3108b511010324<000000>00

Any idea what I should try? Does this look cable related? Anything else?

jabdoa2 commented 4 months ago

Looks like the recent firmware upgrade to 20240317 fixed it. Not sure why but those issues are now gone.

john30 commented 4 months ago

which one did you use before @jabdoa2 ?

jabdoa2 commented 4 months ago

which one did you use before @jabdoa2 ?

The one which has been pre-flashed: 2024-01-06.

There are still occasional transfer errors but rare enough (every few seconds) so that it does not matter much in practice.

john30 commented 3 months ago

maybe give it another try with the recently updated release

jabdoa2 commented 3 months ago

maybe give it another try with the recently updated release

I will. Thanks for your work :-).

jabdoa2 commented 3 months ago

2024-03-30/1[431e] looks very good. All warnings and errors seem to be gone. Thanks!

jabdoa2 commented 3 months ago

I checked again and I now see this error:

2024-03-31 22:18:22.240 [bus error] poll bai PartloadHcKW failed: ERR: SYN received
2024-03-31 22:18:28.213 [update notice] sent poll-read bai PrEnergyCountHc1 QQ=31: 25265
2024-03-31 22:18:34.354 [update notice] sent poll-read bai PumpHours QQ=31: 2255
2024-03-31 22:18:40.378 [bus notice] arbitration won in invalid state skip
2024-03-31 22:18:40.591 [bus error] poll bai Status01 failed: ERR: SYN received
2024-03-31 22:18:40.686 [bus notice] arbitration won in invalid state ready
2024-03-31 22:18:46.158 [update notice] sent poll-read bai Status02 QQ=31: on;60;53.0;80;15.0

Never see that arbitration won in invalid state xxx error before. Generally, errors seem to happen much less frequently.

stingone commented 3 months ago

Same here. after updating the firmware to the latest almost no communication. it throws on the adapter itself host: errors: protocol

and at the addon at lot of the errors below:

[bus notice] arbitration won in invalid state receive response 2024-04-12 23:06:02.682 [bus notice] arbitration won in invalid state skip 2024-04-12 23:06:02.905 [main error] scan config 15: ERR: SYN received 2024-04-12 23:06:04.974 [main error] scan config 76: ERR: SYN received

jabdoa2 commented 3 months ago

For me it did get better with each version. Generally, it is rather stable. Sometimes I have to restart ebusd to rescan but most of the time it just works and once it works it also keeps working. Your case sounds like a regression (as it is the opposite - it got worse) so you might want to create a new issue for that. In this issue there is very few left to do so I will close it soonith.

stingone commented 3 months ago

It now even drops.
2024-04-12 23:15:10.237 [bus error] unable to open 192.168.1.68:9999: ERR: generic I/O error 2024-04-12 23:15:10.237 [bus notice] device invalid

Just a update. just revert back to firmware 20240106 and its working again

jabdoa2 commented 3 months ago

It now even drops. 2024-04-12 23:15:10.237 [bus error] unable to open 192.168.1.68:9999: ERR: generic I/O error 2024-04-12 23:15:10.237 [bus notice] device invalid

Just a update. just revert back to firmware 20240106 and its working again

Please create a separat issue. This is definitely something else. Also please consider that this is an open source project and consider if this attitude is helping your case.