Closed ArwedL closed 3 years ago
Oops, you're right, I confused process_telegram and incoming_telegram. The short reaction time applies from master poll to Tx. I added code to check the time from receiving the interrupt to the Tx transmit call, and sometimes this takes more than 20 ms. I wonder what happens during this time?
That is odd. Tx should happen immediately after a poll. What you could try is to comment out the sensor_.start() and sensors_.loop() calls in emsesp.cpp. This made my setup run a zillion times faster.
@proddy
> I added the RX_LOOP_WAIT because I thought it was slowing down the telnet
I thought about that, but we have only a few messages per second, and if there is no message the function returns anyway, so I see no benefit. To give telnet/WiFi more time I think it's better to add a delay in the main loop. Since the Tx reaction is completed in emsuart_recvTask, this task should have enough time to finish. I'm now trying a delay(MYESP_DELAY), as in 1.9, in EMSESP::loop() and it seems to help. The terminal also seems more responsive.
@MichaelDvP I found and fixed the issue that was causing Tx to fail with the old logic (tx_mode 1). The value of the timeout was too short. Should be 1760. I suspect a typo in the macro when it was copied over from 1.9.5.
I still can't get the newer Tx code to work (tx_mode 4). Giving me the same errors. Perhaps also a timing error?
I don't believe it is the timeout value. I changed it while I was working on the UART; the 10 looks like a typo to me.
The timeout counts loops, and each loop is EMSUART_BUSY_WAIT long, which is 1/8 bittime. With EMS_TX_TO_COUNT set to 22*8*10 you get a timeout of 22*8*10 bittimes/8, i.e. 220 bittimes or 22000 µs. If you wait that long for one byte, you'll get a collision with the next master poll in the second byte. With 22*8 the timeout is 22 bittimes, 2200 µs, a bit more than the EMS+ fixed wait.
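The arithmetic above can be checked with a small sketch. The function names are hypothetical, and the bit time defaults to the rounded 100 µs figure used in this thread (a 9600 baud line would give roughly 104 µs); only the 1/8 bittime per loop comes from the discussion.

```cpp
#include <cassert>
#include <cstdint>

// One busy-wait loop lasts 1/8 of a bit time (as stated in the thread).
constexpr uint32_t LOOPS_PER_BIT = 8;

// Convert a loop count into whole bit times.
constexpr uint32_t timeout_bittimes(uint32_t loops) {
    return loops / LOOPS_PER_BIT;
}

// Convert a loop count into microseconds.
// bit_time_us = 100 is the rounded value used in the thread (assumption).
constexpr uint32_t timeout_us(uint32_t loops, uint32_t bit_time_us = 100) {
    return loops * bit_time_us / LOOPS_PER_BIT;
}

// Runtime check of the two loop counts discussed: 1760 and 176.
inline void check_thread_numbers() {
    assert(timeout_bittimes(22 * 8 * 10) == 220);  // 1760 loops
    assert(timeout_us(22 * 8 * 10) == 22000);      // about 22 ms
    assert(timeout_bittimes(22 * 8) == 22);        // 176 loops
    assert(timeout_us(22 * 8) == 2200);            // about 2.2 ms
}
```

With a loop count of 1760 the wait for a single byte is about 22 ms, which is in the same range as the 20 ms send window mentioned elsewhere in this thread; 176 loops keeps it near 2.2 ms.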
The error messages indicate that there is no response from the destination, right? So the first question is: did we send? Do we receive the echo, or is it missing? If there is an echo, what do we receive next? Which message is received that triggers the Rx but does not match tx_waiting? We should log the raw telegrams, including the break, directly in recvTask to see what's coming in. For timing it can also be useful to increase the priority of the recvTask.
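As a rough illustration of the raw-telegram logging suggested here, the following sketch formats a received buffer as hex and marks whether a break was seen. The function name, the `<BRK>` marker and the sample bytes are hypothetical and not taken from the EMS-ESP code; in a real receive task the string would more likely be built outside the time-critical path.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Format a raw telegram as space-separated upper-case hex bytes.
// If saw_break is true, a "<BRK>" marker (hypothetical) is appended.
inline std::string telegram_to_hex(const uint8_t* data, size_t length, bool saw_break) {
    static const char digits[] = "0123456789ABCDEF";
    std::string out;
    out.reserve(length * 3 + 6);
    for (size_t i = 0; i < length; ++i) {
        if (i > 0) {
            out += ' ';
        }
        out += digits[data[i] >> 4];    // high nibble
        out += digits[data[i] & 0x0F];  // low nibble
    }
    if (saw_break) {
        if (!out.empty()) {
            out += ' ';
        }
        out += "<BRK>";
    }
    return out;
}

// Example with made-up bytes, only to show the output shape.
inline void example_hex_dump() {
    const uint8_t sample[] = {0x0B, 0x88, 0x18, 0x00};
    assert(telegram_to_hex(sample, sizeof(sample), true) == "0B 88 18 00 <BRK>");
}
```

Logging every buffer this way, echo included, would show whether the Tx bytes came back on the bus and what arrived next.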
I logged with syslog to see the Tx errors and there is another strange thing. I get reboots every 2 hours (mark is set to 2h, can it be that?), always after 4 complete Tx errors and the first error of the 5th. But the time between retries is very long; it seems the counter isn't cleared in between.
Another thing: [system] and [network] log with local time (MEST), while [emsesp] (and also [boiler], [thermostat]) log with UTC. syslog.txt
> I don't believe it is the timeout value. I changed it while I was working on the UART; the 10 looks like a typo to me. The timeout counts loops, and each loop is EMSUART_BUSY_WAIT long, which is 1/8 bittime. With EMS_TX_TO_COUNT set to 22*8*10 you get a timeout of 22*8*10 bittimes/8, i.e. 220 bittimes or 22000 µs. If you wait that long for one byte, you'll get a collision with the next master poll in the second byte. With 22*8 the timeout is 22 bittimes, 2200 µs, a bit more than the EMS+ fixed wait.
> The error messages indicate that there is no response from the destination, right? So the first question is: did we send? Do we receive the echo, or is it missing? If there is an echo, what do we receive next? Which message is received that triggers the Rx but does not match tx_waiting? We should log the raw telegrams, including the break, directly in recvTask to see what's coming in. For timing it can also be useful to increase the priority of the recvTask.
You're right, 1760ms is a long time within the loop. I'd rather just forget "tx_mode 1-3", work on your new and improved Tx logic, and figure out why it doesn't work on my setup. I'll also ask BBQKees if he's willing to try out a few things on his boiler. Is there anything specific about your environment? The timings differ between EMS+ and EMS1.0, and I'm on EMS1.0.
> I logged with syslog to see the Tx errors and there is another strange thing. I get reboots every 2 hours (mark is set to 2h, can it be that?), always after 4 complete Tx errors and the first error of the 5th. But the time between retries is very long; it seems the counter isn't cleared in between.
> Another thing: [system] and [network] log with local time (MEST), while [emsesp] (and also [boiler], [thermostat]) log with UTC. syslog.txt
I'll create a separate issue for this and track it there.
I've logged the time from the Rx interrupt to send and found that it's always this check:
if (millis() > (emsRxTime + EMS_RX_TO_TX_TIMEOUT)) { // send allowed within 20 ms
    return EMS_TX_WTD_TIMEOUT;
}
that causes the error. I replaced this code with
LOG_DEBUG(F("Responsetime: %d"), uuid::get_uptime() - emsRxTime);
and the result is mostly 0 or 1 ms but sometimes 29 ms (no other values), and even that does not cause a collision with the next telegram. We should skip this check completely.
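If the response time is kept only as a measurement, a minimal sketch of computing it could look like the following. The helper names are hypothetical; the only assumption is a free-running 32-bit millisecond counter, for which unsigned subtraction stays correct even when the counter wraps around.

```cpp
#include <cassert>
#include <cstdint>

// Elapsed milliseconds between two readings of a free-running 32-bit
// millisecond counter. Unsigned subtraction is modulo 2^32, so the result
// is still correct if the counter wrapped between 'start' and 'now'.
constexpr uint32_t elapsed_ms(uint32_t now, uint32_t start) {
    return now - start;
}

// True if 'now' is still within 'window_ms' of 'rx_time' (hypothetical helper).
constexpr bool within_tx_window(uint32_t now, uint32_t rx_time, uint32_t window_ms) {
    return elapsed_ms(now, rx_time) <= window_ms;
}

// Examples using the values reported in the thread (0, 1 and 29 ms, 20 ms window).
inline void check_response_times() {
    assert(within_tx_window(1001, 1000, 20));   // 1 ms: inside the window
    assert(!within_tx_window(1029, 1000, 20));  // 29 ms: outside the window
    assert(elapsed_ms(5, 0xFFFFFFFBu) == 10);   // counter wrapped, still 10 ms
}
```

A 29 ms reading would fail a 20 ms window even though, as noted above, no collision was observed, which is consistent with the suggestion to drop the check and keep only the logging.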
I see a response time of 0 and sometimes 1 with "tx_mode 1". I just can't get tx_mode 4 working, even when adding some delay after each bit write. I'm so busy with the web interface that I don't really have time to get the scope out and see how the timings are off on my EMS 1.0 system.
Closing this. Covered in emsesp/EMS-ESP#398
Question: Do you have an implementation for ESP32 UART handling? Currently I am only interested in receiving. Unfortunately I have to use an ESP32 because it also serves other purposes.
Additional context: I have seen another question in Q&A (May 2019) where you stated that you weren't happy with your current ESP32 implementation. I am hoping for an updated status (especially if only receiving is relevant)...