jeelabs / esp-link

esp8266 wifi-serial bridge, outbound TCP, and arduino/AVR/LPC/NXP programmer
Other
2.82k stars 720 forks source link

MQTT - hub->arduino communication broken #459

Open maciejs20 opened 4 years ago

maciejs20 commented 4 years ago

The communication between MQTT hub -> EspLink -> Arduino is broken when the MQTT hub sends data to arduino. I'm using espLink 3.2.47-g9c6530d with recent El-client libraries, connected to IBM Cloud Watson IoT Platform. All arduino-initiated PUBs/SUBs works OK, but any longer hub-initiated communication fails. I've been poking with older EspLink code and at that time any message longer than approx. 32 bytes fails; I don't know if that is the case for 3.2.47 as well but probably it's the same. Increase in EL-Client buffer does not help at all, even when set to 300bytes.

I've connected serial RX on the ESP-Arduino communication line and have collected the logs. I've also modified EL-Client for additional debug, shown below.

The reason is quite simple: when Esp-Link sends long message, the arduino serial buffer overflows. If I compile HardwareSerial with -D SERIAL_RX_BUFFER_SIZE=256, than communication is OK, but also the huge amount of arduino's ram is gone.

I think that Esp-link should divide longer packets...

What happens, how to replicate:

  1. Arduino code subscribes for topic: "iotdm-1/mgmt/custom/TM1638/display", subscribtion succedes.

  2. MQTT hub sends first message to this topic, it is received by EspLink [log from ESP-Link debug window]:

49591> MQTT: Recv type=PUBLISH id=0000 len=138; Pend type=NULL id=00 49591> MQTT: Recv PUBLISH qos=0 iotdm-1/mgmt/custom/TM1638/display{"reqId":"3e872ea6-4b48-4573-a54b-ee0210b7809e","d":{"fields":[{"field":"text","value":"ABCDEF"}]}} 49592> MQTT Client: Received topic: iotdm-1/mgmt/custom/TM1638/display, data: {"reqId":"3e872ea6-4b48-4573-a54b-ee0210b7809e","d":{"fields":[{"field":"text","value":"ABCDEF"}]}} 49592> MQTT: Data cb=0x0000049e topic=iotdm-1/mgmt/custom/TM1638/display{"reqId":"3e872ea6-4b48-4573-a54b-ee0210b7809e","d":{"fields":[{"field":"text","value":"ABCDEF"}]}} len=99 49593> cmdResponse: cmd=3 val=1182 argc=2 49598> MQTT: Recv type=PUBLISH id=0000 len=138; Pend type=NULL id=00 49598> MQTT: Recv PUBLISH qos=0 iotdm-1/mgmt/custom/TM1638/display{"reqId":"3e872ea6-4b48-4573-a54b-ee0210b7809e","d":{"fields":[{"field":"text","value":"ABCDEF"}]}} 49599> MQTT Client: Received topic: iotdm-1/mgmt/custom/TM1638/display, data: {"reqId":"3e872ea6-4b48-4573-a54b-ee0210b7809e","d":{"fields":[{"field":"text","value":"ABCDEF"}]}} 49599> MQTT: Data cb=0x0000049e topic=iotdm-1/mgmt/custom/TM1638/display{"reqId":"3e872ea6-4b48-4573-a54b-ee0210b7809e","d":{"fields":[{"field":"text","value":"ABCDEF"}]}} len=99 49600> cmdResponse: cmd=3 val=1182 argc=2

  1. First message travels on serial link from ESP to Arduino [log from serial dump].
<03><00><02><00><9E><04><00><00>"<00>iotdm-1/mgmt/custom/TM1638/displayc<00>{"reqId":"3e872ea6-4b48-4573-a54b-ee0210b7809e","d":{"fields":[{"field":"text","value":"ABCDEF"}]}}<00><00><00>y%<03><00><02><00><9E><04><00><00>"<00>iotdm-1/mgmt/custom/TM1638/displayc<00>{"reqId":"3e872ea6-4b48-4573-a54b-ee0210b7809e","d":{"fields":[{"field":"text","value":"ABCDEF"}]}}<00><00><00>y%
  1. ELClient (arduino) picks the first message (hex log comes from El-Client's serial read in process() function). ElClient does NOT detect the end of the packet and don't process it yet, the data waits in the buffer [log from El-Client process() function].

Process@0 C0,sEND,3,0,2,0,9E,4,0,0,22,0,69,6F,74,64,6D,2D,31,2F,6D,67,6D,74,2F,63,75,73,74,6F,6D,2F,54,4D,31,36,33,38,2F,64,69,73,70,6C,61,79,63,0,7B,22,72,65,71,49,64,22,3A,22,33,65,38,37,32,65,

  1. MQTT hub sends second message to this topic, it is received by EspLink [log from ESP-Link debug window]:

49600> cmdResponse: cmd=3 val=1182 argc=2 type=PUBLISH id=0000 len=138; Pend type=NULL id=00 127312> MQTT: Recv PUBLISH qos=0 iotdm-1/mgmt/custom/TM1638/display{"reqId":"ff360b9b-cc4a-4d8e-943f-ddc0d0f12c9b","d":{"fields":[{"field":"text","value":"123456"}]}}0  127313> MQTT Client: Received topic: iotdm-1/mgmt/custom/TM1638/display, data: {"reqId":"ff360b9b-cc4a-4d8e-943f-ddc0d0f12c9b","d":{"fields":[{"field":"text","value":"123456"}]}} 127313> MQTT: Data cb=0x0000049e topic=iotdm-1/mgmt/custom/TM1638/display{"reqId":"ff360b9b-cc4a-4d8e-943f-ddc0d0f12c9b","d":{"fields":[{"field":"text","value":"123456"}]}}0  len=99 127314> cmdResponse: cmd=3 val=1182 argc=2 127318> MQTT: Recv type=PUBLISH id=0000 len=138; Pend type=NULL id=00 127318> MQTT: Recv PUBLISH qos=0 iotdm-1/mgmt/custom/TM1638/display{"reqId":"ff360b9b-cc4a-4d8e-943f-ddc0d0f12c9b","d":{"fields":[{"field":"text","value":"123456"}]}}0  127319> MQTT Client: Received topic: iotdm-1/mgmt/custom/TM1638/display, data: {"reqId":"ff360b9b-cc4a-4d8e-943f-ddc0d0f12c9b","d":{"fields":[{"field":"text","value":"123456"}]}} 127319> MQTT: Data cb=0x0000049e topic=iotdm-1/mgmt/custom/TM1638/display{"reqId":"ff360b9b-cc4a-4d8e-943f-ddc0d0f12c9b","d":{"fields":[{"field":"text","value":"123456"}]}}0  len=99 127320> cmdResponse: cmd=3 val=1182 argc=2

  1. Second message travels on serial link from ESP to Arduino [log from serial dump]
<03><00><02><00><9E><04><00><00>"<00>iotdm-1/mgmt/custom/TM1638/displayc<00>{"reqId":"0b179894-f19f-4fcc-b8c7-d6cb1d98eab1","d":{"fields":[{"field":"text","value":"123456"}]}}<00><00><00><03><00><02><00><9E><04><00><00>"<00>iotdm-1/mgmt/custom/TM1638/displayc<00>{"reqId":"0b179894-f19f-4fcc-b8c7-d6cb1d98eab1","d":{"fields":[{"field":"text","value":"123456"}]}}<00><00><00><03><00><02><00><9E><04><00><00>"<00>iotdm-1/mgmt/custom/TM1638/displayc<00>{"reqId":"ff360b9b-cc4a-4d8e-943f-ddc0d0f12c9b","d":{"fields":[{"field":"text","value":"123456"}]}}<00><00><00><00>"<03><00><02><00><9E><04><00><00>"<00>iotdm-1/mgmt/custom/TM1638/displayc<00>{"reqId":"ff360b9b-cc4a-4d8e-943f-ddc0d0f12c9b","d":{"fields":[{"field":"text","value":"123456"}]}}<00><00><00><00>"
  1. El-Client (arduino) picks up the second message messed up with the first one and process part of it (CRC fails, "Final" is the buffer passed to other EL-Client methods). [log from El-Client process() function]

Process@62 C0,sEND, Final: >>____"iotdm-1/mgmt/custom/TM1638/displayc{"reqId":"3e872e_<< -> 3,0,2,0,9E,4,0,0,22,0,69,6F,74,64,6D,2D,31,2F,6D,67,6D,74,2F,63,75,73,74,6F,6D,2F,54,4D,31,36,33,38,2F,64,69,73,70,6C,61,79,63,0,7B,22,72,65,71,49,64,22,3A,22,33,65,38,37,32,65,0, protoCompletedCb: 63B, buf:565, cmd:3, val:49E, data:2,22,0,69,6F,74,64,6D,2D,31,2F,6D,67,6D,74,2F,63,75,73,74,6F,6D,2F,54,4D,31,36,33,38,2F,64,69,73,70,6C,61,79,63,0,7B,22,72,65,71,49,64,22,3A,22,33,65,38,37,32,65,0 ELC: Invalid CRC 30502 101 ELC: Process anyway... RESP_CB: 1182 2 com_mqttDataCb 3,0,2,0,9E,4,0,0,22,0,69,6F,74,64,6D,2D,31,2F,6D,67,6D,74,2F,63,75,73,74,6F,6D,2F,54,4D,31,36,33,38,2F,64,69,73,70,6C,61,79,63,0,7B,22,72,65,71,49,64,22,3A,22,30,62,31,37,39,38,-

uzi18 commented 4 years ago

What is your arduino board? It looks like it is by design. Maybe ELClient should inform esp-link about buffer size first. Have found some patch for this (simmilar problem) maybe we can use it.

maciejs20 commented 4 years ago

This is arduino nano, but it does not matter. The buffer size is hardcoded into main HardwareSerial.h standard arduino library - so the problem is valid for all AVR boards:

#if ((RAMEND - RAMSTART) < 1023)
#define SERIAL_TX_BUFFER_SIZE 16
#else
#define SERIAL_TX_BUFFER_SIZE 64
#endif
#endif
#if !defined(SERIAL_RX_BUFFER_SIZE)
#if ((RAMEND - RAMSTART) < 1023)
#define SERIAL_RX_BUFFER_SIZE 16
#else
#define SERIAL_RX_BUFFER_SIZE 64
#endif

Esp-Link may not expect the board to accept anything above 64bytes...

uzi18 commented 4 years ago

ok will try to investigate this, please change markings of log in first post as code, it will be more readable

uzi18 commented 4 years ago

found @dannybackx patch but it is for REST https://github.com/jeelabs/el-client/pull/37/commits/24f14c0bb340afc5330d00bf8f999fe08bd71502

I think we need first pass receive buffer size during SYNC, and if pkt is more than size of buf split it and send.