jeelabs / esp-link

esp8266 wifi-serial bridge, outbound TCP, and arduino/AVR/LPC/NXP programmer

SLIP MQTT protocol definition for FORTH client implementation #430

Open wolfgangr opened 5 years ago

wolfgangr commented 5 years ago

Inspired by this hack https://hackaday.com/2017/02/13/hacking-on-the-weirdest-esp-module/#comment-5820116 I tried something similar: MQTT from an STM32 Bluepill running mecrisp FORTH, connected to ESP-Link.

However, when I try to issue test strings, the communication between ESP-LINK and the STM32 breaks down. I have to reset both the µC at http://192.168.1.88/console.html and ESP-Link at http://192.168.1.88/log.html using the HTML buttons on those pages.

I suspect that some wrong command string confuses ESP-LINK and leaves it in an unresponsive state.

I assume that the communication ESP-LINK <-> µC is neither pure MQTT nor SLIP but a dedicated protocol, correct?

Unfortunately, there is no MQTT in https://embello.jeelabs.org/flib/ yet.

I have repeatedly searched the issues and all docs for some spec, but the only source I found is https://github.com/jeelabs/el-client/blob/master/ELClient/examples/mqtt/mqtt.ino However, assembling test strings by reverse engineering this is very error-prone, too.

It would be great to find a couple of preassembled test strings to ensure that the whole setup is working. So I could work down to more complicated functions by small changes. Just some "hello" appearing on my mqtt would be great.

Details from my setup:

I plan to rework the whole thing using a string buffer (hope to find one in mecrisp) instead of blowing the stack. Would be glad to share, of course.

uzi18 commented 5 years ago

@wolfgangr I just showed you how the reference SLIP client prepares the packet for publish, and to me the way data and data_len are sent/interpreted looks wrong

what is new composer?

wolfgangr commented 5 years ago

revealing the secret of the data len field: https://github.com/jeelabs/esp-link/blob/v2.2.3/mqtt/mqtt.c#L550

bool ICACHE_FLASH_ATTR
MQTT_Publish(MQTT_Client* client, const char* topic, const char* data, uint16_t data_length,
uint8_t qos, uint8_t retain)
           ....
  uint16_t topic_length = os_strlen(topic);
           ....
  uint16_t buf_len = 3 + 2 + 2 + topic_length + data_length + 16;

I don't understand why the data_len is not derived from the len field prepended to the data, but it is as it is. Since the field goes into an allocation call, better get it right. I think duplicating the length of the data field should be a good guess.

wolfgangr commented 5 years ago

@uzi18 Oops, sorry, obviously I forgot to listen while talking on....

why do you refer to v2.2.3, as we have 3.2.2x available?

https://github.com/jeelabs/esp-link#releases--downloads says

V2.2.3 is the most recent release. The main change between 2.x and 3.x will be the addition of custom web pages (this is not ready yet).

Therefore, this is what I have installed. I did some occasional switching in git but found no difference in MQTT yet. Was it a bad idea to believe that?

wolfgangr commented 5 years ago

@uzi18

it looks like 01 00 00 00 89 02 00 00 is not a slip pkt or malformed one

Still old hackaday heritage. Well, but it obviously works and does not hurt.... This is what the debug log says: 852516> SLIP: start or end len=9 inpkt=0

But if you can give me a nice correct sync, I'll change that. Or do I need no sync at all? Maybe it helps, if serial line was down for some time, or even µC at sleep?

wolfgangr commented 5 years ago

@uzi18

you can change it: zalloc -> alloc + memset,0

I'd prefer not to screw ESP-Link, but adapt to the current implementation. So I remain backwards compatible with old ESP-Link versions - like v2.2.3 :-) If it is 4-byte-padded - OK, I can leave it that way. Cost me some hours to find out :-\ , but now it is fine :-))) I don't have to love it to be able to work with it...

wolfgangr commented 5 years ago

@uzi18

....the way data and data_len are sent/interpreted looks wrong what is new composer?

This is still the hackaday dummy. I just finished the topic stuff. The other 4 fields are next on my to-do list. Not sure whether the rest of the night is long enough....

wolfgangr commented 5 years ago

@uzi18

@wolfgangr first try to use ELClient library with debug on, to dig into Mqtt implementation

Let me be honest: that was what I tried to avoid :-) I had a look into the source code. It seems to be quite elaborate and comprehensive - far beyond my needs. And I would need a different setup, with some Arduino or so that I could run ELClient on, right?

wolfgangr commented 5 years ago

just pushed. This is the version with 4-byte topic padding: https://github.com/wolfgangr/forthMQTT3pktControl/blob/m013/mmq-tools.fs#L131

wolfgangr commented 5 years ago

https://github.com/jeelabs/esp-link/issues/430#issuecomment-459475810

14 00 // 1st = topic length
68 6F 6D 65 2F 62 61 73 65 6D 65 6E 74 2F 77 61 73 68 65 72 03 00 // topic = home/basement/washer
6F 66 66 00 // 2nd off // here should be 03 00 ?????
02 00 //
03 00 // 3rd ??
00 00

may I slightly disagree, please

14 00 // 1st = topic length = Dec 20
68 6F 6D 65 2F 62 61 73 65 6D 65 6E 74 2F 77 61 73 68 65 72 //  20 char topic = home/basement/washer
    // no padding since 0x14 % 4 = 0
03 00 // string length of data
6F 66 66 00 // 'off' + 1 Byte padding
02 00 // length of field 'data len' 
03 00 // content of field data len - equals string len, obviously
00 00 // ... padded up to 4 bytes again
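
The argument layout in this listing can be reproduced with a few lines of Python, which may be handy for generating test strings. This is only a sketch derived from the bytes shown above: each argument is a 16-bit little-endian length followed by the data, zero-padded to a multiple of 4 bytes. The helper name `encode_arg` is made up for illustration.

```python
import struct

def encode_arg(data: bytes) -> bytes:
    """Encode one request argument: 16-bit LE length, data, zero padding.

    Assumption taken from the listing above: only the data part is padded
    to a multiple of 4 bytes; the length field itself is not counted.
    """
    pad = (4 - len(data) % 4) % 4
    return struct.pack("<H", len(data)) + data + b"\x00" * pad

topic = encode_arg(b"home/basement/washer")   # 20 bytes, so no padding
data = encode_arg(b"off")                     # 3 bytes plus 1 padding byte
data_len = encode_arg(struct.pack("<H", 3))   # the separate 'data len' field

for field in (topic, data, data_len):
    print(field.hex(" "))
```

Comparing the printed lines with the annotated dump is a quick way to check a hand-built test string before sending it.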

wolfgangr commented 5 years ago

I think I managed the main part of the publish protocol: https://github.com/wolfgangr/forthMQTT3pktControl/blob/m014/mmq-tools.fs#L154

My data string - audaciously including spaces - gets published: heating/pressure/valve foo bar tralala

I tried different settings, qos=2 and retain=1, and see the logged behaviour change:

 33283> cmdParsePacket: cmd=11 argc=5 value=0
 33283> cmdExec: Dispatching cmd=MQTT_PUB
 33283> MQTT: MQTTCMD_Publish topic=heating/pressure/valve, data_len=16, qos=2, retain=1
 33284> MQTT: Publish, topic: "heating/pressure/valve", length: 44
 33284> MQTT: Send type=PUBLISH id=0036 len=44
 33285> MQTT: Recv type=PUBREC id=0036 len=4; Pend type=PUBLISH id=36
 33286> MQTT: Send type=PUBREL id=0036 len=4
 33287> MQTT: Recv type=PUBCOMP id=0036 len=4; Pend type=PUBREL id=36

                .....
 33898> MQTT: Publish, topic: "heating/pressure/valve", length: 44
 33898> MQTT: Send type=PUBLISH id=0037 len=44
 33901> MQTT: Recv type=PUBREC id=0037 len=4; Pend type=PUBLISH id=37
 33901> MQTT: Send type=PUBREL id=0037 len=4
 33902> MQTT: Recv type=PUBCOMP id=0037 len=4; Pend type=PUBREL id=37

So, in the end, it looks like a productive night :-)
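
As a cross-check of the `length: 44` in the log, the size of an MQTT 3.1.1 PUBLISH packet can be worked out from the topic and the logged `data_len=16`. The sketch below assumes that the logged length is the complete packet including the 2-byte fixed header; the function name is hypothetical.

```python
def mqtt_publish_len(topic: str, payload_len: int, qos: int) -> int:
    """Total size of an MQTT 3.1.1 PUBLISH packet for short messages."""
    # Variable header and payload: 2-byte topic length, topic, payload.
    remaining = 2 + len(topic.encode()) + payload_len
    if qos > 0:
        remaining += 2  # packet identifier, present only for QoS 1 and 2
    assert remaining < 128, "sketch only handles a 1-byte remaining-length field"
    # Fixed header: 1 byte type/flags + 1 byte remaining length.
    return 2 + remaining

print(mqtt_publish_len("heating/pressure/valve", 16, qos=2))
```

With qos=0 the packet identifier is absent, so the same message would be two bytes shorter.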

wolfgangr commented 5 years ago

What's next?

wolfgangr commented 5 years ago

regarding callback.... https://github.com/jeelabs/el-client/blob/master/ELClient/examples/mqtt/mqtt.ino hmm... They talk of bees, and flowers, when I try to talk about ..... forget it... ;-)

#include <ELClient.h>
#include <ELClientCmd.h>
#include <ELClientMqtt.h>
          .......
  mqtt.subscribe("/esp-link/1");
  mqtt.subscribe("/hello/world/#");

  // Set-up callbacks for events and initialize with es-link.
  mqtt.connectedCb.attach(mqttConnected);
  mqtt.disconnectedCb.attach(mqttDisconnected);
  mqtt.publishedCb.attach(mqttPublished);
  mqtt.dataCb.attach(mqttData);
  mqtt.setup();

wolfgangr commented 5 years ago

https://github.com/jeelabs/el-client/blob/master/ELClient/ELClientMqtt.cpp

OK, we see a pattern:

void ELClientMqtt::lwt(const char* topic, const char* message, uint8_t qos, uint8_t retain) {
  _elc->Request(CMD_MQTT_LWT, 0, 4);
  _elc->Request(topic, strlen(topic));
  _elc->Request(message, strlen(message));
  _elc->Request(&qos, 1);
  _elc->Request(&retain, 1);
  _elc->Request();
}

The message structure is always +- the same. Some parts may be missing, so the argument count will change. My header will cry for a bit of configurability, maybe.

void ELClientMqtt::subscribe(const char* topic, uint8_t qos) {
  _elc->Request(CMD_MQTT_SUBSCRIBE, 0, 2);
  _elc->Request(topic, strlen(topic));
  _elc->Request(&qos, 1);
  _elc->Request();
}

The publish command appears to be the most comprehensive one. So, to arrive at the other ones, I may just have to omit parts. OK, when the whole picture becomes clear, I may well want to reorganize much of my code base.

void ELClientMqtt::publish(const char* topic, const uint8_t* data, const uint16_t len,
    uint8_t qos, uint8_t retain)
{
  _elc->Request(CMD_MQTT_PUBLISH, 0, 5);
  _elc->Request(topic, strlen(topic));
  _elc->Request(data, len);
  _elc->Request(&len, 2);
  _elc->Request(&qos, 1);
  _elc->Request(&retain, 1);
  _elc->Request();
}

https://github.com/jeelabs/el-client/blob/master/ELClient/ELClient.h#L29

  CMD_MQTT_SETUP = 10, /**< Register callback functions */
  CMD_MQTT_PUBLISH,    /**< Publish MQTT topic */
  CMD_MQTT_SUBSCRIBE,  /**< Subscribe to MQTT topic */
  CMD_MQTT_LWT, /**< Define MQTT last will */

Can I rely on the integers being assigned to the labels in sequence? CMD_MQTT_PUBLISH is 11 = 0x0B then - OK, this matches what we have done so far.
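
On the sequencing question: in C and C++ an enumerator without an explicit value takes the previous value plus one, so the four labels in the excerpt run from 10 to 13. The sketch below mirrors that in Python; the class name `Cmd` is made up, and only the four entries quoted above are listed. The ESP-link log lines in this thread (cmd=10 for MQTT_SETUP, cmd=11 for MQTT_PUB, cmd=12 for MQTT_SUB) agree with these numbers.

```python
from enum import IntEnum

class Cmd(IntEnum):
    # Values follow the C rule "previous enumerator + 1", applied to the
    # ELClient.h excerpt above, which starts at CMD_MQTT_SETUP = 10.
    MQTT_SETUP = 10
    MQTT_PUBLISH = 11
    MQTT_SUBSCRIBE = 12
    MQTT_LWT = 13

for cmd in Cmd:
    print(f"{cmd.name:15s} {int(cmd):3d}  0x{int(cmd):02X}")
```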

wolfgangr commented 5 years ago

Until now we have a good idea of the outgoing messages. To get the protocol of the incoming callback, we still need to dive lower.

Start at the surface again: https://github.com/jeelabs/el-client/blob/master/ELClient/examples/mqtt/mqtt.ino Anything related to callbacks is attached to the object mqtt.

ELClient esp(&Serial, &Serial);
ELClientCmd cmd(&esp);
ELClientMqtt mqtt(&esp);

relevant Steps of the protocol:

The 'data' callback is the most comprehensive one (it receives both topic and data) and the one we really need, anyway. So we focus on that: https://github.com/jeelabs/el-client/blob/master/ELClient/ELClientMqtt.h#L37 FP<void, void*> dataCb; /**< callback when a message is received, called with two arguments: the topic and the message */

How is the attach implemented?

  mqtt.dataCb.attach(mqttData);
  mqtt.setup();

https://github.com/jeelabs/el-client/blob/d559214ada405739fa235b13c7e2b42175ddbe95/ELClient/FP.cpp#L48 Hm. Can't see that it actually did anything of relevance... ?

template<class retT, class argT>
void FP<retT, argT>::attach(retT (*function)(argT))
{
    c_callback = function;
}

'============================== Obviously the callback (such as mqttData above) has to implement the correct interface? Ah - that's all the essence behind the OO voodoo? Read two strings coming along the line ... what a surprise! ....

void mqttData(void* response) {
  ELClientResponse *res = (ELClientResponse *)response;  // cast left out of the excerpt
  String topic = res->popString();
  String data = res->popString();
}

'==============================

A closer look at the setup thing: https://github.com/jeelabs/el-client/blob/master/ELClient/ELClientMqtt.cpp#L34

void ELClientMqtt::setup(void) {
  Serial.print(F("ConnectedCB is 0x")); Serial.println((uint32_t)&connectedCb, 16);
  _elc->Request(CMD_MQTT_SETUP, 0, 4);
  uint32_t cb = (uint32_t)&connectedCb;
  _elc->Request(&cb, 4);
  cb = (uint32_t)&disconnectedCb;
  _elc->Request(&cb, 4);
  cb = (uint32_t)&publishedCb;
  _elc->Request(&cb, 4);
  cb = (uint32_t)&dataCb;
  _elc->Request(&cb, 4);
  _elc->Request();
}

So we just send a special message to ESP-link with 4 pointers in it - nothing else? I suspect that those pointers don't have any special meaning for ESP-link, right? So I guess they will simply be returned verbatim when the event occurs? In terms of the interface definition between ESP-link and its serial client, I'd then expect that I can supply any 32-bit (?) number, as long as I know what to do with it when it gets returned with the event. Let's assume so and keep that as the plan.

wolfgangr commented 5 years ago

Let's have a second look at the subscribe code: https://github.com/jeelabs/el-client/blob/master/ELClient/ELClientMqtt.cpp#L34

void ELClientMqtt::subscribe(const __FlashStringHelper* topic, uint8_t qos) {
  _elc->Request(CMD_MQTT_SUBSCRIBE, 0, 2);
  _elc->Request(topic, strlen_P((const char*)topic));
  _elc->Request(&qos, 1);
  _elc->Request();
}

And the example says: mqtt.subscribe("/hello/world/#");

So we have to send just one single topic string, using a subset of the commands we have developed for the publish implementation.

wolfgangr commented 5 years ago

Ready for a try? I know, there is one step left: parse the input stream and extract SLIP messages. Can we assign that task to the FORTH outer interpreter? It looks for space-separated chunks and processes them. So if our 32-bit hex number reads like <spc>^2<spc>, it might be possible to call a word defined as ^2. Let's test that...

uzi18 commented 5 years ago

@wolfgangr v3.2.x has some fixes and more features. The idea of ELClient is to pack all data into the SLIP protocol, send it to esp-link, unpack everything there and use it as a client. In fact it is hard to work with the esp-link sources; you found some of the parts - 4 files for mqtt.

uzi18 commented 5 years ago

about this code:

void ELClientMqtt::publish(const char* topic, const uint8_t* data, const uint16_t len,
    uint8_t qos, uint8_t retain)
{
  _elc->Request(CMD_MQTT_PUBLISH, 0, 5);
  _elc->Request(topic, strlen(topic));
  _elc->Request(data, len);
  _elc->Request(&len, 2);
  _elc->Request(&qos, 1);
  _elc->Request(&retain, 1);
  _elc->Request();
}

So you see it sends the header for MQTT_PUBLISH with 5 arguments.
next topic length
next topic with padding
next data length
next data with padding
next 2
next len with padding
next 1
next qos with padding
next 1
next retain with padding
in the end SLIP end.

Now I understand, but why is len sent twice? (To me this looks like a bug.)

uzi18 commented 5 years ago

About pointers: yes, you can send anything you want there, but maybe it would be nice to add some magic number to know it is a pointer at all, like 0xDEAD0000 ;)

wolfgangr commented 5 years ago

hack a dummy callback like : ^1 ." doing one " ; , similar with {2|3|4} .

we assemble an MQTT-SETUP message along those lines, which looks like this:

20001060   20 00 60 00 0A 00 04 00   00 00 00 00 04 00 20 5E    .`..... ...... ^
20001070   31 20 04 00 20 5E 32 20   04 00 20 5E 33 20 04 00   1 .. ^2  .. ^3 ..
20001080   20 5E 34 20 00 00 00 00   00 00 00 00 00 00 00 00    ^4 .... ........

Looks like it gets processed, somehow

460305> SLIP: start or end len=9 inpkt=0
460305> SLIP: start or end len=34 inpkt=1
460305> cmdParsePacket: cmd=10 argc=4 value=0
460305> cmdExec: Dispatching cmd=MQTT_SETUP
460306> MQTT connectedCb=20315e20
460306> MQTT: Connected Cb=0x20315e20
460306> cmdResponse: cmd=3 val=540106272 argc=0
460307> SLIP: start or end len=0 inpkt=0
460308> SLIP: start or end len=4 inpkt=0

Forth tells us: 540106272 hex. prints 20315E20. When we look back, we see that the first callback is labelled connectedCb in the template.
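
The setup message and the number in the log can be reproduced off-target. The following Python sketch rebuilds the 32 message bytes from the dump above (everything after the 4-byte `20 00 60 00` string buffer header) and shows why the token ` ^1 ` turns up as 540106272. It is an illustration only; `encode_arg` is a made-up helper that follows the 4-byte data padding seen in this thread.

```python
import struct

def encode_arg(data: bytes) -> bytes:
    # 16-bit LE length, data, zero padding of the data to a multiple of 4.
    pad = (4 - len(data) % 4) % 4
    return struct.pack("<H", len(data)) + data + b"\x00" * pad

CMD_MQTT_SETUP = 10
# One 4-byte token per callback: connected, disconnected, published, data.
tokens = [b" ^1 ", b" ^2 ", b" ^3 ", b" ^4 "]

# Header: 16-bit command, 16-bit argument count, 32-bit value (0 here).
body = struct.pack("<HHI", CMD_MQTT_SETUP, len(tokens), 0)
body += b"".join(encode_arg(t) for t in tokens)
print(body.hex(" "))

# ESP-link treats each token as a little-endian 32-bit number.
connected_cb = int.from_bytes(tokens[0], "little")
print(hex(connected_cb), connected_cb)
```

The body is 32 bytes long; together with a 2-byte CRC that matches the `len=34` reported by the SLIP layer in the log.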

On my FORTH console I read

mqtt-message mqtt-send � ok.
 ^1 

When I press enter after ^1, I get not found, although I can call it manually: ^1 doing one ok. Looks like ESP-link is behaving as expected:

wolfgangr commented 5 years ago

Next step : subscribe https://github.com/jeelabs/el-client/blob/master/ELClient/ELClientMqtt.cpp#L34 can be distilled to

which reads in my libhackry

: subscribetopic s" foo/bar" ; 
mqtt-message stringbuf-clear 
mqtt-message CMD_MQTT_SUBSCRIBE 2 0 MQTT-cmdadd 
mqtt-message subscribetopic MQTT-stringadd
\ qos=1
mqtt-message 1 1 MQTT-numberadd 

and in hex

20001060   18 00 60 00 0C 00 02 00   00 00 00 00 07 00 66 6F   ..`..... ......fo
20001070   6F 2F 62 61 72 00 01 00   01 00 00 00 00 00 00 00   o/bar... ........

ESP-link log:

505547> SLIP: start or end len=9 inpkt=0
505548> SLIP: start or end len=26 inpkt=1
505548> cmdParsePacket: cmd=12 argc=2 value=0
505548> cmdExec: Dispatching cmd=MQTT_SUB
505548> MQTT: MQTTCMD_Subscribe topic=foo/bar, qos=0
505548> MQTT: Subscribe, topic: "foo/bar"
505549> MQTT: Send type=SUBSCRIBE id=00E2 len=14
505550> MQTT: Recv type=SUBACK id=04E2 len=5; Pend type=SUBSCRIBE id=4E2

To test subscription, I call some independent client on my linux console: mosquitto_pub -h 192.168.X.Y -t "foo/bar" -m " tralala" and read at another console running mosquitto_sub: foo/bar tralala

ESP-Link receives this as well....

925168> MQTT: Recv type=PUBLISH id=0000 len=19; Pend type=NULL id=00
925168> MQTT: Recv PUBLISH qos=0 foo/bar tralala
925168> MQTT: Data cb=0x20345e20 topic=foo/bar tralala len=8
925168> cmdResponse: cmd=3 val=540302880 argc=2
925169> SLIP: start or end len=0 inpkt=1
925172> SLIP: start or end len=10 inpkt=0

(again, 0x20345e20 = dec 540302880, so the meaning is obvious, as we have learned to read those now.)

... and forwards topic / message to my client, prepended by the 'callback' ^4 foo/ba tralala Again, the command '^4' is not interpreted as intended :-(

However, the whole mechanism of registration, subscription, callback is proved :-)

wolfgangr commented 5 years ago

Same picture when I enclose my ^1 in 0x0a aka newline symbols, instead of spaces. Have to figure out FORTH input redirection :-O , I'm afraid...

uzi18 commented 5 years ago

I see forth is not so easy ;)

wolfgangr commented 5 years ago

OK, after some googling & RTFS, I try this hack

hook-key @ Constant old-key 
$80 stringbuffer constant mirror  
: ?mirrline mirror  stringbuf-wheretowrite 1- c@ $0d = IF mirror dup stringbuf-dump stringbuf-clear THEN ;
: mykey old-key execute dup mirror stringbuf-byte-app ?mirrline ; 
' mykey hook-key !

wolfgangr commented 5 years ago

Added shark and unshark words to switch this input line mirror hexdumper on and off. This is what I see after I send the MQTT setup sequence:

20001290   17 00 80 00 6D 71 74 74   2D 6D 65 73 73 61 67 65   ....mqtt -message
200012A0   20 6D 71 74 74 2D 73 65   6E 64 0D 00 00 00 00 00    mqtt-se nd......
 � ok.
  not found.
^unshark
Start at 20001290 
           |-pos-|-len-|-data-> 

20001290   0F 00 80 00 C0 03 00 00   00 0A 5E 75 6E 73 68 61   ........ ..^unsha
200012A0   72 6B 0D 00 00 00 00 00   00 00 00 00 00 00 00 00   rk...... ........

After normal response, this sequence is inserted: C0 03 00 00 00 0A 5E after that, normal serial stream resumes. We can see, that the call to unshark is spoiled.

Compare what the debug log says: cmdResponse: cmd=3 val=540106272 argc=0 Looks like my callback is shifted by 2 bytes? ... let's look further....

I don't see any immediate SLIP'ed response after submitting a MQTT subscribe.

These are the responses to subscribed messages. I would expect topic and text, too - hm....

           |-pos-|-len-|-data-> 
20001290   1B 00 80 00 C0 03 00 02   00 0A 5E 33 12 C0 C0 03   ........ ..^3....
200012A0   00 02 00 0A 5E C0 03 00   02 00 0A 5E 12 C0 0D 00   ....^... ...^....

For the moment let's focus on the format of the messages:

           |-pos-|-len-|-data-> 
20001290   1B 00 80 00 
C0 03 00 02 00 0A 5E 33 12 C0
C0 03 00 02 00 0A 5E 
C0 03 00 02 00 0A 5E 12 C0 0D 
00   ....^... ...^....

This is really weird.... Is my code too slow so that bytes get lost? Does it relate to the work of the 0x0a character?

wolfgangr commented 5 years ago

does it relate to the work of the 0x0a character?

obviously the internals of ESP-link or the serial line itself get screwed by 0a bytes. changed them to a printable asterisk * 0x2a

The response to setup now reads:

20001290   16 00 80 00 C0 03 00 00   00 2A 5E 31 2A F7 6B C0   ........ .*^1*.k.
200012A0   20 20 75 6E 73 68 61 72   6B 0D 00 00 00 00 00 00     unshar k.......

which I'd disassemble as

C0 - 03 00 - 00 00 - 2A 5E 31 2A - F7 6B   -  C0 

Now the subscription callbacks read

20001290   6D 00 80 00 C0 03 00 02   00 2A 5E 34 2A 07 00 66   m....... .*^4*..f
200012A0   6F 6F 2F 62 61 72 00 00   00 08 00 20 74 72 61 6C   oo/bar.. ... tral
200012B0   61 6C 61 00 00 1A 97 C0   C0 03 00 02 00 2A 5E 34   ala..... .....*^4
200012C0   2A 07 00 66 6F 6F 2F 62   61 72 00 00 00 08 00 20   *..foo/b ar..... 
200012D0   74 72 61 6C 61 6C 61 00   00 1A 97 C0 C0 03 00 02   tralala. ........
200012E0   00 2A 5E 34 2A 07 00 66   6F 6F 2F 62 61 72 00 00   .*^4*..f oo/bar..
200012F0   00 08 00 20 74 72 61 6C   61 6C 61 00 00 1A 97 C0   ... tral ala.....
20001300   0D 00 00 00 00 00 00 00   00 00 00 00 00 00 00 00   ........ ........

manually regrouped as follows

C0 - 03 00 02 00 2A 5E 34 2A 
   - 07 00 - 66  6F 6F 2F 62 61 72 - 00 00 00 
   - 08 00 - 20 74 72 61 6C 61 6C 61 - 00 00 
   - 1A 97 - C0  
C0 - 03 00 02 00 2A 5E 34 2A 
   - 07 00 - 66 6F 6F 2F 62 61 72 - 00 00 00 
   - 08 00 - 20 74 72 61 6C 61 6C 61 00 00 
   - 1A 97 - C0 
C0 - 03 00 02 00 2A 5E 34 2A 
   - 07 00 - 66 6F 6F 2F 62 61 72 - 00 00 00 
   - 08 00 20 74 72 61 6C 61 6C 61 00 00 
   - 1A 97 - C0   
0D 

which is not new to us any more

I remember that with 0x0a in the callback, the ESP-Link log displays "cmd=3" and the ^3 aka publishedCb, which is not correct, instead of the correct cmd=4 and ^4. Well, at least this is consistent with eating up topic and data... So it's ESP-link that is disturbed by nonprintable bytes in a supposed-to-be hex address? Let's avoid that quirk...
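
The manual regrouping above can be automated. The Python sketch below parses one such callback frame into command, token and arguments. It relies only on what the dumps show: in this direction the length field and the data together are padded to a multiple of 4 bytes. The CRC is skipped rather than verified, no SLIP escapes are handled, and `parse_response` is a hypothetical name.

```python
import struct

def parse_response(frame: bytes):
    """Parse one frame (0xC0 delimiters already removed) into its fields."""
    body = frame[:-2]                      # drop the 2-byte CRC (not verified here)
    cmd, argc, value = struct.unpack_from("<HHI", body, 0)
    pos, args = 8, []
    for _ in range(argc):
        (length,) = struct.unpack_from("<H", body, pos)
        args.append(body[pos + 2 : pos + 2 + length])
        # Length field plus data are rounded up to a multiple of 4 bytes.
        pos += (2 + length + 3) & ~3
    return cmd, value, args

raw = bytes.fromhex(
    "c0 03 00 02 00 2a 5e 34 2a"
    " 07 00 66 6f 6f 2f 62 61 72 00 00 00"
    " 08 00 20 74 72 61 6c 61 6c 61 00 00"
    " 1a 97 c0"
)
cmd, value, args = parse_response(raw[1:-1])   # strip leading and trailing 0xC0
print(cmd, struct.pack("<I", value), args)
```

Packing the value back into 4 bytes recovers the `*^4*` token, which is what a dispatcher on the µC side would look up.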

wolfgangr commented 5 years ago

So we need a two layer parser:

special case: empty data field: mosquitto_pub -h 192.168.X.Y -t "foo/bar" -m ""

20001290   1D 00 80 00 C0 03 00 02   00 2A 5E 34 2A 07 00 66   ........ .*^4*..f
200012A0   6F 6F 2F 62 61 72 00 00   00 00 00 00 00 2E 25 C0   oo/bar.. ......%.

we see that

wolfgangr commented 5 years ago

here https://github.com/jeelabs/esp-link/issues/430#issuecomment-459904852 I found

obviously the internals of ESP-link or the serial line itself get screwed by 0a bytes.

here https://github.com/jeelabs/esp-link/issues/430#issuecomment-458698280 I knew

My code template says 'Command to sync up the esp-link ' 01 00 00 00 89 02 00 00 and calls it 'sync'

I still prepend this to any of my commands. Could this cause ESP-Link's misbehaviour? Maybe this is broken? Maybe I should enclose this (or a correct sync) in 0xC0 aka SLIP_END?

let's recall the Master's words:

@tve wrote here https://github.com/jeelabs/esp-link/issues/430#issuecomment-458619696

In terms of slip, this is where it happens: https://github.com/jeelabs/esp-link/blob/master/serial/slip.c

Each packet starts and ends with a SLIP_END (0xC0).

The last two bytes are the computed CRC value, which is calculated over all the bytes except the last two.

Note that any 0xC0 or 0xDB value needs to be escaped on the wire and is unescaped before calculating the checksum.

A packet consists of a 16-bit command followed by a 16-bit argument count and a 32-bit "callback value" (really an opaque token). This is all little endian.

Your best bet is probably to start with the SYNC command whose code is 0x0001 and which takes one arg in the value field, which must be non-zero, e.g., { 0xC0, 0x01, 0x00, 0x01, 0x00, 0x11, 0x11, 0x11, 0x11, crc-lo, crc-hi, 0xC0 }

You should get a response back { 0xC0, 0x02, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, crc-lo, crc-hi, 0xC0 }

Have I learned to understand all those words now? My sync template seems to match the prescribed format, but lacks the CRC and the enclosing SLIP_END. OK, let's do that:
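
For anyone who wants to check a frame by hand, here is a Python sketch that appends the CRC and the SLIP_END bytes to the sync template. The CRC routine is an assumption: it is transcribed from memory of the `crc16Add` helper in ELClient and is not quoted anywhere in this thread. It does reproduce the CRC bytes visible in the dumps here (`F7 6B` after `03 00 00 00 2A 5E 31 2A`, and `DA 6A` after `02 00 00 00 89 02 00 00`). Escaping of 0xC0/0xDB inside the frame is left out.

```python
import struct

SLIP_END = 0xC0

def crc16_add(byte: int, acc: int) -> int:
    """One CRC step, keeping every intermediate result within 16 bits."""
    acc ^= byte
    acc = ((acc >> 8) | (acc << 8)) & 0xFFFF
    acc ^= ((acc & 0xFF00) << 4) & 0xFFFF
    acc ^= (acc >> 8) >> 4
    acc ^= (acc & 0xFF00) >> 5
    return acc

def crc16(data: bytes) -> int:
    acc = 0
    for b in data:
        acc = crc16_add(b, acc)
    return acc

def frame(body: bytes) -> bytes:
    """Append the CRC (low byte first) and wrap in SLIP_END; no escaping."""
    crc = struct.pack("<H", crc16(body))
    return bytes([SLIP_END]) + body + crc + bytes([SLIP_END])

sync_body = bytes.fromhex("01 00 00 00 89 02 00 00")   # sync template from this thread
print(frame(sync_body).hex(" "))
```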

wolfgangr commented 5 years ago

@uzi18

I see forth is not so easy ;)

Well, yes, it's more complicated than a pipe symbol | in linux

hook-key @ Constant old-key 
: mykey old-key execute dup shark-the-wire ;
' mykey hook-key ! 

wolfgangr commented 5 years ago

@tve Could you please confirm / correct the escape mechanism µC -> ESP-link:

It's like " and \ on the linux shell, I suppose?
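
For reference, this is the escaping as RFC 1055 defines it for SLIP, sketched in Python. Whether esp-link's slip.c follows it byte for byte is an assumption here and has not been confirmed in this thread; the function name is illustrative.

```python
SLIP_END, SLIP_ESC = 0xC0, 0xDB
SLIP_ESC_END, SLIP_ESC_ESC = 0xDC, 0xDD   # substitution codes from RFC 1055

def slip_escape(payload: bytes) -> bytes:
    """Replace every 0xC0 and 0xDB in the payload by a two-byte sequence."""
    out = bytearray()
    for b in payload:
        if b == SLIP_END:
            out += bytes([SLIP_ESC, SLIP_ESC_END])
        elif b == SLIP_ESC:
            out += bytes([SLIP_ESC, SLIP_ESC_ESC])
        else:
            out.append(b)
    return bytes(out)

print(slip_escape(bytes([0x01, 0xC0, 0xDB, 0x02])).hex(" "))
```

So the comparison with the shell holds in spirit: 0xDB plays the role of the backslash, and the byte after it says which special byte was meant.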

wolfgangr commented 5 years ago

Spent a hard night with stalling reads. I get 2 characters from the SLIP-string - that's it.

I hoped I might get closer to the metal, but when I look into the implementation of my FORTH's terminal I find serial-key

   ldr r2, =Terminal_USART_DR
   ldrb tos, [r2]         @ Fetch the character

... and serial-key?

   ldr r0, =Terminal_USART_SR
   ldr r1, [r0]     @ Fetch status
   movs r0, #RXNE
   ands r1, r0

Well, I think this is bare metal...

So what? Play with the USART configuration in the STM? In my first trial, I just kept pressing 'ENTER' to get the data. So maybe I try to write 0x0d to the ESP-link before I throw a timeout?

Or do we have some kind of flow control? I haven't connected any extra wires, so maybe there is XON/XOFF available? Or does this create problems instead of solving them? Should I send XON? Or RTFM?

wolfgangr commented 5 years ago

neither sending 0x0d nor xon aka 0x11 helps. I see we have no RTS/CTS https://github.com/jeelabs/esp-link/issues/102#issuecomment-309255715 https://github.com/jeelabs/esp-link/issues/53#issuecomment-309255615

wolfgangr commented 5 years ago

here a transcript of debug data: any char received is logged twice, enclosed in ' ~ ' aka 0x7E

Start at 20001734 
                       |-pos-|-len-|-data-> 

20001730   61 72 00 BD 9B 00 00 01   7E 45 45 7E 53 53 7E 50   ar...... ~EE~SS~P
20001740   50 7E 4C 4C 7E 2D 2D 7E   73 73 7E 79 79 7E 6E 6E   P~LL~--~ ss~yy~nn
20001750   7E 63 63 7E 20 20 7E 6D   6D 7E 65 65 7E 6D 6D 7E   ~cc~  ~m m~ee~mm~
20001760   73 73 7E 74 74 7E 72 72   7E 2D 2D 7E 63 63 7E 6F   ss~tt~rr ~--~cc~o
20001770   6F 7E 75 75 7E 6E 6E 7E   74 74 7E 65 65 7E 64 64   o~uu~nn~ tt~ee~dd
20001780   7E 20 20 7E 73 73 7E 6C   6C 7E 69 69 7E 70 70 7E   ~  ~ss~l l~ii~pp~
20001790   2D 2D 7E 73 73 7E 65 65   7E 6E 6E 7E 64 64 7E 20   --~ss~ee ~nn~dd~ 
200017A0   20 7E 0D 0D 7E C0 C0 7E   02 02 7E 00 00 7E 00 00    ~..~..~ ..~..~..
200017B0   7E 00 00 7E 89 89 7E 02   02 7E 00 00 7E 00 00 7E   ~..~..~. .~..~..~
200017C0   DA DA 7E 6A 6A 7E C0 C0   7E C0 C0 7E 0D 0D 7E 0D   ..~jj~.. ~..~..~.
200017D0   0D 7E 0D 00 00 00 00 00   00 00 00 00 00 00 00 00   .~...... ........

in plain and with explanation:
...send+ 0x20 0x0d    end of echoed command,
'---------- here our response to that command begins:
0xC0 = SLIP_END
02 00 00 00 89 02 00 00
DA 6A    crc
0xC0 = SLIP_END
'---------- so far, so good
0xC0 = SLIP_END    .... still possible, since the log tells it sends two commands as answer to sync
0x0D 0x0D 0x0D    this is BULLSHIT ???

Two subsequent 0xC0, followed by several 0x0D ???? As I see it, transmission stalls after the SLIP_END that starts the second response. After some seconds, I press {Enter} (the µC hangs anyway, since it is in SLIP mode and doesn't forward chars to the parser) and get these intermingled with the message I'm supposed to parse.

Do I have an echo problem? If so, how could I solve it? Generate my echo locally after stripping SLIP?

lets have a look at the ESP-debug log:

wolfgangr commented 5 years ago

Buffer overrun? : s ESPL-sync memstr-counted slip-send ; so I just call 's' for the same sequence of commands. The response pattern ends identically: .... C0 C0 DA If it had hit a fixed size limit, I'd have expected a longer tail when I cut down the head.

So no block length, but a timing issue? Do I need an interrupt-driven UART buffer? Do I want to reinvent the wheel?

wolfgangr commented 5 years ago

Do I want to reinvent the wheel?

Of course not, if I do not have to. There is an interrupt-based USART2 with input ring buffer template - great. 16 lines of forth - less than my current endeavour.

Code snippets related to hardware addresses:

So let's hope that USART1 is hardware equivalent and has the same registers. ah, here: http://hightechdoc.net/mecrisp-stellaris/_build/html/interrupts.html?highlight=interrupt#table-b-interrupt-and-exception-vectors

and of course in the datasheet of the manufacturer pg: 38 memory mapped base address of different components including UARTS:

0x40004400  USART2
0x40013800  USART1

Hints for the hook addresses for forth code as interrupt handlers we can check on the console:

irq-usart2 @ .$ 
0000_478C 
irq-usart1 @ .$ 
0000_478C 

4780 30 dump 
00004780   00 00 09 75 6E 68 61 6E   64 6C 65 64 00 B5 FD F7   ...unhan dled....
00004790   AF FF 14 55 6E 68 61 6E   64 6C 65 64 20 49 6E 74   ...Unhan dled Int
000047A0   65 72 72 75 70 74 20 00   47 F8 04 6D EF F3 05 86   errupt . G..m....

Obviously both interrupts are unassigned and wait for our actions.

There is this RXNEIE thing above which appears to be a per-UART-setting. We might try just to copy it and read the manual if stuff does not work.

And there is the NVIC-EN1R, which obviously has single bits for different I/O units to enable interrupts. There is a table in the mecrisp-stellaris documentation, referring to something similar. However, this refers to the Cortex-M4 architecture and lists bit 23 prio 35 for USART2, which does not match the above stanza, where USART2 is counted as number 38.

BluePill's STM32F103C8T6, in contrast, is labelled as Cortex-M3 by ST. A pdf search reveals 2.3.5 Nested vectored interrupt controller (NVIC). Neither a search for USART1 nor for NVIC reveals the secret of this magic figure :-(

So let's RTFI..nternet and find http://www.st.com/stonline/products/literature/rm/13902.pdf

on page 199/1134 (admittedly haven't read them all, but I know ESP programmers would be happy with even a tiny subset of that ...) we find on a 'Table 61':

pos prio type of prio acronym       description            address
37   44   settable     USART1  USART1 global interrupt   0x0000_00D4 
38   45   settable     USART2  USART2 global interrupt   0x0000_00D8

This matches the USART2 settings from embello, whom we trust, of course, so we hope we can trust this table, too.
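
The numbers in Table 61 hang together arithmetically, which gives some extra confidence in them. On Cortex-M parts the external interrupt vectors follow the 16 system exception slots (16 x 4 = 0x40 bytes), and each NVIC set-enable register covers 32 interrupt positions. The Python sketch below derives both; that NVIC-EN1R is the second of those enable registers is an assumption based on its name.

```python
def irq_info(position: int):
    """Derive vector address and NVIC enable register/bit from an IRQ position."""
    vector_addr = 0x40 + 4 * position       # 16 system exception slots come first
    reg_index, bit = divmod(position, 32)   # 32 interrupt positions per enable register
    return vector_addr, reg_index, bit

for name, pos in (("USART1", 37), ("USART2", 38)):
    addr, reg, bit = irq_info(pos)
    print(f"{name}: vector 0x{addr:04X}, enable register #{reg}, bit {bit}")
```

The computed vector addresses agree with the address column of the table, and USART1 ends up one bit below USART2 in the same enable register.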

Cleaning up my browser windows, I'd like to save this link: http://hightechdoc.net/mecrisp-stellaris/_build/html/repl.html?highlight=interrupt https://sourceforge.net/p/mecrisp/discussion/general/thread/e29549a8d5/?limit=25#4afd/6b1e

emit, emit?, key and key? are the UART handlers. .... Communication is done from/to buffer variables driven by timers/interrupts

wolfgangr commented 5 years ago

first draft: https://github.com/wolfgangr/forthMQTT3pktControl/blob/327577bda876c8b911f1118d02a0b70fcf2de4cf/uart1-irq.fs ... some basic debugging and we can:

uart1. 
SR 0000 DR 000D BRR 0271 CR1 200C CR2 0000 CR3 0000 GPTR 0000 ok.
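
Two of the values in this register dump can be decoded quickly. The sketch below uses the STM32F1 USART_CR1 bit positions (UE = bit 13, RXNEIE = bit 5, TE = bit 3, RE = bit 2) and the rule that, with 16x oversampling, the baud rate is the peripheral clock divided by BRR. The 72 MHz clock is an assumption, it is not stated in this thread; the function names are made up.

```python
CR1_BITS = {13: "UE", 5: "RXNEIE", 3: "TE", 2: "RE"}   # subset of the USART_CR1 flags

def decode_cr1(value: int) -> dict:
    """Report which of the listed CR1 flags are set."""
    return {name: bool((value >> bit) & 1) for bit, name in CR1_BITS.items()}

def baud_from_brr(brr: int, pclk_hz: int) -> float:
    # BRR holds the divider as a 12.4 fixed-point number, so with 16x
    # oversampling the baud rate is simply pclk / BRR.
    return pclk_hz / brr

print(decode_cr1(0x200C))
print(baud_from_brr(0x0271, 72_000_000))   # 72 MHz is an assumption
```

In this particular dump the RXNEIE bit reads as 0, so it was presumably taken at a moment when the receive interrupt was not enabled.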

wolfgangr commented 5 years ago

This is weird. just hacked

\ switch running REPL to use irq buffered input queue
: uart1-irq_ulize  
  ['] uart1-irq-key? hook-key? !
  ['] uart1-irq-key  hook-key  !
  uart1-irq-init
;

and run it on the console:

uart1-irq_ulize   ok.
  ok.
  ok.
hook-key @ hex. 200008C0  ok.
' uart1-irq-key hex. 200008C0  ok.

Can't believe it. Hacked. Works. Full stop :-) https://github.com/wolfgangr/forthMQTT3pktControl/blob/irq_ulize/uart1-irq.fs

https://github.com/jeelabs/esp-link/issues/430#issuecomment-458762935

nice work, but why FORTH?

@uzi18 Therefore :-)

uzi18 commented 5 years ago

Nice, but I have no time to understand ;) http://hightechdoc.net/mecrisp-stellaris/_build/html/index.html What was the problem with the bad data received?

wolfgangr commented 5 years ago

I think that there is no input buffering in stock mecrisp, since the ASM primitives read directly from the USART registers. The forth compiler aka REPL seems to be fast enough to read that without loss, but obviously not my approach to extract SLIP-encapsulated MQTT frames from the input stream.

The STM32 hardware would implement RTS/CTS flow control, but on the ESP-link side, this is still on @tve 's to do list (not on the very top, I'm afraid...)

I now have a (configurable) 128-byte ring buffer, so I hope I can read data as fast as it arrives and have enough time to process them.
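
The ring buffer idea itself is language independent. As an illustration only, and not the Forth code linked above, here is a minimal Python sketch of a fixed-size FIFO with separate read and write indices, as an RX interrupt handler and a `key` replacement would share it. All names are hypothetical.

```python
class RingBuffer:
    """Fixed-size FIFO; size must be a power of two so indices wrap with a mask."""

    def __init__(self, size: int = 128):
        assert size & (size - 1) == 0, "size must be a power of two"
        self.buf = bytearray(size)
        self.mask = size - 1
        self.head = 0   # next write position (producer: the RX interrupt)
        self.tail = 0   # next read position (consumer: key / SLIP decoder)

    def put(self, byte: int) -> bool:
        nxt = (self.head + 1) & self.mask
        if nxt == self.tail:
            return False            # full: the byte is dropped
        self.buf[self.head] = byte
        self.head = nxt
        return True

    def get(self):
        if self.head == self.tail:
            return None             # empty
        byte = self.buf[self.tail]
        self.tail = (self.tail + 1) & self.mask
        return byte

rx = RingBuffer(128)
for b in b"\xc0\x02\x00":
    rx.put(b)
print(rx.get(), rx.get(), rx.get(), rx.get())
```

One slot always stays empty so that "full" and "empty" can be told apart, which means a 128-byte buffer holds at most 127 pending bytes.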

uzi18 commented 5 years ago

so add buffering/dma auto fill

wolfgangr commented 5 years ago

nice work, but why FORTH?

Actually I had expected that all this had long been implemented by @jcw . He's been doing a lot of great stuff in forth and for forth. And as far as I can see, ESP-link at least is living at his GitHub premises, and carries his logo, right?

And on the embello github I find credits for both the mecrisp-stellaris wizard Matthias Koch and ESP-link chief maintainer @tve Thorsten von Eicken.

I can only hope that @jcw is not going to make a living with stuff I'm going to publish right now....

wolfgangr commented 5 years ago

so add buffering/dma auto fill

I hope that this is what I have done. I'm going to rethink the timing of input-SLIP processing. In my current code, it's hooked to the REPL interpreter's key wait routine call.

This is a bit of the upside-down, wrong way around, which made it hard to code/understand/debug.

I could hook into the USART-IRQ handler, but I've learned that's bad programming practice - IRQ handlers are supposed to stay as short as possible.

I could use the multitasking framework, but this I'd like to keep open for application development. The SLIP decoder has to handle all user input. If it stalls, the REPL stalls and forth gets unresponsive. I don't want to fiddle with that all the time.

But I think there are 1-ms systick interrupts as well. I think that is a good point to hook SLIP processing to. I think that's also a good occasion to check for protocol errors, buffer overruns and other stuff that might call for an error handler to restore the system's responsiveness - both REPL and MQTT, in case of whatever bad things may happen. Maybe sort of a watchdog, actually...

wolfgangr commented 5 years ago

... watchdog ....

? There is an irq-watchdog available in mecrisp-stellaris. But I think that's STM hardware and supposed to do hard resets - clearing all RAM as well.

When I have REPL or SLIP stalls due to communication quirks, I'd like to keep ram state and just do a simple REPL 'quit'. Clear stack and resume user prompt.

Any erroneous SLIP message might be silently ignored.

Well, maybe it's a good idea to create a syslog entry. Is there a SLIP-command for sending arbitrary debug text from the guest-µC to the syslog server ? I hope so. Have to look. If not, I'll raise an issue... ;-)

wolfgangr commented 5 years ago

Plan:

UART-RX-irq 
   |
serial input buffer (FIFO)
   |
   SLIP-decoder
   |          |
   |     SLIP-Message-Buffer ( n lines x m char )
   |          |
   |     MQTT-Message-Parser
   |          |
   |          +---> MQTT-triggered event-handlers 
   |
command buffer (FIFO)
   |
REPL aka forth outer interpreter aka compiler
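
The SLIP-decoder stage of this plan could look roughly like the following C sketch (C only so that it can be tested on a host; none of the names come from the repository). It uses the standard SLIP byte values from RFC 1055. It assumes that a frame is delimited by an END byte on both sides and that every byte outside a frame belongs to the REPL; whether esp-link always behaves like that is an assumption here, not something confirmed in this thread. Oversized frames are silently dropped.

```c
#include <stdint.h>
#include <stddef.h>

enum { SLIP_END = 0xC0, SLIP_ESC = 0xDB, SLIP_ESC_END = 0xDC, SLIP_ESC_ESC = 0xDD };

#define FRAME_MAX 128   /* assumed size of one SLIP message buffer line */

typedef enum { SLIP_NOTHING, SLIP_CONSOLE_CHAR, SLIP_FRAME_READY } slip_result;

typedef struct {
    int     in_frame;            /* 1 between the opening and closing END */
    int     escaped;             /* 1 right after an ESC byte */
    int     overflow;            /* 1 if the current frame did not fit */
    uint8_t frame[FRAME_MAX];    /* decoded payload of the current frame */
    size_t  len;
} slip_demux;

/* Feed one byte from the serial input FIFO.
   SLIP_CONSOLE_CHAR: c belongs to the REPL command buffer.
   SLIP_FRAME_READY:  s->frame[0 .. s->len-1] holds a complete decoded frame. */
slip_result slip_feed(slip_demux *s, uint8_t c) {
    if (!s->in_frame) {
        if (c != SLIP_END) return SLIP_CONSOLE_CHAR;
        s->in_frame = 1;
        s->escaped = s->overflow = 0;
        s->len = 0;
        return SLIP_NOTHING;
    }
    if (c == SLIP_END) {
        if (s->len == 0) return SLIP_NOTHING;   /* repeated END: still at frame start */
        s->in_frame = 0;
        return s->overflow ? SLIP_NOTHING : SLIP_FRAME_READY;
    }
    if (s->escaped) {
        s->escaped = 0;
        if (c == SLIP_ESC_END)      c = SLIP_END;
        else if (c == SLIP_ESC_ESC) c = SLIP_ESC;
        /* anything else after ESC is a protocol violation; stored unchanged here */
    } else if (c == SLIP_ESC) {
        s->escaped = 1;
        return SLIP_NOTHING;
    }
    if (s->len < FRAME_MAX) s->frame[s->len++] = c;
    else s->overflow = 1;
    return SLIP_NOTHING;
}
```

Returning a result code instead of calling handlers keeps the decoder free of policy: the caller decides whether a console char goes into the command FIFO and whether a ready frame is copied into the message buffer.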

Preliminary considerations regarding buffer size:

1) serial input buffer

2) REPL-FIFO: without SLIP, I did not even realize that I had unbuffered input. So the REPL can read chars at that speed, but presumably it does so into a line buffer. To avoid cross-stalling with MQTT, the REPL FIFO has to store as many chars as may remain unprocessed between two adjacent SLIP frames. I don't know, but given 80 chars as a normal line size, I think 128 bytes will be a good start.

3) SLIP Message buffer

4) MQTT-Task-handler regimes: My simple control switch app used the multitask feature - so we will want to support this. What options do we have?

But there are still too many open questions for a final decision.

wolfgangr commented 5 years ago

Let's start at my example https://github.com/wolfgangr/forthMQTT3pktControl/blob/m014/3pkt.fs#L27 There I've timed two tasks to run periodically (call-every) in the background, independent of each other and of the REPL, which continues to accept user prompts in the foreground:

' pv-status. 2000 1 call-every 
' pv-check pv-loop-delay @ 0 call-every 

Similar tasks may typically be implemented as MQTT events, triggered by some MQTT subscription message instead of a fixed periodic schedule.

So let's dig deeper into the anatomy of the mecrisp-stellaris multitasking framework:

We find that multi-fs is a round-robin cooperative multitasking framework, where any task is coded as running forever, but handing control over to the next task in the list ('suspending') at wait states or some other programmed situations using a pause statement.

timed extends this to single shot tasks which are called by systicked triggers and supposed to finish quickly each call.

multi-irq just adds calls to flag a task to wake or sleep at the next round robin turn, but itself returning immediately - typically to be called in irq-handlers.

To pin down the concepts with as little distracting complexity as possible, we refer to Matthias Koch's "blinky" examples.

All methods have their merits, whose pros and cons are beyond consideration at this point. We'd just like to keep them all open as options for the implementation of MQTT event handlers.

However, all of them require some tiny piece of code, with the message as parameter, that is kicked off when a SLIP message has been parsed successfully.

So my idea was:

The cooperative task mechanisms then require some kind of message passing that might be triggered at the end of a systick routine that completes a message parse run. To avoid redundancy, we may leave its details to any later implementation and provide here just the interrupt-triggered kickoff. Quickly finishing tasks (far below the 1 ms limit) can be implemented right away. Longer or stalling tasks, or tasks that need to maintain state or some larger data areas, may be handed over to one of the other task implementation variants.

The interrupt handler will then only leave the required message chunks and maybe some flag in some kind of message box. Any stalling / blocking / long running routines are forbidden there, as they may end up in system instability. Details can be found in any decent multitasking tutorial.
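
The "message box" idea can be sketched as a one-slot mailbox. This is a C illustration with hypothetical names (`mailbox`, `mailbox_post`, `mailbox_take`, `MSG_MAX`), not code from the mecrisp multitasking files. It assumes one interrupt-side writer and one task-side reader on a single core:

```c
#include <stdint.h>
#include <stdbool.h>
#include <string.h>

#define MSG_MAX 64u   /* assumed upper bound for one message chunk */

typedef struct {
    volatile bool full;          /* the flag the task polls */
    uint16_t      len;
    uint8_t       data[MSG_MAX];
} mailbox;

/* Interrupt side: short and never blocking.
   Returns false (message dropped) if the box is still occupied or msg is too long. */
bool mailbox_post(mailbox *mb, const uint8_t *msg, uint16_t len) {
    if (mb->full || len > MSG_MAX) return false;
    memcpy(mb->data, msg, len);
    mb->len = len;
    mb->full = true;             /* set the flag last, after the data is in place */
    return true;
}

/* Task side: poll once per round-robin turn; out must hold MSG_MAX bytes. */
bool mailbox_take(mailbox *mb, uint8_t *out, uint16_t *len) {
    if (!mb->full) return false;
    memcpy(out, mb->data, mb->len);
    *len = mb->len;
    mb->full = false;            /* hand the slot back to the interrupt side */
    return true;
}
```

The interrupt side drops a message rather than waiting for a free slot, which matches the rule that nothing in the handler may block. A longer queue would follow the same pattern with a ring of such slots.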

A simple one we find here - search for "Multitasking on the Quick"

wolfgangr commented 5 years ago

======================== draft 2019/02/05 08:10:27 ===================

Error handling

Brainstorm what might go wrong and what could be a good idea to do then

wolfgangr commented 5 years ago

so add buffering/dma auto fill

No doubt, DMA would have the great advantage that once it runs, we wouldn't have to care about IRQ, timing and task management interference issues any more at a later point in time.

I see that I get distracted from my first goal quite a lot, but maybe this is the kind of nasty work that simply has to be done...

So I just tried to read into the basic nuts'n bolts of STM32 DMA: we can attach both UART RX and TX to DMA, we have a ring buffer, we have byte-size transfers - all great stuff.

What puzzles me is the first sentence on pg 278: "transfer addresses (in the current internal peripheral/memory address register) are not accessible by software."

That's a sad thing, because that's what would have made a ring buffer implementation quite straightforward.

My dry-swimmer (untested) considerations for workarounds:

For RX, we might use a buffer of size 2^n and let DMA_CNDTRx simply overrun. If we mask this counter by n bits (log2 of buffer size), we get the pointer offset to the last byte received - so we are done.
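
One way to read the RX idea, as a C sketch of the index arithmetic only. It assumes the channel runs in circular mode, where (as far as the STM32F1 reference manual describes it) DMA_CNDTRx counts down the remaining transfers and is reloaded with the programmed size when it reaches zero. The function names and `DMA_RX_SIZE` are made up; the register value is passed in as a parameter so the arithmetic can be checked without hardware:

```c
#include <stdint.h>

#define DMA_RX_SIZE 128u   /* assumed buffer size 2^n, here n = 7 */

/* cndtr: value read from DMA_CNDTRx, i.e. transfers remaining before reload
   (1 .. DMA_RX_SIZE in circular mode). Result: offset DMA will write next. */
static inline uint32_t dma_rx_write_index(uint32_t cndtr) {
    return (DMA_RX_SIZE - cndtr) & (DMA_RX_SIZE - 1u);
}

/* Bytes waiting between our own read index and the DMA write position. */
static inline uint32_t dma_rx_available(uint32_t cndtr, uint32_t read_index) {
    return (dma_rx_write_index(cndtr) - read_index) & (DMA_RX_SIZE - 1u);
}
```

Because the counter runs downwards, it has to be subtracted from the buffer size before masking. This cannot tell an empty buffer from one that has wrapped completely, so the reader has to drain it faster than one full buffer length.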

For TX, we must keep track of a write counter ourselves. That's the writing part of the ring buffer, anyway. DMA_CNDTRx provides us with a "backlog" of bytes still to be transferred.

write-pointer - DMA_CNDTRx = read-pointer (sorry, no RPL...)

We have to increment this value every time we add to the buffer, and it gets decremented by DMA each time the UART is pulling data off.
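
The TX relation above, written out as a C sketch with wrap-around. Names and buffer size are hypothetical, and the backlog is passed in as a plain number instead of being read from DMA_CNDTRx. One caveat, as far as I read the reference manual: DMA_CNDTRx can only be written while the channel is disabled, so "incrementing" it would have to happen between transfers.

```c
#include <stdint.h>

#define DMA_TX_SIZE 128u   /* assumed buffer size, power of two */

/* write_index: next free slot in the TX ring buffer (maintained by software).
   backlog:     bytes queued but not yet sent (what DMA_CNDTRx would report).
   Result:      offset of the oldest byte still waiting to go out. */
static inline uint32_t dma_tx_read_index(uint32_t write_index, uint32_t backlog) {
    return (write_index - backlog) & (DMA_TX_SIZE - 1u);
}

/* Room left for new bytes before the writer would overtake the sender. */
static inline uint32_t dma_tx_free(uint32_t backlog) {
    return DMA_TX_SIZE - backlog;
}
```

For example, with the write index at 2 and 5 bytes of backlog, the oldest unsent byte sits at offset 125, just before the wrap.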

========================

After scanning the STM manual sections for DMA and USART, I think

Essence:

wolfgangr commented 5 years ago

Life gets in the way. Customers are threatening with orders (not for bits'n bytes, different business). Chance for yet another not-yet-finished project under my desk :-\

Refocus:

Top goal: get a working gadget that can run callbacks triggered by the receipt of previously configured MQTT subscription messages

required for basic functionality:

required for quality functionality:

nice add-on features:

wolfgangr commented 5 years ago

IRQ TX

looks good: both RX and TX are handled over IRQ-driven ring buffer: https://github.com/wolfgangr/forthMQTT3pktControl/blob/RX-TX-irq_ulize/uart1-irq.fs

The next step is the 'SLIP-decoder in the input stream'. Above I considered running this once every ms from systick. However, the irq priority of systick is much higher than that of the UART, so I'd block reading during this processing and thus lose chars - bad idea. I could trigger a task in the task framework. But can this block my reading to the REPL? There are pause clauses in the key waits, so this might trigger input processing in some correct sequence. And if a task is misbehaving? Then the REPL will freeze anyway. But if I want to switch to singletask, my SLIP is not processed - bad idea.

Can I include the slip-decoder in the IRQ-handler? This would save one buffer as well and thus precious ram. Do we have enough time? let's see....
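
A rough way to put a number on "enough time" is the count of CPU cycles between two received bytes. The C helper below is only a back-of-the-envelope sketch: the baud rate (115200, 8N1) and the clock frequencies (8 MHz and 72 MHz) used in the note after it are assumptions, not values stated in this thread.

```c
#include <stdint.h>

/* CPU cycles that pass while one character is on the wire.
   bits_per_char: 10 for 8N1 (start bit + 8 data bits + stop bit). */
static inline uint32_t cycles_per_byte(uint32_t cpu_hz, uint32_t baud,
                                       uint32_t bits_per_char) {
    return (uint32_t)(((uint64_t)cpu_hz * bits_per_char) / baud);
}
```

Under those assumptions, `cycles_per_byte(72000000, 115200, 10)` gives 6250 cycles and `cycles_per_byte(8000000, 115200, 10)` gives 694 cycles. That would be the budget for interrupt entry, the SLIP state step and the buffer write together, so the slower clock leaves far less headroom for a Forth-coded handler.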

wolfgangr commented 5 years ago

Drafted some kind of state machine: https://github.com/wolfgangr/forthMQTT3pktControl/blob/e995135c562b4de3467b94137152f6cad1f735c8/slip-handler.fs#L90

Have been there a week ago :-((( https://github.com/jeelabs/esp-link/issues/430#issuecomment-460312871