Open wolfgangr opened 5 years ago
Maybe I have a pointer alignment problem, among many others.
This is no surprise when apps are ported from an 8-bit to a 32-bit platform.
And apparently, the align word in my mecrisp is dysfunctional.
This produced various hard-to-debug effects, among them the infamous
Unhandled Interrupt 00000003 !
crashes (see also https://jeelabs.org/article/1619a/ ).
But I also had apparently simple definitions that did not work, and unresponsive connections between ESP-Link and mecrisp.
grepping through my source trees, I found
flib/spi/rf69ook.fs:113:OOK.RSSI.QLEN 2 rshift 1+ 2 lshift buffer: ook.rssi.q \ don't trust align
So it seems to be a known problem.
aligned seems to work, calign obviously not.
My current workaround is
: myalign here dup aligned swap - allot ;
: align myalign ;
and after possibly uneven c, sequences, I call
calign
myalign
I'll try to figure out whether the problem is rooted in mecrisp or in the embello/flib I include, and will raise an issue in the proper place then.
I hope this finding brings me a bit closer to a successful FORTH-MQTT link. Nevertheless, test sequences and/or API documentation for the SLIP-MQTT interface would still be appreciated. I'd offer to write up some nice text once I get the proper information and assistance to do so.
just replaced my reset-wire by a diode as outlined here https://jeelabs.org/article/1619a/ and can do proper soft resets again. Nice :-)
This is not the primary cause of the issue at hand, just an aggravation of the symptoms, but it may help to track down the cause of unplanned soft resets. At least I can read the last console output just before a crash.
Hacked a simple string lib to proceed. I would prefer to test and polish it before I publish it.
Making progress - hitting visible walls already.
I managed to get a message parsed by ESP-Link as supposed-to-be SLIP. At least I can see a related message in the web log. Maybe the MQTT stuff is not yet OK, but first I have to get the CRC right. This is my send string buffer as visible in FORTH. The first two bytes represent the counter, the next two the max length of the allocated string. So, the data start at ....085C
SLIP-message stringbuf-dump
Start at 20000858
Last byte is 089d, containing C0 aka SLIP_END
|-pos-|-len-|-data->
20000850 FF F7 33 FF 00 BD 00 00 42 00 80 00 00 00 00 01 ..3..... B.......
20000860 00 00 00 89 C0 00 00 00 0B 00 05 00 00 00 00 00 ........ ........
20000870 14 00 68 6F 6D 65 2F 62 61 73 65 6D 65 6E 74 2F ..home/b asement/
20000880 77 61 73 00 00 00 02 00 6F 6E 00 00 02 00 02 00 was..... on......
20000890 00 00 01 00 00 00 00 00 01 00 00 43 F4 C0 00 00 ........ ...C....
200008A0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ........ ........
This is what I read from the ESP-Link log. The start and the count suggest that only the bytes between the C0 aka SLIP_END are counted. However, the ends of the messages differ, even in size.
16530> SLIP: start or end len=9 inpkt=0
916531> SLIP: start or end len=56 inpkt=1
916531> SLIP: bad CRC, crc=b6dc rcv=f443 len=56
916531> \00\00\00\0B\00\05\00\00\00\00\00\14\00home/basement/was\00\00\00\02\00on\00\00\02\00\02\00\00\00\01\00\00\00\00\00\01\00\00C\F4
How is the CRC to be generated? I compared my hackaday example code with the source and identified this stanza: https://github.com/jeelabs/esp-link/blob/fe4f565fe83f05e402cc8d8ca3ceefbc39692a1f/serial/crc16.c
unsigned short
crc16_add(unsigned char b, unsigned short acc)
{
  acc ^= b;
  acc = (acc >> 8) | (acc << 8);
  acc ^= (acc & 0xff00) << 4;
  acc ^= (acc >> 8) >> 4;
  acc ^= (acc & 0xff00) >> 5;
  return acc;
}
this is my implementation in forth:
: crc+ ( old_running_sum new_byte -- new_running_sum )
xor
dup 8 rshift swap 8 lshift or
$ffff and
dup $ff00 and 4 lshift xor
$ffff and
dup 8 rshift 4 rshift xor
dup $ff00 and 5 rshift xor
;
Compared to the template, I added two $ffff and masks to keep the calculation within 16-bit width.
I suppose that unsigned short in C is supposed to be 16 bits as well, OK?
What else can be wrong?
I don't find any initialisation in the C code, so I assume the accumulator starts at zero.
I found a CRC calculator on the web, but I could not match their algorithm descriptions.
https://crccalc.com/?crc=0+1&method=ascii
Any pointer?
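For cross-checking a byte-wise updater, a minimal self-contained C sketch of the crc16_add step quoted above, plus a loop over a buffer, may help. The helper name crc16_buf and the start value passed to it are assumptions (the thread only guesses that the accumulator starts at zero); uint16_t is used instead of unsigned short so that the 16-bit width is explicit.

```c
#include <stddef.h>
#include <stdint.h>

/* One update step, same operations as the crc16_add quoted above.
 * Every result is cast back to uint16_t, which plays the role of the
 * "$ffff and" masks in the Forth version. */
static uint16_t crc16_add(uint8_t b, uint16_t acc)
{
    acc ^= b;
    acc = (uint16_t)((acc >> 8) | (acc << 8));
    acc ^= (uint16_t)((acc & 0xff00) << 4);
    acc ^= (uint16_t)((acc >> 8) >> 4);
    acc ^= (uint16_t)((acc & 0xff00) >> 5);
    return acc;
}

/* Hypothetical helper: run the update over len bytes, starting from init.
 * It feeds the byte VALUES (data[i]), not their addresses. */
static uint16_t crc16_buf(const uint8_t *data, size_t len, uint16_t init)
{
    uint16_t acc = init;
    for (size_t i = 0; i < len; i++)
        acc = crc16_add(data[i], acc);
    return acc;
}
```

Single steps that are easy to verify by hand: crc16_add(0x01, 0) gives 0x1189 and crc16_add(0x80, 0) gives 0x8408. These look like entries of the reflected CCITT table (polynomial 0x8408), so a "reflected CCITT, init 0" preset in an online calculator is probably the one to compare against; that mapping is an inference, not something confirmed in this thread.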
Could it be a serial communication issue? I ran the examples at 460800; switching back to 115200 ruined my test example, so I do not have exactly the same strings. But what puzzles me is that the tails of the strings are echoed differently in the two cases I list below.
01 00 00 43 F4 C0
rcv=f443
\01\00\00C\F4
01 00 00 A3 C4 C0
rcv=c4a3
\01\00\00\A3\C4
Could the CRC computation be affected by different endianness on the ESP vs. the ARM? One is big, the other is little. See https://www.reddit.com/r/esp8266/comments/4ve8b3/what_is_the_endianess_of_the_esp8266/ and https://electronics.stackexchange.com/questions/183021/stm32f103c8xx-big-or-small-endian
On Tue, Jan 29, 2019 at 10:18 AM wolfgangr wrote:
last 6 bytes sent | CRC reported as bad by ESP-Link log | last bytes as echoed by ESP-Link log
01 00 00 43 F4 C0 | rcv=f443 | \01\00\00C\F4
01 00 00 A3 C4 C0 | rcv=c4a3 | \01\00\00\A3\C4
But obviously, the CRC arrived correctly at the ESP-Link, so this would not point to a communication issue.
I'm just thinking about endianness regarding the last step, putting the CRC result into the message string. But then I might just get two bytes swapped, which is easy to spot.
Both the C and my FORTH code are implemented in integer math. Shouldn't assembly-level bare-metal problems be covered by the compilers? The links you pointed me to both refer to C compilers and the problem of some endianness macro not being included correctly. So either my FORTH or ESP-Link may be compiled wrong and nobody noticed? I'll try to figure that out ....
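To illustrate why host endianness should not matter for this last step, here is a small C sketch that stores and reads a 16-bit value low byte first using only shifts and masks. The function names are made up for the example. The byte order is taken from the log above, where the sent bytes 43 F4 are reported as rcv=f443.

```c
#include <stdint.h>

/* Store a 16-bit value low byte first. Shifts and masks work on the VALUE,
 * so the bytes come out the same on little- and big-endian hosts. */
static void put_u16_le(uint8_t *dst, uint16_t v)
{
    dst[0] = (uint8_t)(v & 0xff);
    dst[1] = (uint8_t)(v >> 8);
}

/* Read it back the same way. */
static uint16_t get_u16_le(const uint8_t *src)
{
    return (uint16_t)(src[0] | (src[1] << 8));
}
```

Endianness only becomes visible when a multi-byte value is copied to memory as a whole (for example with a 16-bit store into the string buffer) and then read back byte by byte.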
The forth side is easy to examine:
just a little loop, shifting a bit through a 32-bit number.
I used the same lshift word as in the crc implementation above
https://github.com/jeelabs/esp-link/issues/430#issuecomment-458353545
I think it behaves as expected: no jumping bits, the bit just falls off at the left end.
: shifter cr 1 40 1 DO i . 1 lshift dup hex. loop cr ; Redefine shifter. ok.
ok.
shifter
1 00000002 2 00000004 3 00000008 4 00000010 5 00000020 6 00000040 7 00000080 8 00000100
9 00000200 10 00000400 11 00000800 12 00001000 13 00002000 14 00004000 15 00008000 16 00010000
17 00020000 18 00040000 19 00080000 20 00100000 21 00200000 22 00400000 23 00800000 24 01000000
25 02000000 26 04000000 27 08000000 28 10000000 29 20000000 30 40000000 31 80000000 32 00000000
33 00000000 34 00000000 35 00000000 36 00000000 37 00000000 38 00000000 39 00000000
How can I inject a test code into ESP-Link implementation? I'd like to see some online CRC calculator like this one to compare https://crccalc.com/?crc=0+1&method=ascii
Are there examples of just one crc step, so that I could cross-check my incremental bytewise updater? Hm, looks like I have to dig out a C hello world and paste the code in? But then I would just explore the C build environment of my workstation.
I think I'll try to untangle the SLIP wrapper layer from the MQTT inside. Just a test string with a few bytes inside, easy to compare.
But for the moment, life gets in the way....
If you're not familiar with the ESP development tools, you can compile an Arduino sketch for the ESP8266 and print the intermediate CRC results incrementally.
Your forth CRC code looks correct to me. You need to make sure that the new_byte parameter really only has 8 bits.
Sadly I inherited a lot of that low-level stuff from some other project, so it's not documented, grr..
In terms of slip, this is where it happens: https://github.com/jeelabs/esp-link/blob/master/serial/slip.c
Each packet starts and ends with a SLIP_END (0xC0). The last two bytes are the computed CRC value, which is calculated over all the bytes except the last two. Note that any 0xC0 or 0xDB value needs to be escaped on the wire and is unescaped before calculating the checksum.
A packet consists of a 16-bit command followed by a 16-bit argument count and a 32-bit "callback value" (really an opaque token). This is all little endian.
Your best bet is probably to start with the SYNC command whose code is 0x0001 and which takes one arg in the value field, which must be non-zero, e.g., { 0xC0, 0x01, 0x00, 0x01, 0x00, 0x11, 0x11, 0x11, 0x11, crc-lo, crc-hi, 0xC0 }
You should get a response back { 0xC0, 0x02, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, crc-lo, crc-hi, 0xC0 }
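As a sketch of how the SYNC example above could be assembled, the following C code writes the 8-byte header little-endian and appends the CRC low byte first. build_header_frame is a hypothetical name; the CRC start value of 0 is an assumption, and the 0xC0 delimiters and byte escaping are deliberately left out so that the CRC covers the unescaped bytes.

```c
#include <stddef.h>
#include <stdint.h>

/* Same update step as serial/crc16.c quoted earlier in this thread. */
static uint16_t crc16_add(uint8_t b, uint16_t acc)
{
    acc ^= b;
    acc = (uint16_t)((acc >> 8) | (acc << 8));
    acc ^= (uint16_t)((acc & 0xff00) << 4);
    acc ^= (uint16_t)((acc >> 8) >> 4);
    acc ^= (uint16_t)((acc & 0xff00) >> 5);
    return acc;
}

/* Hypothetical helper: header (cmd, argc, value) plus CRC, 10 bytes in all.
 * Not included: the 0xC0 frame delimiters and escaping of 0xC0 / 0xDB. */
static size_t build_header_frame(uint8_t *out, uint16_t cmd,
                                 uint16_t argc, uint32_t value)
{
    size_t n = 0;
    out[n++] = (uint8_t)(cmd & 0xff);
    out[n++] = (uint8_t)(cmd >> 8);
    out[n++] = (uint8_t)(argc & 0xff);
    out[n++] = (uint8_t)(argc >> 8);
    for (int shift = 0; shift < 32; shift += 8)
        out[n++] = (uint8_t)(value >> shift);

    uint16_t crc = 0;                       /* assumed start value */
    for (size_t i = 0; i < n; i++)
        crc = crc16_add(out[i], crc);
    out[n++] = (uint8_t)(crc & 0xff);       /* crc-lo */
    out[n++] = (uint8_t)(crc >> 8);         /* crc-hi */
    return n;
}
```

Calling build_header_frame(buf, 0x0001, 0x0001, 0x11111111) yields 01 00 01 00 11 11 11 11 followed by the two CRC bytes; escaping and the surrounding 0xC0 bytes would be applied when the bytes go out on the wire.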
I found my error.
Silly oversight in the caller to the CRC:
What I wanted: DO I c@ crc+ LOOP
What I did: DO I crc+ LOOP
... in a loop cycling over string addresses. So I calculated the CRC of the addresses, not the content.
Now I got the message:
99215> SLIP: start or end len=9 inpkt=0
99215> SLIP: start or end len=56 inpkt=1
99216> cmdParsePacket: cmd=0 argc=2816 value=1280
99216> cmdExec: Dispatching cmd=NULL
I take this to mean that I have got the SLIP thing right and can proceed with MQTT now.
To be sure, I changed the endianness of the CRC I append and got:
SLIP: bad CRC, crc=b6dc rcv=dcb6 len=56
Lesson learned: screw your CRC if you need to debug the SLIP content :-O ;-)
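The cmd=0 argc=2816 value=1280 line can be decoded by hand. A minimal C sketch, assuming the header layout tve describes (16-bit cmd, 16-bit argc, 32-bit value, all little-endian); the Header struct and parse_header are illustrative names, not esp-link code:

```c
#include <stdint.h>

typedef struct {
    uint16_t cmd;
    uint16_t argc;
    uint32_t value;
} Header;

/* Decode the 8-byte packet header, all fields little-endian. */
static Header parse_header(const uint8_t *p)
{
    Header h;
    h.cmd   = (uint16_t)(p[0] | (p[1] << 8));
    h.argc  = (uint16_t)(p[2] | (p[3] << 8));
    h.value = (uint32_t)p[4] | ((uint32_t)p[5] << 8) |
              ((uint32_t)p[6] << 16) | ((uint32_t)p[7] << 24);
    return h;
}
```

Fed with 00 00 00 0B 00 05 00 00, the first eight payload bytes echoed in the earlier ESP-Link log, this gives cmd=0, argc=0x0B00=2816, value=0x0500=1280, which is what the log reports. That would suggest the CRC now passes but three stray zero bytes sit in front of the real header; with 0B 00 05 00 00 00 00 00 the same function gives cmd=11, argc=5, value=0.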
Making progress: first successful MQTT :-) This is what I send (first byte at ...0934, last byte at ...0975):
20000930 42 00 80 00 01 00 00 00 89 02 00 00 C0 0B 00 05 B....... ........
20000940 00 00 00 00 00 14 00 68 6F 6D 65 2F 62 61 73 65 .......h ome/base
20000950 6D 65 6E 74 2F 77 61 73 68 65 72 02 00 6F 6E 00 ment/was her..on.
20000960 00 02 00 02 00 00 00 01 00 00 00 00 00 01 00 00 ........ ........
20000970 00 00 00 C1 7E C0 00 00 00 00 00 00 00 00 00 00 ....~... ........
ESP-link debug log: As we see, there is another MQTT pub, starting at 860810 sending the esp-link status
855163> SLIP: start or end len=9 inpkt=0
855164> SLIP: start or end len=56 inpkt=1
855164> cmdParsePacket: cmd=11 argc=5 value=0
855164> cmdExec: Dispatching cmd=MQTT_PUB
855164> MQTT: MQTTCMD_Publish topic=home/basement/washer, data_len=2, qos=0, retain=0
855164> MQTT: Publish, topic: "home/basement/washer", length: 26
855165> MQTT: Send type=PUBLISH id=0000 len=26
860810> MQTT: Publish, topic: "esp-link", length: 45
860810> MQTT: Send type=PUBLISH id=00A9 len=45
860813> MQTT: Recv type=PUBACK id=04A9 len=4; Pend type=PUBLISH id=4A9
and this is my console running :~$ mosquitto_sub -h 192.168.X.Y -v -t '#'
both my test string and the esp-link status:
home/basement/washer on
esp-link {"rssi":-60, "heap_free":20008}
@tve
so it's not documented, grr..
That's a common issue in the FOSS world. I respect that if people give away their work for free, they don't feel obliged to do boring stuff ;-)
So let me add my 3 cents here. (Or should I better start a Wiki page ... ?)
A packet consists of a 16-bit command followed by a 16-bit argument count and a 32-bit "callback value" (really an opaque token). This is all little endian. Your best bet is probably to start with the SYNC command whose code is 0x0001 and which takes one arg in the value field, which must be non-zero, e.g., { 0xC0, 0x01, 0x00, 0x01, 0x00, 0x11, 0x11, 0x11, 0x11, crc-lo, crc-hi, 0xC0 }
so this breaks up as follows?
END 0xC0,
SYNC command 0x0001 ~~> in little endian = 0x01, 0x00,
arbitrary non-zero dummy arg 0x0001 ~~> 0x01, 0x00,
arbitrary dummy callback value 0x11, 0x11, 0x11, 0x11,
checksum crc-lo, crc-hi,
END 0xC0
You should get a response back { 0xC0, 0x02, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, crc-lo, crc-hi, 0xC0 }
Have not yet screwed my forth interpreter to process responses coming along the same line I use to talk to it. But should not be that difficult, I expect.
What I see at the console is a �
which I can expand to 0xEF at my forth console
Presumably from the checksum?
When I process the answer, is it OK to consider anything enclosed between two C0 as SLIP (well, if not escaped...)? Can I rely on any SLIP packet sitting on one single line? Or do line breaks lose their meaning between C0?
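One way to approach the receive side is a byte-wise state machine that collects everything between two 0xC0 bytes and undoes the escaping. The sketch below is C and uses the escape constants from the el-client header quoted later in this thread; SlipRx, slip_rx_byte and the 128-byte buffer are assumptions for illustration, not esp-link code.

```c
#include <stddef.h>
#include <stdint.h>

#define SLIP_END     0xC0
#define SLIP_ESC     0xDB
#define SLIP_ESC_END 0xDC
#define SLIP_ESC_ESC 0xDD

typedef struct {
    uint8_t buf[128];   /* unescaped payload collected so far */
    size_t  len;
    int     esc;        /* non-zero if the previous byte was SLIP_ESC */
} SlipRx;

/* Feed one received byte. Returns the payload length when a SLIP_END closes
 * a non-empty run of bytes (payload is then in rx->buf), otherwise 0. */
static size_t slip_rx_byte(SlipRx *rx, uint8_t b)
{
    if (b == SLIP_END) {
        size_t n = rx->len;
        rx->len = 0;
        rx->esc = 0;
        return n;
    }
    if (rx->esc) {
        rx->esc = 0;
        if (b == SLIP_ESC_END)      b = SLIP_END;
        else if (b == SLIP_ESC_ESC) b = SLIP_ESC;
    } else if (b == SLIP_ESC) {
        rx->esc = 1;
        return 0;
    }
    if (rx->len < sizeof rx->buf)   /* silently drop overlong input */
        rx->buf[rx->len++] = b;
    return 0;
}
```

In this sketch a line break is just another payload byte. Ordinary console text arriving between frames would also be handed back at the next 0xC0, so the caller still has to check the length and the CRC before trusting a frame.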
Note that any 0xC0 or 0xDB value needs to be escaped on the wire and is unescaped before calculating the checksum.
Good to know.
So this should go into the routine which assembles the command string.
If I did it in the definition of literals, it would be in the checksum and thus screw it.
I found $DB constant SLIP_ESC compiled from my inherited template.
So this is the escape character?
Is this the correct way to do it then?
0xDB -> 0xDB 0xDB
0xC0 -> 0xDB 0xC0
And I suppose I have to remove it from responses as well, OK?
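Going by the el-client constants quoted later in this thread (SLIP_ESC_END = 0xDC, SLIP_ESC_ESC = 0xDD), the mapping would rather be 0xC0 -> 0xDB 0xDC and 0xDB -> 0xDB 0xDD. A minimal C sketch of the sending side under that assumption; slip_encode is a hypothetical helper, and the input is expected to already contain the payload plus its CRC:

```c
#include <stddef.h>
#include <stdint.h>

#define SLIP_END     0xC0
#define SLIP_ESC     0xDB
#define SLIP_ESC_END 0xDC
#define SLIP_ESC_ESC 0xDD

/* Wrap len bytes in SLIP_END delimiters and escape END/ESC bytes.
 * Returns the number of bytes written to out, or 0 if cap is too small. */
static size_t slip_encode(const uint8_t *in, size_t len,
                          uint8_t *out, size_t cap)
{
    size_t n = 0;
    if (cap < 2)
        return 0;
    out[n++] = SLIP_END;
    for (size_t i = 0; i < len; i++) {
        uint8_t b = in[i];
        if (b == SLIP_END || b == SLIP_ESC) {
            if (n + 2 >= cap)           /* keep room for the closing END */
                return 0;
            out[n++] = SLIP_ESC;
            out[n++] = (b == SLIP_END) ? SLIP_ESC_END : SLIP_ESC_ESC;
        } else {
            if (n + 1 >= cap)
                return 0;
            out[n++] = b;
        }
    }
    out[n++] = SLIP_END;
    return n;
}
```

Because the escaping is applied after the CRC has been computed and appended, it stays out of the checksum, as tve describes. The payload C0 DB 01, for example, goes out as C0 DB DC DB DD 01 C0.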
I'd like to keep things simple. Can I avoid non-printable characters altogether?
OK, next let's dissect my first successful washer test string.
This is how it looks in memory:
20000930 42 00 80 00 01 00 00 00 89 02 00 00 C0 0B 00 05 B....... ........
20000940 00 00 00 00 00 14 00 68 6F 6D 65 2F 62 61 73 65 .......h ome/base
20000950 6D 65 6E 74 2F 77 61 73 68 65 72 02 00 6F 6E 00 ment/was her..on.
20000960 00 02 00 02 00 00 00 01 00 00 00 00 00 01 00 00 ........ ........
20000970 00 00 00 C1 7E C0 00 00 00 00 00 00 00 00 00 00 ....~... ........
Ouh, should I push my git to github to refer to the code that did this? anyway...
My code template says 'Command to sync up the esp-link'
01 00 00 00 89 02 00 00
and calls it 'sync'.
Is it the sync you referred to? It's not enclosed in "END", and there is no CRC?
Or is it just bullshit and silently ignored?
C0 according to the Wikipedia article on SLIP stands for 'frame end'.
Is this required to start the following MQTT, or to close the (maybe broken) sync before?
0B 00 05 00 00 00 00 00
is called mqtt.pramble in my template, commented as '5 arguments, no callback'. So expanding on your explanation, this is actually an ESP-link SLIP command to switch it to MQTT mode now, not part of the MQTT standard. Correct?
I see 0x000B as command and 0x0005 as argument(s?).
Is it an argument, or a counter of such?
If an argument, what's the meaning?
If a counter, what is counted then?
Next is a string literal, consisting of a 16-bit length 0x0014 = dec 20 and a 20-byte-long home/basement/washer no-frills ASCII string.
I think the slashes refer to MQTT subtopics, but the mosquitto manual is still on my to-be-read list.
14 00
68 6F 6D 65 2F 62 61 73 65 6D 65 6E 74 2F 77 61 73 68 65 72
The next one 02 00 6F 6E 00 00 02 00 02 00 00 00
I just copied from the template, which I paste here including the comments there:
: message.on
02 00 6f 6e 00 00 \ "on"
02 00 02 00 00 00 \ len of data
;
There is another one (not used in my first example). I see the string format is similar to the above. Is 16-bit padding required by the MQTT protocol? And I see the third byte in the second line increased. Is it a copy of the string len?
: message.off
03 00 6f 66 66 00 \ "off"
02 00 03 00 00 00 \ len of data
;
The last one I simply copied without any clue.
Well, I can google MQTT & QoS & retain, of course.
01 00 00 00 00 00 01 00 00 00 00 00
: qos.and.retain
01 00 00 00 00 00 \ qos = 0
01 00 00 00 00 00 \ retain = 0
;
I really would appreciate some simple manual on the standard. The standard itself in my eyes is just a 100 page collection of might and could. Where can I find answers such as regarding the assembly of those strings?
C1 7E is the CRC checksum (0x7ec1) calculated as discussed above, in little-endian notation.
C0 is the frame end token again.
Ouh, should I push my git to github to refer to the code that did this? anyway...
voila: https://github.com/wolfgangr/forthMQTT3pktControl
preliminary hack of a buffered string library https://github.com/wolfgangr/forthMQTT3pktControl/blob/master/stringbuf.fs
the constants I referred to: https://github.com/wolfgangr/forthMQTT3pktControl/blob/master/mmq-const.fs
the hot spot - work in progress while I write in this issue https://github.com/wolfgangr/forthMQTT3pktControl/blob/master/mmq-tools.fs
@wolfgangr nice work, but why FORTH? @tve any hints for this implementation?
why FORTH?
FORTH provides - to my knowledge - the largest spread between extremes on a couple of dimensions:
So on this question you could blow every thread, and there are enough of these discussions around.
Back to start again? Well, hitting a different wall now. I wrote some words to customize the contents (topic, value). It looks basically the same as the working one above, just with different string lengths.
20000BA0 70 47 00 00 44 00 80 00 01 00 00 00 89 02 00 00 pG..D... ........
20000BB0 C0 0B 00 05 00 00 00 00 00 15 00 68 65 61 74 69 ........ ...heati
20000BC0 6E 67 2F 70 72 65 73 73 75 72 65 2F 70 75 6D 70 ng/press ure/pump
20000BD0 00 02 00 6F 6E 00 00 02 00 02 00 00 00 01 00 00 ...on... ........
20000BE0 00 00 00 01 00 00 00 00 00 DC AE C0 00 00 00 00 ........ ........
20000BF0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ........ ........
But does not work. No message received on the MQTT client. The ESP-Link console:
804117> SLIP: start or end len=8 inpkt=0
804117> SLIP: start or end len=58 inpkt=1
804117> cmdParsePacket: cmd=11 argc=5 value=0
804117> cmdExec: Dispatching cmd=MQTT_PUB
804118> E:M 28280
So obviously the command got processed by the SLIP layer and handed over to the MQTT_PUB layer.
But then it threw some kind of an error, E:M 28280.
Looks like a long night of RTFS....
For reference, again the log in the case of success
855164> cmdParsePacket: cmd=11 argc=5 value=0
855164> cmdExec: Dispatching cmd=MQTT_PUB
MQTT: MQTTCMD_Publish topic=home/basement/washer, data_len=2, qos=0, retain=0
... and of current fail:
804117> cmdParsePacket: cmd=11 argc=5 value=0
804117> cmdExec: Dispatching cmd=MQTT_PUB
804118> E:M 28280
we have a https://github.com/jeelabs/esp-link/blob/v2.2.3/mqtt/mqtt_cmd.c line 100:
void ICACHE_FLASH_ATTR
MQTTCMD_Publish(CmdPacket *cmd) {
line 140:
DBG("MQTT: MQTTCMD_Publish topic=%s, data_len=%d, qos=%d, retain=%d\n",
topic, data_len, qos, retain);
MQTT_Publish(client, (char*)topic, (char*)data, data_len, qos%3, retain&1);
So I think this stanza generates the successful debug message and just after calls the handover to the MQTT server.
However, I can't see where the fail message E:M 28280 is assembled.
Looks like I have to search one wrapping layer above.
@tve
Sadly I inherited a lot of that low-level stuff from some other project, so it's not documented,
Is there any hope that this other project has some docu we could build upon?
I see this (C) https://github.com/jeelabs/esp-link/blob/v2.2.3/mqtt/mqtt_msg.h
Copyright (c) 2014, Stephen Robinson
but it only refers to the outgoing calls (ESP-link -> MQTT server). No (C) in the serial parsers.
--- edit --- https://github.com/jeelabs/esp-link/blob/v2.2.3/mqtt/mqtt.c
* Protocol: http://docs.oasis-open.org/mqtt/mqtt/v3.1.1/os/mqtt-v3.1.1-os.html
* Copyright (c) 2014-2015, Tuan PM <tuanpm at live dot com>
* Modified by Thorsten von Eicken to make it fully callback based
OK? I found the correct standard aka haystack and just have to look for the needle...
https://github.com/jeelabs/esp-link/blob/v2.2.3/mqtt/mqtt_cmd.c contains some checks. I can't identify the current point of failure, but I keep them in the log as relevant details for the whole picture:
105: if (cmdGetArgc(&req) != 5) return;
So the argc param always has to be 5, no need to implement variations on that.
113: if (len > 128) return; /
max size of topic + data + decoration; we may mirror this limit on the client side
120: uint8_t *data = (uint8_t*)os_zalloc(len+1); .... if (!data)
may I refer to that as 'out of memory error'
my ESP-link keeps sending mqtt state messages like
esp-link {"rssi":-56, "heap_free":20384}
So I can't see a reason why we can't allocate 128 bytes in 20 k of RAM.
I see no premature returns between this last check and the successful debug message.
We find repeated calls to cmdPopArg(
referring to some of the parameters displayed in the DBG("MQTT: MQTTCMD_Publish
Is there some backdoor error handler in cmdPopArg that might override our flow of command?
Is there some backdoor error handler in cmdPopArg that might override our flow of command?
// Copy next arg from request into the data pointer, returns 0 on success, -1 on error
https://github.com/jeelabs/esp-link/blob/fe4f565fe83f05e402cc8d8ca3ceefbc39692a1f/cmd/cmd.h#L120
https://github.com/jeelabs/esp-link/blob/fe4f565fe83f05e402cc8d8ca3ceefbc39692a1f/cmd/cmd.c#L159
I don't see such backdoor.
There is an os_memcpy, but I suppose this cannot fail on allocated buffers.
It just returns 0 (success) or -1 (not enough args??), so it has to be checked by the caller.
https://github.com/jeelabs/el-client
A prelimenary documentation for the library is available on ELClient API Doc.
404: Page not found. The Wayback Machine just shows an empty frame.
It refers to doxygen. So the docs live in here? https://github.com/jeelabs/el-client/blob/master/ELClient/ELClientMqtt.cpp
Hm... a first glance tells me that this is a layer beyond what I'm looking for. I don't want to call those C functions, but mimic their behaviour (maybe a tiny subset thereof).
@tve I think this is exactly what I'm trying to reinvent here. Are you aware of any other copy of this preliminary API doc? Maybe somewhere on your personal storage?
OK - hope disappointed - back to track.
who is printing E:M ? github says:
We couldn’t find any code matching 'E:M' in jeelabs/esp-link
who is calling MQTTCMD_Publish ? https://github.com/jeelabs/esp-link/blob/fe4f565fe83f05e402cc8d8ca3ceefbc39692a1f/cmd/handlers.c#L74
In version 2.2.3 this line sits at 41, i.e. 33 lines earlier. Looks like a major rewrite. Does this relate to my question? Is it OK to continue my work on stable v2.2.3?
Thorsten, @tve could you please provide a quick answer on that?
'==============
this is a // Command dispatch table.
So, who is processing that?
line 76: (in v2.2.3) : cmdGetCbByName(char* name)
which I read as get Callback by Name
no other hit in the git search for MQTT_PUB
... hm ... puzzling?
well, you don't call by strings within C, do you?
we can search for the integer index CMD_MQTT_PUBLISH on our command table as well .... and find.... https://github.com/jeelabs/esp-link/blob/fe4f565fe83f05e402cc8d8ca3ceefbc39692a1f/cmd/cmd.h#L49
https://github.com/jeelabs/esp-link/blob/v2.2.3/cmd/cmd.h distill the context:
typedef enum {
.....
CMD_MQTT_PUBLISH, // publish a message
.....
} CmdName;
https://github.com/jeelabs/esp-link/blob/v2.2.3/cmd/cmd.c#L83
static void ICACHE_FLASH_ATTR
cmdExec(const CmdList *scp, CmdPacket *packet) {
// Iterate through the command table and call the appropriate function
while (scp->sc_function != NULL) {
if(scp->sc_name == packet->cmd) {
DBG("cmdExec: Dispatching cmd=%s\n", scp->sc_text);
// call command function
scp->sc_function(packet);
return;
}
scp++;
}
DBG("cmdExec: cmd=%d not found\n", packet->cmd);
}
Let's recall our log:
804117> cmdExec: Dispatching cmd=MQTT_PUB
804118> E:M 28280
OK, so we have found the point BEFORE the error occurred.
Next, look into scp->sc_function(packet);
we have the function argument cmdExec(const CmdList *scp, ...
defined here
https://github.com/jeelabs/esp-link/blob/v2.2.3/cmd/cmd.h#L55
typedef struct {
CmdName sc_name; // name as CmdName enum
char *sc_text; // name as string
cmdfunc_t sc_function; // pointer to function
} CmdList;
hm .... Looping closed and overlooked the point?
Looping closed and overlooked the point?
OK, down again the rabbit hole and collect the crumbs on the way.
https://github.com/jeelabs/esp-link/blob/v2.2.3/cmd/handlers.c#L41
{CMD_MQTT_PUBLISH, "MQTT_PUB", MQTTCMD_Publish},
https://github.com/jeelabs/esp-link/blob/v2.2.3/cmd/cmd.c#L89
DBG("cmdExec: Dispatching cmd=%s\n", scp->sc_text);
// call command function
scp->sc_function(packet);
This matches our last valid debug print, so we know for sure (+-) that MQTTCMD_Publish is called on our packet.
So somewhere between
https://github.com/jeelabs/esp-link/blob/v2.2.3/mqtt/mqtt_cmd.c#L101
MQTTCMD_Publish(CmdPacket *cmd) {
and ( the debug print in success case we are missing here)
https://github.com/jeelabs/esp-link/blob/v2.2.3/mqtt/mqtt_cmd.c#L140
DBG("MQTT: MQTTCMD_Publish topic=%s, data_len=%d, qos=%d, retain=%d\n",
our problem must occur.
E:M means out of memory for alloc
need to find some readings on forth for uC. any link?
E:M means out of memory for alloc
thanks :-) so I look for alloc and double check the validity of its arguments?
readings on forth for uC. any link?
Many of them. start here: https://github.com/wolfgangr/forthMQTT3pktControl/tree/master Mecrisp I think is a good start, if your platform can handle it. Smaller µC may be limited to sth like eforth, but I haven't tried https://hackaday.com/2017/01/27/forth-the-hackers-language/ https://hackaday.io/project/16097-eforth-for-cheap-stm8s-gadgets
https://github.com/jeelabs/esp-link/blob/v2.2.3/cmd/cmd.h#L27
typedef struct {
CmdPacket *cmd; // command packet header
uint32_t arg_num; // number of args parsed
uint8_t *arg_ptr; // pointer to ??
} CmdRequest;
just assignments, no alloc:
cmdRequest(&req, cmd);
if (cmdGetArgc(&req) != 5) return;
len = cmdArgLen(&req);
can this alloc fail?
if (len > 128) return; // safety check
uint8_t* topic = (uint8_t*)os_zalloc(len + 1);
Memory completely full? len is uint16_t, so there should not be a problem with negative numbers.
cmdPopArg does not alloc either, just os_memcpy
https://github.com/jeelabs/esp-link/blob/v2.2.3/cmd/cmd.c#L157
So we conclude that alloc fails on 128 bytes. Which means that maybe our current command is not the culprit, but some nasty stuff that happened earlier? Or a bug in ESP-Link? Should I try a later version?
OK, let's compare the debug messages again: in the faulty case I see the first len=8 instead of len=9
855163> SLIP: start or end len=9 inpkt=0
855164> SLIP: start or end len=56 inpkt=1
855164> cmdParsePacket: cmd=11 argc=5 value=0
855164> cmdExec: Dispatching cmd=MQTT_PUB
855164> MQTT: MQTTCMD_Publish topic=home/basement/washer, data_len=2, qos=0, retain=0
804117> SLIP: start or end len=8 inpkt=0
804117> SLIP: start or end len=58 inpkt=1
804117> cmdParsePacket: cmd=11 argc=5 value=0
804117> cmdExec: Dispatching cmd=MQTT_PUB
804118> E:M 28280
Does this refer to the sync part? This is weird, since I send the same 8 bytes, closed by 0xC0, in both cases.
why do you refer to v2.2.3, as we have 3.2.2x available?
The len issue does not appear to be reproducible.
I can't see any trace of a memory problem in the status message:
esp-link {"rssi":-59, "heap_free":19824}
home/basement/washer off
[ .... unsuccessful try with E:M message in log ..... ]
esp-link {"rssi":-62, "heap_free":19824}
let's compare the test strings byte by byte (top = good, bottom = bad)
01 00 00 00 89 02 00 00 C0 0B 00 05 00 00 00 00 00 14 00 68 6F 6D 65 2F 62 61 73 65 6D 65 6E 74 2F 77 61 73 68 65 72 03 00 6F 66 66 00 02 00 03 00 00 00 01 00 00 00 00 00 01 00 00 00 00 00 8F 9E C0
01 00 00 00 89 02 00 00 C0 0B 00 05 00 00 00 00 00 16 00 68 65 61 74 69 6E 67 2F 70 72 65 73 73 75 72 65 2F 76 61 6C 76 65 03 00 6F 66 66 00 02 00 03 00 00 00 01 00 00 00 00 00 01 00 00 00 00 00 53 5F C0
CONFUSED - go for a beer...
ok but first is :
home/basement/washer
heating/pressure/valve
fooo/tralala/pipapox
is there another string-length-dependent value in the prefix?
what about the funny 89 in the 5th byte?
do we have to pad the topic to 4 bytes, not 2?
home/basement/washer off
home/basement/wasABCDher off
that's what it looks like...
typedef struct PACKED {
uint16_t cmd; /**< Command to execute */
uint16_t argc; /**< Number of arguments */
uint32_t value; /**< Callback to invoke, NULL if none; or response value */
uint8_t args[0]; /**< Arguments */
} ELClientPacket; /**< Packet structure */
#define SLIP_END 0300 /**< Indicates end of packet */ // 0xC0
#define SLIP_ESC 0333 /**< Indicates byte stuffing */ // 0xDB
#define SLIP_ESC_END 0334 /**< ESC ESC_END means END data byte */ // 0xDC
#define SLIP_ESC_ESC 0335 /**< ESC ESC_ESC means ESC data byte */ // 0xDD
@tve
@wolfgangr first try to use ELClient library with debug on, to dig into Mqtt implementation
it looks like 01 00 00 00 89 02 00 00 is not a slip pkt or malformed one
pkt:
C0 0B 00 05 00 00 00 00 00 14 00
68 6F 6D 65 2F 62 61 73 65 6D 65 6E 74 2F 77 61 73 68 65 72 03 00
6F 66 66 00 02 00 03 00 00 00 01 00 00 00 00 00 01 00 00 00 00 00 8F 9E
C0
C0 //slip start
0B 00 //CMD_MQTT_PUBLISH
05 00 // 5 args
00 00 00 00 // callback - none
14 00 // 1st = topic length
68 6F 6D 65 2F 62 61 73 65 6D 65 6E 74 2F 77 61 73 68 65 72 03 00 // topic = home/basement/washer
6F 66 66 00 // 2nd off // here should be 03 00 ?????
02 00 //
03 00 // 3rd ??
00 00
01 00 // 4th qos
00 00 // 0
00 00
01 00 // 5th retain
00 00 // 0
00 00
8F 9E //crc
C0 //end
something is wrong, arg are like: length, arg
Changed my code to 32-bit padding of topic and can successfully send those topic strings:
home/basement/washer off
heating/pressure/valve off
home/basement/wasABCDher125 off
home/basement/wasABCDher125 off
home/basement/wasABCDher125XY off
I suspect that this zalloc
https://github.com/jeelabs/esp-link/blob/v2.2.3/mqtt/mqtt_cmd.c#L163
uint8_t* topic = (uint8_t*)os_zalloc(len + 1);
fails if it gets arguments that are not multiples of four
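Another possible reading is that the allocation itself is fine and the argument walk is what goes wrong. The C sketch below is an assumption about how the receiver steps through arguments (16-bit little-endian length, then data rounded up to a multiple of four); next_arg and failing_args are illustrative names, and the bytes are copied from the failing heating/pressure/pump dump above.

```c
#include <stddef.h>
#include <stdint.h>

/* Argument bytes of the failing packet: 21-byte topic, ONE pad byte,
 * then the "on" argument. */
static const uint8_t failing_args[] = {
    0x15, 0x00,
    0x68, 0x65, 0x61, 0x74, 0x69, 0x6E, 0x67, 0x2F,        /* heating/  */
    0x70, 0x72, 0x65, 0x73, 0x73, 0x75, 0x72, 0x65, 0x2F,  /* pressure/ */
    0x70, 0x75, 0x6D, 0x70,                                /* pump      */
    0x00,
    0x02, 0x00, 0x6F, 0x6E, 0x00, 0x00                     /* 2, "on"   */
};

/* Read the 16-bit little-endian length at *pos, then advance past the data
 * rounded up to a multiple of four. Returns the length that was read. */
static uint16_t next_arg(const uint8_t *args, size_t *pos)
{
    uint16_t len = (uint16_t)(args[*pos] | (args[*pos + 1] << 8));
    *pos += 2 + (((size_t)len + 3u) & ~(size_t)3u);
    return len;
}
```

Under that assumption the first call returns 21 and moves to offset 26, which is the "on" text rather than its length field, so the second call returns 0x6E6F = 28271. A request for len + 1 bytes of that size is close to the 28280 in the E:M line, which would fit an out-of-memory report; the link between the two numbers is a guess.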
So somewhere between https://github.com/jeelabs/esp-link/blob/v2.2.3/mqtt/mqtt_cmd.c#L101
MQTTCMD_Publish(CmdPacket *cmd) {
and ( the debug print in success case we are missing here) https://github.com/jeelabs/esp-link/blob/v2.2.3/mqtt/mqtt_cmd.c#L140
DBG("MQTT: MQTTCMD_Publish topic=%s, data_len=%d, qos=%d, retain=%d\n",
our problem must occur.
just comment this line for tests.
you can change it: zalloc -> alloc + memset,0
So let's proceed to the rest of the MQTT message, which at the moment reads
03 00 6F 66 66 00 02 00 03 00 00 00 01 00 00 00 00 00 01 00 00 00 00 00
An educated guess suggests splitting this into 4 equal chunks:
03 00 - 6F 66 66 00 \ "off"
02 00 - 03 00 00 00 \ len of data
01 00 - 00 00 00 00 \ qos = 0
01 00 - 00 00 00 00 \ retain = 0
852517> MQTT: MQTTCMD_Publish topic=heating/pressure/pump, data_len=3, qos=0, retain=0
852518> MQTT: Publish, topic: "heating/pressure/pump", length: 28
852518> MQTT: Send type=PUBLISH id=0000 len=28
The meaning of qos and retain I expect to find in some decent MQTT documentation. Let's hope that for our task at hand, they will be handled in a transparent way. I'm not sure about the "data_len"; it doesn't make sense to me yet. And I don't see where the id value is coming from. Maybe it is created automagically in ESP-Link from qos and retain?
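Putting the pieces from this thread together, the unescaped body of a publish packet could be assembled as in the C sketch below. It assumes five arguments in the order topic, data, data length, qos, retain, each written as a 16-bit little-endian length followed by the data padded with zeros to a multiple of four bytes. All function names are hypothetical, the output buffer must be sized by the caller, and the CRC and SLIP framing are separate steps not shown here.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CMD_MQTT_PUBLISH 0x000B  /* local define: value seen on the wire here */

/* Append a 16-bit value, low byte first. Returns the new write position. */
static size_t put_u16(uint8_t *out, size_t n, uint16_t v)
{
    out[n++] = (uint8_t)(v & 0xff);
    out[n++] = (uint8_t)(v >> 8);
    return n;
}

/* Append one argument: length, data, then zeros up to a 4-byte multiple. */
static size_t put_arg(uint8_t *out, size_t n, const void *data, uint16_t len)
{
    n = put_u16(out, n, len);
    memcpy(out + n, data, len);
    n += len;
    for (unsigned pad = (4u - (len & 3u)) & 3u; pad > 0; pad--)
        out[n++] = 0;
    return n;
}

/* Header plus five arguments; returns the number of bytes written. */
static size_t mqtt_publish_body(uint8_t *out, const char *topic,
                                const void *data, uint16_t data_len,
                                uint8_t qos, uint8_t retain)
{
    uint8_t len_le[2] = { (uint8_t)(data_len & 0xff),
                          (uint8_t)(data_len >> 8) };
    size_t n = 0;
    n = put_u16(out, n, CMD_MQTT_PUBLISH);
    n = put_u16(out, n, 5);              /* argc */
    n = put_u16(out, n, 0);              /* value, low half: no callback */
    n = put_u16(out, n, 0);              /* value, high half */
    n = put_arg(out, n, topic, (uint16_t)strlen(topic));
    n = put_arg(out, n, data, data_len);
    n = put_arg(out, n, len_le, 2);      /* data length as its own argument */
    n = put_arg(out, n, &qos, 1);
    n = put_arg(out, n, &retain, 1);
    return n;
}
```

For topic home/basement/washer and data off this produces the same 54 bytes as the working dump earlier in the thread (everything between the opening C0 and the CRC 8F 9E). A 21-byte topic such as heating/pressure/pump would get three pad bytes instead of one.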
for me it looks like a bug, as before argument ("off") should be data_len (3) @tve could you explain?
https://github.com/jeelabs/el-client/blob/master/ELClient/ELClientMqtt.cpp#L173
.... some decent MQTT documentation.
If this is the low level of an MQTT packet http://www.steves-internet-guide.com/mqtt-protocol-messages-overview/ then it does not ring many bells here.
So we have to assume that our SLIP-enclosed protocol is physically quite different from MQTT and is translated in ESP-Link, implementing some kind of OSI layer 7 "MQTT proxy".
I'd roughly guess from what I've seen till now:
Let's collect pointers for reference:
retained refers to a single flag that tells the server to store a message and send it to new clients immediately. I hope a simple 0 | 1 will do the job. If not, we have to RTFS.
There are three levels of quality of service:
So, here, too, let's try if a simple integer byte value works.
Ah - let's have a look at the result of the parsing endeavour:
https://github.com/jeelabs/esp-link/blob/v2.2.3/mqtt/mqtt_cmd.c#L143
MQTT_Publish(client, (char*)topic, (char*)data, data_len, qos%3, retain&1);
The last three values are processed by cmdPopArg: data_len, and 1 byte each for qos and retain.
Inspired by this hack https://hackaday.com/2017/02/13/hacking-on-the-weirdest-esp-module/#comment-5820116 I tried something similar: MQTT from an STM32 Bluepill running mecrisp FORTH, connected to ESP-Link.
However, when I try to issue test strings, the communication between ESP-LINK and the STM32 breaks down. I have to reset both the µC at http://192.168.1.88/console.html and ESP-Link at http://192.168.1.88/log.html using the HTML-Buttons at those pages.
I suspect that some wrong command string confuses ESP-LINK and renders it in a non-responsive state.
I assume that the communication ESP-LINK <-> µC is neither pure MQTT nor SLIP but a dedicated protocol, correct?
Unfortunately, there is no MQTT in https://embello.jeelabs.org/flib/ yet.
I have repeatedly searched the issues and all docs for some spec, but the only source I found is https://github.com/jeelabs/el-client/blob/master/ELClient/examples/mqtt/mqtt.ino However, assembling test strings by reverse engineering this is very error-prone, too.
It would be great to find a couple of preassembled test strings to ensure that the whole setup is working. So I could work down to more complicated functions by small changes. Just some "hello" appearing on my mqtt would be great.
Details from my setup:
esp-link {"rssi":-60, "heap_free":19992}
on my mosquitto_pub, so the ESP-link <-> mosquitto link is working.
I plan to rework the whole thing using a string buffer (hope to find such in mecrisp) instead of blowing the stack. Would be glad to share, of course.