Closed jbrzusto closed 1 year ago
Thanks for capturing these details. Looking at these symptoms it seems like this may be related to the NTP sync issues described in issue #14 that were introduced in mDash 1.2.14 and very likely persist in 1.2.15
Would you be willing to repeat your tests using 1.2.13 to see if they go away? If so, this issue is likely a duplicate of #14
I tried this with 1.2.13, adding a print of the system time because no log message is generated in this version, and you're right, the system clock is set correctly.
@jbrzusto - Thanks for verifying! This is very useful. As noted in issue #14, it appears this bug was introduced in 1.2.14
@cpq - Can you please take a look at this and advise?
I'm hoping to eventually be able to submit a pull request on this issue (pending my company signing an NDA with Cesanta) but meantime, here is workaround for version 1.2.15, assuming you have a debug version of the library file, which is what has been released to date. Tested on ESP32. I can't offer support, but consider the file below released under whatever licence(s) is compatible with mongoose + mDash + your needs.
modify the binary libmDash.a like so (using the objcopy appropriate to your target platform; e.g. xtensa-esp32-elf-objcopy
for esp32):
objcopy --globalize-symbol mg_sntp_connect --globalize-symbol s_time_set --globalize-symbol s_sntp libmDash.a
objcopy --weaken-symbol mg_sntp_connect libmDash.a
This allows us to replace the built-in mg_sntp_connect
function (and its callees) with one that calls our corrected versions of those callees (namely mg_sntp_parse
and sntpcb
). The two variables s_time_set
and s_sntp
also need to be made global so our replacement code can use them.
copy the file mongoose.h
from the mongoose repo into your firmware source folder.
add the following new file, called sntp.c
to your firmware source folder, and add it to your build instructions, if needed. The functions my_mg_sntp_parse
and mysntpcb
replace the buggy mg_sntp_parse
and sntpcb
in the mDash library.
The functions mg_sntp_connect
and mysntp_cb
replace the ones in the library, due to the symbol weakening done above. These versions differ only in calling the my
versions of the functions, instead of the buggy ones in the library.
#include <stdint.h>
#include <string.h>
#include <stdio.h>
#include <sys/time.h>
#include "mongoose.h"
#define SNTP_INTERVAL_SEC 3600
#define SNTP_TIME_OFFSET 2208988800UL
extern uint32_t mg_ntohl(uint32_t net);
extern int s_time_set;
extern int s_sntp;
int64_t my_mg_sntp_parse(const unsigned char *buf, size_t len) {
int64_t res = -1;
int mode = len > 0 ? buf[0] & 7 : 0;
int version = len > 0 ? (buf[0] >> 3) & 7 : 0;
if (len < 48) {
MG_ERROR(("%s", "corrupt packet"));
} else if (mode != 4 && mode != 5) {
MG_ERROR(("%s", "not a server reply"));
} else if (buf[1] == 0) {
MG_ERROR(("%s", "server sent a kiss of death"));
} else if (version == 4 || version == 3) {
uint32_t *data = (uint32_t *) &buf[40];
unsigned long seconds = mg_ntohl(data[0]) - SNTP_TIME_OFFSET;
unsigned long fracseconds = mg_ntohl(data[1]);
res = ((int64_t) seconds) * 1000 + (((int64_t) (fracseconds) * 1000) >> 32);
} else {
MG_ERROR(("unexpected version: %d", version));
}
return res;
}
void mysntp_cb(struct mg_connection *c, int ev, void *evd, void *fnd) {
if (ev == MG_EV_READ) {
int64_t milliseconds = my_mg_sntp_parse(c->recv.buf, c->recv.len);
if (milliseconds > 0) {
mg_call(c, MG_EV_SNTP_TIME, (uint64_t *) &milliseconds);
printf("%u.%u", (unsigned) (milliseconds / 1000),
(unsigned) (milliseconds % 1000));
}
mg_iobuf_del(&c->recv, 0, c->recv.len); // Free receive buffer
} else if (ev == MG_EV_CONNECT) {
mg_sntp_request(c);
} else if (ev == MG_EV_CLOSE) {
}
(void) fnd;
(void) evd;
}
// reverse-engineered from object code and then corrected:
void mysntpcb(struct mg_connection *c, int ev, void *ev_data, void *fn_data ) {
switch (ev) {
case MG_EV_SNTP_TIME:
{
uint64_t ms = *(uint64_t *)ev_data;
struct timeval tv = {.tv_sec = ms / 1000, .tv_usec = 1000 * (ms % 1000)};
settimeofday(&tv, 0);
s_time_set=1;
}
break;
case MG_EV_CLOSE:
s_sntp = 0;
break;
}
}
struct mg_connection *mg_sntp_connect(struct mg_mgr *mgr, const char *url,
mg_event_handler_t fn, void *fnd) {
struct mg_connection *c = NULL;
if (url == NULL) url = "udp://time.google.com:123";
if ((c = mg_connect(mgr, url, &mysntpcb, fnd)) != NULL) c->pfn = mysntp_cb;
return c;
}
Closing - as moved to a recent Mongoose, and open sourced the lib.
Great news - thanks!
The function
sntpcb
(apparently - source code is not available) misinterprets the return value ofmg_sntp_parse
, This behaviour appears on both esp32 (see this mdash forum post) and linux-x64. Debugging on linux-x64 shows thatsntpcb
is discarding the upper 32-bit portion of themg_sntp_parse
return value (which is uint64_t count of milliseconds since the epoch) and using the lower 32 bits as the count of seconds since the epoch, leading to nonsensical system time.Below are the ugly linux-x64 debug details, but the behaviour is identical on esp32. The test app which I debugged:
State of relevant x64 registers upon leaving
mg_sntp_parse
:The correct (except for https://github.com/cesanta/mongoose/issues/1628) milliseconds since the epoch is in
rax
. But just before the call tomg_log
inside ofsntpcb
, here's what the registers hold:Notice that
rsi
holds the lower 32-bits (0xf6c2237f) of the return value frommg_sntp_parse
(0x181f6c2237f), and the upper 32-bits has apparently disappeared. Letting the code continue, we get this message:which has clearly used the incorrect values from registers
rsi
andrdx
.This bug is a show-stopper. Although the initial connection to the mDash websocket server appears to set the device clock correctly,
sntpcb
will eventually clobber it.