pschatzmann opened 5 months ago
What would you prefer using for time handling in case of going 64 bit, milliseconds, microseconds, a structure like tm_t, a combination?
Here are some particularities that I've noticed for each:
`tm_t`, used by the protocol, has direct member equivalence to `struct timeval` as used in `gettimeofday(...)`. I've added some operators and methods to it for testing, but there may be difficult corner cases to consider when doing the math (e.g. see the discussion of the `timeval_subtract` explanation); there's the risk of introducing hard-to-see bugs, for example in:

```c++
tm_t diff_to_server = (time_message.latency - (base_message.received - base_message.sent)) / 2;
```
I would favor using microseconds given these observations.
My preference is in the order you mentioned the options. I think humans cannot distinguish differences at the microsecond level, so I don't see a big advantage in going more fine-grained.
I'm getting sync between different clients, including snapdroid, with commit 8f6d62efcae5772e187a3d442055adaac74952e6, by using the timestamp from SnapAudioHeader. `CONFIG_PROCESSING_TIME_MS` had to be set to a weird value of -1150, while using a buffer of 2000 ms in the snapserver settings.
Not submitting a PR yet since SnapTimeSync is no longer used except for storing the initial delay, and most of the sync code referencing it would be dropped. I found it easier to use microseconds for now, and I'm trying to reduce some jitter observed on successive latency messages. I'm using an ESP32-S3; I haven't tested it yet, but this may not work well on the slower ESP32: I remember that when the resample factor was different from 1.0 there was stuttering on it.
With commit 89497331465a2ee8041018efbf52f7c8e04de35b it's syncing with snapdroid using a `CONFIG_PROCESSING_TIME_MS` of around -240 ms in my setup, independently of the buffer setting in /etc/snapserver.conf.
When testing on ESP32 I had to disable logging in `ResampleStream::setStepSize(float)` because it added to the processing time, but on ESP32-S3 it doesn't seem to interfere noticeably.
Edit: to be in sync, `CONFIG_PROCESSING_TIME_MS` is around -240 with SnapProcessorRTOS and +80 with SnapProcessorBuffered. For this sync method, without a buffered processor for the encoded data, the sound is garbled and the resample factor doesn't converge to 1.0.
Hi, I wrote a draft of a SnapProcessor that buffers the encoded chunks paired with their respective timestamps. I've noticed that if they're applied to SnapOutput (via writeAudioInfo) unpaired, the buffer acts as a constant time delay line instead of being elastic, and it doesn't compensate for network hiccups. The commits are d5ff5095882cc1a8bfa7bed330a1fbac948a342f and e2b213a81f8f9e68016979197eeea91b46d813b2. I'm getting synchronization with snapdroid with a `CONFIG_PROCESSING_TIME_MS` of 8, tested in the various configurations (I2S and MCPWM) I posted about before. There are limitations regarding memory and processing power: for example, with a chunk_ms of 10 there is more overhead than with 20 ms, and an ESP32 won't keep up. The best compression is with Opus (I didn't test other codecs); I see a bandwidth of around 35 kB/s per client, but for some reason I have trouble fitting 2 seconds = 70 kB in an ESP32 without PSRAM. It needs some optimization here; the draft implementation just uses STL containers.
Discussed in https://github.com/pschatzmann/arduino-snapclient/discussions/15