pschatzmann opened 5 months ago
What would you prefer using for time handling in case of going 64 bit, milliseconds, microseconds, a structure like tm_t, a combination?
Here are some particularities that I've noticed for each:
`tm_t`, used by the protocol, has direct member equivalence to `struct timeval` as used in `gettimeofday(...)`. I've added some operators and methods to it for testing, but there may be difficult corner cases to consider when doing the math (e.g. see the discussion of the `timeval_subtract` explanation); there's the risk of introducing hard-to-see bugs, for example in:

```c++
tm_t diff_to_server = (time_message.latency - (base_message.received - base_message.sent)) / 2;
```
I would favor using microseconds given these observations.
My preference is in the order you mentioned the options. I think humans cannot distinguish differences at the microsecond level, so I don't see a big advantage in going more fine-grained.
I'm getting sync between different clients, including snapdroid, with commit 8f6d62efcae5772e187a3d442055adaac74952e6, by using the timestamp from SnapAudioHeader. `CONFIG_PROCESSING_TIME_MS` had to be set to a weird value of -1150, while using a buffer of 2000 ms in the snapserver settings.
Not submitting a PR yet since SnapTimeSync is no longer used except for storing the initial delay, and most of the sync code referencing it would be dropped. I found it easier to use microseconds for now, and I'm trying to reduce some jitter observed on successive latency messages. I'm using an ESP32-S3; I haven't tested it yet, but this may not work well on the slower ESP32: I remember that when the resample factor was different from 1.0 there was stuttering on it.
With commit 89497331465a2ee8041018efbf52f7c8e04de35b it's syncing with snapdroid using a `CONFIG_PROCESSING_TIME_MS` of around -240 ms in my setup, independently of the buffer setting in /etc/snapserver.conf.
When testing on ESP32 I had to disable logging in `ResampleStream::setStepSize(float)` because it added to the processing time, but on ESP32-S3 it doesn't seem to interfere noticeably.
Edit: to be in sync, `CONFIG_PROCESSING_TIME_MS` is around -240 with SnapProcessorRTOS and +80 with SnapProcessorBuffered. For this sync method, without a buffered processor for the encoded data, the sound is garbled and the resample factor doesn't converge to 1.0.
Hi, I wrote a draft of a SnapProcessor that buffers the encoded chunks paired with their respective timestamps. I've noticed that if they're applied to SnapOutput (via writeAudioInfo) unpaired, the buffer acts as a constant time delay line instead of being elastic, and it doesn't compensate for network hiccups. The commits are d5ff5095882cc1a8bfa7bed330a1fbac948a342f and e2b213a81f8f9e68016979197eeea91b46d813b2. I'm getting synchronization with snapdroid with a `CONFIG_PROCESSING_TIME_MS` of 8, tested in the various configurations (I2S and MCPWM) I posted about before. There are limitations regarding memory and processing power: for example, with a chunk_ms of 10 there is more overhead than with 20 ms, and an ESP32 won't keep up. The best compression is with Opus (I didn't test other codecs); I see a bandwidth of around 35 kB/s per client, but for some reason I have trouble fitting 2 seconds = 70 kB in an ESP32 without PSRAM. It needs some optimization here; the draft implementation just uses STL containers.
Discussed in https://github.com/pschatzmann/arduino-snapclient/discussions/15