V3 WiFi activity interferes with sound recording

dlparker commented 2 years ago

I have a program that records sound via the V3 microphone and sends it over a TCP connection. Most of the time all is well; however, sometimes the program misses some of the mic data.

I have set up a GitHub project containing a simple, minimal example. It contains two twatch programs and a Python server. One program simply reads the sound data and records the data stats, and the other is a slightly modified version of the same code that sends the data over a TCP socket to the Python server. The stats are reported, and it is pretty easy (at least in my environment) to see that the TCP version fails to read all the data in about 1 out of 5 recording cycles. To be clear, the read loop never sees the data, probably indicating that the task running that code is getting preempted by something WiFi related for long enough that the DMA buffers get overwritten with new data.

Here is the repo.

https://github.com/dlparker/mic_example

I hope that someone with better tools and knowledge can determine why this is happening, and hopefully even suggest tweaks (maybe some FreeRTOS configuration) that would help reduce this problem.

Also please note, I had a very hard time figuring out how to record good quality data using the microphone. Most of the examples I found on the web had issues, and the somewhat obscure nature of the problem made it hard to understand when starting out with no knowledge of the details.

I did get it to work well eventually. It turns out to be very simple once you know the right i2s setup details. Maybe someone could take a look at my example and add something using the same setup to the official examples?

dlparker commented 2 years ago

I added a new version of the test code repository that uses the AsyncTCP library, but is otherwise unchanged from the plain TCP version. It does not experience the dropped sound data problem. Although this represents a solution, it seems undesirable to have to use a third-party library with added complexity just to send bytes down a net pipe. There is no back and forth chatter over the socket, just a simple startup handshake and then a stream of bytes one way. That seems like a really simple case, not to mention a common one if people want to do anything with the microphone, as there is not enough local storage to hold much recorded sound.
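
For readers who want to check for dropped data on the receiving side, here is a minimal Python sketch of a one-way stream receiver that counts bytes and compares the total with what a recording of known length should produce. It is not the server from the repo. The 16 kHz sample rate, the function names, and the use of `socket.socketpair()` as a stand-in for the watch are all assumptions made for illustration, and the startup handshake is left out because its format is not described here.

```python
import socket


def expected_bytes(sample_rate_hz, bytes_per_sample, duration_ms):
    """Bytes a mono recording of the given length should produce."""
    return sample_rate_hz * bytes_per_sample * duration_ms // 1000


def receive_stream(conn, chunk_size=4096):
    """Read until the sender closes the connection; return the byte count."""
    total = 0
    while True:
        chunk = conn.recv(chunk_size)
        if not chunk:  # empty read means the peer closed the socket
            break
        total += len(chunk)
    return total


def demo():
    # socketpair() stands in for the watch-to-server TCP connection.
    sender, receiver = socket.socketpair()
    # Hypothetical recording: 100 ms of 16-bit mono audio at 16 kHz.
    payload = bytes(expected_bytes(16000, 2, 100))
    sender.sendall(payload)
    sender.close()
    received = receive_stream(receiver)
    receiver.close()
    return received, len(payload)


if __name__ == "__main__":
    received, sent = demo()
    print(f"received {received} of {sent} bytes, missing {sent - received}")
```

If the received count falls short of `expected_bytes(...)` for a recording cycle, the data was lost before it reached the socket, which matches the symptom described above.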

Maybe there is some tweaking that will get it done with the regular TCP library? Maybe different send buffer sizes? Maybe larger DMA buffers for i2s? Maybe some specific ratio between i2s read sizes and TCP send sizes? Anybody know enough about these components to advise me how to adjust things?

LilyGO commented 2 years ago

I'm sorry, I can't give you effective advice on this detail. The problem may persist until an experienced engineer comes up with a solution.

dlparker commented 2 years ago

I've done a lot more work on this problem since I opened this issue. Various things help, such as doing the read/send loop in a task and pinning that task to a core, increasing the size of the buffer used for i2s_read, etc. However, it is never completely reliable at 32 bits per sample. At 16 bits per sample it is pretty easy to make it reliable. I can even get high but not perfect reliability sending over a secure Websocket, with the overhead of encryption causing an occasional small drop out, hardly noticeable.
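
One way to see why 16 bits per sample is easier to keep reliable is to work out the data rate and how long a buffer of fixed size can absorb a stalled reader. The sketch below is plain arithmetic, not code from the repo. The 16 kHz sample rate and the 8192-byte buffer are assumed values chosen only to make the numbers concrete.

```python
def byte_rate(sample_rate_hz, bits_per_sample, channels=1):
    """Bytes per second produced by the microphone stream."""
    return sample_rate_hz * (bits_per_sample // 8) * channels


def headroom_ms(buffer_bytes, sample_rate_hz, bits_per_sample, channels=1):
    """How long the reader can stall before a buffer of this size fills."""
    return 1000.0 * buffer_bytes / byte_rate(sample_rate_hz, bits_per_sample, channels)


if __name__ == "__main__":
    # Assumed values: 16 kHz mono and 8192 bytes of buffering.
    for bits in (32, 16):
        rate = byte_rate(16000, bits)
        slack = headroom_ms(8192, 16000, bits)
        print(f"{bits}-bit: {rate} bytes/s, {slack:.0f} ms of headroom")
```

Under these assumptions, halving the sample width halves the bytes that must cross the network and doubles the time the read task can be held off. If the buffers are sized in samples rather than bytes, the headroom stays the same and only the network load halves.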

The sound quality at 16 bits per sample is good enough for my use, which is voice recording. The quality loss from dropping 8 bits of resolution is not really noticeable, which is surprising since that is 33% of the maximum resolution.
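
The 33% figure implies a 24-bit maximum (8 of 24 bits). A rough way to put the loss in perspective is the theoretical dynamic range of linear PCM, about 6.02 dB per bit. The sketch below is an illustration only: the helper names are hypothetical, and shifting a 24-bit sample right by 8 is one generic way to reduce it to 16 bits, not necessarily what the I2S driver does when configured for 16 bits per sample.

```python
import math


def dynamic_range_db(bits):
    """Theoretical dynamic range of linear PCM with the given bit depth."""
    return 20 * math.log10(2 ** bits)


def truncate_24_to_16(sample):
    """Drop the 8 least significant bits of a signed 24-bit sample."""
    return sample >> 8


if __name__ == "__main__":
    print(f"bits dropped: {8 / 24:.0%} of 24")
    print(f"24-bit range: {dynamic_range_db(24):.1f} dB")
    print(f"16-bit range: {dynamic_range_db(16):.1f} dB")
    print(truncate_24_to_16(0x7FFFFF))  # largest positive 24-bit value
```

Sixteen bits still gives roughly 96 dB of theoretical range, so it is plausible that the difference is hard to hear in a voice recording.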

dlparker commented 2 years ago

There is a new version of my example code at https://github.com/dlparker/mic_example/tree/main/remote_task that does not use AsyncTCP but which does work reliably if you choose 16 bits per sample. It is fairly reliable even with 32 bits per sample; you might have to run a lot of test loops before the problem shows up.

This version demonstrates that AsyncTCP is not necessary. This makes sense based on my understanding that the ESP32 socket code is non-blocking, unlike the ESP8266, so you shouldn't need the AsyncTCP library, ever. I think AsyncTCP helped avoid the problem because my original code took too much time to process data between i2s_read calls and might also have been getting interrupted by WiFi/TCP operations. Look at the README in the new version directory for the details on what my experimenting and testing have convinced me is required for reliable operation.

Xinyuan-LilyGO / TTGO_TWatch_Library

V3 WiFi activity interferes with sound recording #155