CarlosDerSeher / snapclient

snapclient on ESP32
GNU General Public License v3.0
101 stars 14 forks source link

Chunk size > 20ms breaks playback for flac and opus #54

Open BenediktMueller91 opened 5 months ago

BenediktMueller91 commented 5 months ago

I had a go again with this repo after some while and this time around building and flashing worked mostly without issues using a ESP32-Wrover-B + MAX98357 DAC.

Only issue i still have now is that only PCM codec works. When using OPUS or FLAC the playback does not start.

Am i missing something?

Playback Log using codec PCM:

[...]
I (2539) OTA: idf.py build ; curl snapclient.local:8032 --data-binary @- < build/snapclient.bin
I (5622) SC: Found 192.168.1.16:1704
I (5629) SC: netconn connected
I (5631) SC: netconn sent hello message
I (5636) SC: Buffer length:  500
I (5636) SC: Latency:        0
I (5637) SC: Mute:           0
I (5637) SC: Setting volume: 44
I (5641) SC: pcm sampleformat: 48000:16:2
I (5727) PLAYER: player_setup_i2s: dma_buf_len is 720, dma_buf_count is 8
I (5728) c_I2S: DMA Malloc info, datalen=blocksize=2880, dma_buf_count=8
I (5733) c_I2S: APLL: Req RATE: 48000, real rate: 47999.961, BITS: 16, CLKM: 1, BCK_M: 8, MCLK: 12287990.000, SCLK: 1535998.750000, diva: 1, divb: 0
I (5745) GENERIC_BOARD: I2S0, MCLK output by GPIO0
I (5753) c_I2S: APLL: Req RATE: 48000, real rate: 47999.961, BITS: 16, CLKM: 1, BCK_M: 8, MCLK: 12287990.000, SCLK: 1535998.750000, diva: 1, divb: 0
W (5765) PLAYER: no pcm chunk queue created?
I (5770) PLAYER: created new queue with 7
I (5774) PLAYER: snapserver config changed, buffer 500ms, chunk 2880 frames, sample rate 48000, ch 2, bits 16 mute 0 latency 0
I (5838) SC: latency buffer full
I (6310) PLAYER: initial sync age: 2us, chunk duration: 60000us
[...]
(playing)

Playback Log using Codec FLAC:

[...]
I (2546) OTA: idf.py build ; curl snapclient.local:8032 --data-binary @- < build/snapclient.bin
W (5629) SC: No results found!
I (6631) SC: Lookup snapcast service on network
I (9729) SC: Found 192.168.1.16:1704
I (9734) SC: netconn connected
I (9736) SC: netconn sent hello message
I (9742) SC: Buffer length:  500
I (9742) SC: Latency:        0
I (9742) SC: Mute:           0
I (9745) SC: Setting volume: 50
I (9749) SC: fLaC sampleformat: 48000:16:2
I (9770) PLAYER: player_setup_i2s: dma_buf_len is 576, dma_buf_count is 4
I (9771) c_I2S: DMA Malloc info, datalen=blocksize=2304, dma_buf_count=4
I (9776) c_I2S: APLL: Req RATE: 48000, real rate: 47999.961, BITS: 16, CLKM: 1, BCK_M: 8, MCLK: 12287990.000, SCLK: 1535998.750000, diva: 1, divb: 0
I (9788) GENERIC_BOARD: I2S0, MCLK output by GPIO0
I (9796) c_I2S: APLL: Req RATE: 48000, real rate: 47999.961, BITS: 16, CLKM: 1, BCK_M: 8, MCLK: 12287990.000, SCLK: 1535998.750000, diva: 1, divb: 0
W (9808) PLAYER: no pcm chunk queue created?
I (9813) PLAYER: created new queue with 19
I (9817) PLAYER: snapserver config changed, buffer 500ms, chunk 1152 frames, sample rate 48000, ch 2, bits 16 mute 0 latency 0
(silence)

or

[...]
I (4048) OTA: idf.py build ; curl snapclient.local:8032 --data-binary @- < build/snapclient.bin
I (7131) SC: Found 192.168.1.16:1704
I (7138) SC: netconn connected
I (7141) SC: netconn sent hello message
I (7146) SC: Buffer length:  500
I (7146) SC: Latency:        0
I (7146) SC: Mute:           0
I (7147) SC: Setting volume: 50
I (7152) SC: fLaC sampleformat: 48000:16:2
(silence)

Playback Log using codec OPUS:

[...]
I (3538) OTA: idf.py build ; curl snapclient.local:8032 --data-binary @- < build/snapclient.bin
I (6621) SC: Found 192.168.1.16:1704
I (6628) SC: netconn connected
I (6630) SC: netconn sent hello message
I (6635) SC: Buffer length:  500
I (6635) SC: Latency:        0
I (6635) SC: Mute:           0
I (6636) SC: Setting volume: 50
I (6641) SC: Opus sample format: 48000:16:2

I (6646) SC: Initialized opus Decoder: 0
E (6650) SC: samples_per_frame: 960, pOpusData->bytes 1501, bytes 3840
E (6657) SC: Decode error : -2 

E (6711) SC: samples_per_frame: 960, pOpusData->bytes 1446, bytes 3840
E (6712) SC: Decode error : -2 

E (6770) SC: samples_per_frame: 960, pOpusData->bytes 1407, bytes 3840
E (6770) SC: Decode error : -2 

E (6828) SC: samples_per_frame: 960, pOpusData->bytes 1330, bytes 3840
E (6828) SC: Decode error : -2 

I (6842) SC: latency buffer full
E (6891) SC: samples_per_frame: 960, pOpusData->bytes 1566, bytes 3840
E (6891) SC: Decode error : -2
[...]
(silence)
CarlosDerSeher commented 5 months ago

@BenediktMueller91 please post your snapserver.conf, from your PCM log it looks like you have a high setting for chunk duration?

I (6310) PLAYER: initial sync age: 2us, chunk duration: 60000us

Your opus error message suggests this

/** Not enough bytes allocated in the buffer */
#define OPUS_BUFFER_TOO_SMALL -2

I just tested the master branch on my lyrat v4.3 board and don't see your issues with flac.

I (5676) SC: netconn connected
I (5678) SC: netconn sent hello message
I (5692) SC: Buffer length:  758
I (5692) SC: Latency:        0
I (5692) SC: Mute:           0
I (5692) SC: Setting volume: 100
I (5699) SC: fLaC sampleformat: 44100:16:2
I (5893) SC: latency buffer full
W (41832) PLAYER: pcm chunk queue not created
W (41833) PLAYER: pcm chunk queue not created
W (41834) PLAYER: pcm chunk queue not created
W (41837) PLAYER: pcm chunk queue not created
W (41842) PLAYER: pcm chunk queue not created
W (41847) PLAYER: pcm chunk queue not created
W (41852) PLAYER: pcm chunk queue not created
W (41857) PLAYER: pcm chunk queue not created
W (41863) PLAYER: pcm chunk queue not created
W (41872) PLAYER: pcm chunk queue not created
I (41875) PLAYER: player_setup_i2s: dma_buf_len is 576, dma_buf_count is 4
I (41879) c_I2S: DMA Malloc info, datalen=blocksize=2304, dma_buf_count=4
I (41888) c_I2S: APLL: Req RATE: 44100, real rate: 44099.988, BITS: 16, CLKM: 1, BCK_M: 8, MCLK: 11289597.000, SCLK: 1411199.625000, diva: 1, divb: 0
I (41900) LYRAT_V4_3: I2S0, MCLK output by GPIO0
I (41907) c_I2S: APLL: Req RATE: 44100, real rate: 44099.988, BITS: 16, CLKM: 1, BCK_M: 8, MCLK: 11289597.000, SCLK: 1411199.625000, diva: 1, divb: 0
W (41920) PLAYER: no pcm chunk queue created?
I (41925) PLAYER: created new queue with 28
I (41930) PLAYER: snapserver config changed, buffer 758ms, chunk 1152 frames, sample rate 44100, ch 2, bits 16 mute 0 latency 0
I (42612) PLAYER: initial sync age: 2us, chunk duration: 26122us

I am running on snapserver v0.27.0

CarlosDerSeher commented 5 months ago

OK, i just tested with 60ms chunk duration and then I see the same behavior. Must be related to chunk duration measurement in the esp32 client. I'll have to look into that. Please use 20ms for chunk duration in your snapserver.conf for now and it should work.

BenediktMueller91 commented 5 months ago

Thanks! Setting the chunk size to 20 did the trick. The player starts now at least. There is still some crackling left but i will have a look if i can find the source of that.

Here is my snapserver.conf if it is still of interest (the rest is default values and sources):

# Default sample format: <sample rate>:<bits per sample>:<channels>
sampleformat = 48000:16:2

# Default transport codec
codec = flac

# Default source stream read chunk size [ms].
#chunk_ms = 60
chunk_ms = 20

# Buffer [ms]
#buffer = 1000
buffer = 800

I am also running snapserver v0.27.0 on a archlinux box.

CarlosDerSeher commented 5 months ago

Unleast you have enabled psram, 800ms buffer time is too much. I use around 750ms but at a sample rate of 44100Hz, so for 48000Hz it will be lower without psram. But the module you are using should support psram, so enable it using menuconfig and 1000ms should also be ok

BenediktMueller91 commented 5 months ago

I based my configuration on the sdkconfig_PSRAM_lyrat_v4.3 and only changed to CONFIG_DAC_MAX98357=y and added my wifi configuration. So yes, PSRAM should already be in use.

I have to correct myself. The crackling is also present when using PCM, just as with FLAC. The audio starts out alright and then quickly starts to crackle - worsening over some time until there is heavy crackling/distortion and then stays that way.

I think this is not the right place to discuss this issue as it is not tied to the codec issue i had. Should i start a new issue or discussion on this?

CarlosDerSeher commented 5 months ago

Yes please

CarlosDerSeher commented 1 month ago

@badaix I am a bit lost here. Maybe it is a misunderstanding of how the protocol works and hopefully you can enlightening me. I logged some decoded chunks with their timestamps and sizes but somehow the numbers don't add up. Setting is 44100:2:16 and I use a chunk size of 40ms on the server. Here is an example of what I mean:

chunk timestamp[us] size in bytes
1 0 9216
2 26123 4608
3 78367 9216
4 104490 4608
5 156735 9216
6 182857 4608
7 235102 9216
8 261225 4608
9 313469 9216
10 339592 4608

Note that from chunk 1 to chunk 2 there is a difference of 26ms but the size is 9216. From my understanding if I calculate the duration from the size chunk 2 should actually have a timestamp of 52246, because (9216 / 4 / 44100) = 52.25ms ? What do I miss here?

badaix commented 1 month ago

Which codec are you using?

CarlosDerSeher commented 1 month ago

Flac. The latest implementation is currently local only as I have changed a few things. If you need to see the code too I'll have to upload.

Oh and snapserver is version v0.27.0

CarlosDerSeher commented 1 month ago

@badaix the strange thing is if I count backwards the numbers add up

Sorry to bother you but I am really stuck here. The documentation says it is the time at which the frame has been recorded. Does that mean the time when recording was finished or when it has been started.

badaix commented 1 month ago

It should be the time when it has been recorded. When using compression (like FLAC), the captured PCM data (e.g. 40ms) will be passed to the encoder. The encoder has his own latency, i.e. it needs a specific amount of data until it returns the compressed chunk. What's sent over the wire are always decodable chunks, flac needs 1152 frames, ~26.1224ms of audio, so 40ms will be passed to FLAC, 26ms will be encoded, a new timestamp for this chunk is computed, 14ms are still buffered in the encoder, the next 40ms are passed to the encoder => 52ms will be encoded, 2ms are buffered in the encoder, the next 40ms are passed to the encoder => 26ms of 42ms will be encoded, timestamped and sent, 16ms are buffered, ... This is why alternating sizes are received on the client side. But the chunks should still be timestamped with the record start time of the chunk. This is just a brain dumb of what I did ~8 years ago. I will check in more detail next week, when I'm back from vacation.