schreibfaul1 / ESP32-audioI2S

Play mp3 files from SD via I2S
GNU General Public License v3.0
1.14k stars 290 forks source link

sending samples 1 by 1 consumes lots of cpu and smaller bugs - improvement included #754

Closed shbgr closed 3 months ago

shbgr commented 5 months ago

Hi. I did some experiments with the lib. Works great, but i was wondering about only sending chunks of 4 bytes only to the i2s-driver. My C-experiences are cumbersome and the same with GIT. But managed to get CPU-load down by a factor of 2 by sending 16 i2s-frames instead of the the current single frame in the lib. This is likely due to several checks and creating a semaphore in IDF-lib for each call to i2s_write.

My results on Core 1 for a 320kbit MP3, stereo, 48k sample rate, SD connected with SPI:

I have attached my modified lib (based on 3.0.8). Seems to run fine, but memory leaks and other codecs than mp3 were off my scope. So better double check. See also some notes incorporated on filters that might get wrong and playSample for 8-bit files. And i added an additional attenuation. Not relevant for the lib.

Maybe this helps for further improvement. Audiosh.zip

schreibfaul1 commented 5 months ago

Thanks for the input, I know that the current state is not ideal. I'll have a look at it in the next few days.

shbgr commented 5 months ago

Looks like you give it a try. Definitely needs some work for production use, but likely worth the trouble.

Some more thoughts:

Therefor:

But have a look at the impact first. Also had some changes regarding the end of the file (idv3.1) and 8-bit-playback, but thats minor.

schreibfaul1 commented 5 months ago

Yes, I have copied the essentials from your source code and integrated them into a branch. Now the test phase begins.

tueddy commented 5 months ago

I have just tested the branch with ESPuino. The CPU time consumed is significantly reduced with these changes!

Current master:

Before

send-I2S-multiple-samples branch:

After

This is a great improvement! Audio plays fine. Unfortunately, Bluetooth Source (audio_process_i2s) is not yet working...

schreibfaul1 commented 5 months ago

Hi tueddy, you're already further along than me. So far I've only put the suggestions from shbgr into a branch and tested whether it works, purely subjectively. The measurements look really good, that brings a lot of speed. Maybe it's possible to put the filters and volume before the MULTISAMPLE array, then the BT function would be preserved.

tueddy commented 5 months ago

I know this is in an early state, I just wanted to share the performance measurement..

shbgr commented 5 months ago

Great. I think you could even reduce the mp3play load if you play with the delays in the task. I had 15 or 20ms if the decoded mp3 buffer (valid samples) couldn’t be completely written to DMA. 1ms otherwise. I monitored this with the audio_sent callback. But definitely has headroom.

The 16*512 buffer holds more than 40ms of samples worst case (48k). Decode for 1152 samples (9 buffers) takes about 5ms@320kbps. If there are valid samples left we should have at least 14 full buffers at the end of the loop (37ms worst case).

Some adaptive delay could be useful here. no idea how much load other codecs than mp3 produce. But for mp3 something like 2/5/7/9 ms delay for each 128 frames in buffer should work well. No need to try when not much work is expected…

Could give the headroom to lower cpu speed or simply allow faster ftp transfer while playing.

schreibfaul1 commented 5 months ago

Previously, the system waited until all decoded data had been written to the DMA buffer. There are now two improvements:

It is no longer possible to transfer the samples individually to BT. This is only possible as a block.

@tueddy please check whether this is also possible in ESPuino:

void audio_process_i2s(int16_t* outBuff, uint16_t validSamples, uint8_t bitsPerSample, uint8_t channels, bool *continueI2S){

  log_i("validSamples %i, bitsPerSample %i, channels %i", validSamples, bitsPerSample, channels);

  for(int i = 0; i < validSamples; i++){
      if(channels == 2){
          // outBuff[i]      left channel
          // outBuff[i + 1]  right channel
      }
      if(channels == 1){
          // outbuff[i]  mono
      }
  }

  *continueI2S = true; // also output via I2S
}
tueddy commented 5 months ago

It should definitely be possible to change the audio_process_i2s function for the ESPuino project. Perhaps the Bluetooth Source process could also be optimized by processing several samples. I will test it the next days.

One question: The transferred samples remain the same as the previous version (volume applied etc.)?

schreibfaul1 commented 5 months ago

That looks good. If you can copy a block to BT instead of individual samples, this will save time again. audio_process_i2s is behind the equalizer and the volume control, so that won't be a problem. It would be nice if the BT function could remain in operation. In addition, with a sample rate of 22.1KHz you may be able to write each sample twice or ignore some samples at 48KHz.

tueddy commented 5 months ago

I made a quick test with Bluetooth Source and get basically the same samples as before. BT-Source seems to work fine so far.

Before:

// process audio sample extern (for bluetooth source)
void audio_process_i2s(uint32_t *sample, bool *continueI2S) {
    *continueI2S = !Bluetooth_Source_SendAudioData(sample);
}

New, without any optimization, just tested with channels=2, left/right might be swapped:

void audio_process_i2s(int16_t *outBuff, uint16_t validSamples, uint8_t bitsPerSample, uint8_t channels, bool *continueI2S) {
    Serial.printf("validSamples %i, bitsPerSample %i, channels %i\n", validSamples, bitsPerSample, channels);
    uint32_t sample;
    for (int i = 0; i < validSamples; i++) {
        if (channels == 2) {
            // stereo
            sample = (uint16_t(outBuff[i * 2]) << 16) | uint16_t(outBuff[i * 2 + 1]);
            *continueI2S = !Bluetooth_Source_SendAudioData(&sample);
        }
        if (channels == 1) {
            // mono
            sample = (uint16_t(outBuff[i]) << 16) | uint16_t(outBuff[i]);
            *continueI2S = !Bluetooth_Source_SendAudioData(&sample);
        }
    }
}

I get clear sound on the BT-headset & about 570 samples per call wich is very good for further optimization. A little glitch is conversion between signed / unsigned int16/32 but it can be managed! Best Dirk

schreibfaul1 commented 5 months ago

I have now transferred it to the Master, https://github.com/schreibfaul1/ESP32-audioI2S/pull/764

github-actions[bot] commented 4 months ago

This issue is stale because it has been open for 30 days with no activity.

github-actions[bot] commented 3 months ago

This issue was closed because it has been inactive for 14 days since being marked as stale.