earlephilhower / ESP8266Audio

Arduino library to play MOD, WAV, FLAC, MIDI, RTTTL, MP3, and AAC files on I2S DACs or with a software emulated delta-sigma DAC on the ESP8266 and ESP32
GNU General Public License v3.0

Delta-Sigma algo #544

Open kimstik opened 2 years ago

kimstik commented 2 years ago

It is not clear to me which algorithm is actually used; it looks more like a plain "delta" modulator. My suggestion is to use a classical first-order delta-sigma (DS) modulator instead. It should be a bit faster and, by my estimate, may improve dynamic range (DNR) by about 6 dB (I may be wrong; this needs testing).

void DeltaSigma(int16_t sample[2], uint32_t dsBuff[8])
{
    int32_t sum = (((int32_t)sample[0]) + ((int32_t)sample[1])) >> 1;
    int32_t newSamp = Amplify(sum); //no reason to shift

    int oversample32 = oversample / 32; // no need to divide on every call; why not store oversample32 directly?

    for (int j = 0; j < oversample32; j++) {
        uint32_t bits = 0;
        for (int i = 32; i > 0; i--) {
            bits <<= 1;
            if (cumErr < 0) {
                bits |= 1;
                cumErr += newSamp + INT16_MAX; //cumErr is raw int32_t
            } else {
                cumErr += newSamp + INT16_MIN;
            }
        }
        dsBuff[j] = bits;
    }
}
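
For anyone who wants to try the idea outside the library, here is a minimal standalone sketch of a first-order modulator. It is not the library's code: `Ds1State`, `ds1Modulate`, and `countOnes` are hypothetical names, gain handling is left out, and the mapping of a 1 bit to positive full scale is an assumed convention (the opposite polarity works equally well for audio).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical state holder; a real driver would keep this next to its buffers.
struct Ds1State {
    int32_t acc = 0;  // integrator: running sum of (input - fed-back level)
};

// Convert one 16-bit sample into `words` 32-bit words of 1-bit output, MSB first.
// Assumed convention: a 1 bit means +full scale, a 0 bit means -full scale.
void ds1Modulate(Ds1State &st, int16_t sample, uint32_t *out, int words)
{
    assert(words > 0);
    for (int j = 0; j < words; j++) {
        uint32_t bits = 0;
        for (int i = 0; i < 32; i++) {
            bits <<= 1;
            if (st.acc >= 0) {
                bits |= 1;
                st.acc += sample - INT16_MAX;  // subtract +full scale
            } else {
                st.acc += sample - INT16_MIN;  // subtract -full scale
            }
        }
        out[j] = bits;
    }
}

// Count the 1 bits in a buffer; their density should track the input level.
int countOnes(const uint32_t *buf, int words)
{
    int ones = 0;
    for (int j = 0; j < words; j++) {
        for (uint32_t w = buf[j]; w != 0; w >>= 1) {
            ones += static_cast<int>(w & 1u);
        }
    }
    return ones;
}
```

The accumulator stays within roughly plus or minus 65535 for any 16-bit input, so `int32_t` has plenty of headroom and no saturation is needed inside the loop.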

[Image: fft]

kimstik commented 2 years ago

Another option is to use a higher-order modulator, which is simple to implement:

#define DS_SIZE 2
int32_t err[DS_SIZE] = {};

void DeltaSigma2(int16_t sample[2], uint32_t dsBuff[8])
{
    int32_t sum = (((int32_t)sample[0]) + ((int32_t)sample[1])) >> 1;
    int32_t newSamp = Amplify(sum);

    int oversample32 = oversample / 32;

    for (int j = 0; j < oversample32; j++) {
        uint32_t bits = 0;

        for (int i = 32; i > 0; i--) {
            int16_t feed;
            bits <<= 1;
            if (err[DS_SIZE-1] < 0) {
                bits |= 1;
                feed = INT16_MAX;
            } else {
                feed = INT16_MIN;
            }
            err[0] += newSamp + feed;
            err[1] += err[0]  + feed;
        }
        dsBuff[j] = bits;
    }
}
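
As a rough check of what the two-accumulator loop above does, it can be written with the usual linear noise model. The symbols are my own notation, not from the snippet: $x$ is `newSamp` (held constant across the oversampled steps), $e_1$ and $e_2$ are `err[0]` and `err[1]`, $y$ is the level fed back (that is, `-feed`), and $q$ is the quantization error of the 1-bit decision.

```latex
\begin{aligned}
e_1[n+1] &= e_1[n] + x[n] - y[n] \\
e_2[n+1] &= e_2[n] + e_1[n+1] - y[n] \\
y[n]     &= e_2[n] + q[n] \\
\Rightarrow\quad Y(z) &= z^{-1} X(z) + \left(1 - z^{-1}\right)^{2} Q(z)
\end{aligned}
```

Under the same model the single-accumulator version gives $Y(z) = z^{-1} X(z) + (1 - z^{-1})\,Q(z)$, so the second accumulator adds one more zero at DC to the noise shaping. The linear model does not cover overload: a second-order 1-bit loop can misbehave for inputs close to full scale, so that range would need testing.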

earlephilhower commented 2 years ago

To be honest, I wrote the 1-bit output from scratch without knowing exactly what it was (just applied what I knew about error diffusion dithering to the signal). Looking at the Wikipedia entry, it seemed like delta-sigma but it's been a while since I did any real DSP work so I could be off base. In any case, to my ears it sounded very good considering the low HW and SW requirements.

You're obviously very interested in this and have done some work already. Would you like to throw a PR together, maybe including the small optimizations you've got for the inner loop as well as a configurable selection between the codes? Or, if it's always a slam dunk (i.e. CPU use is the same or lower while providing closer-to-true output), then just replace the conversion directly?

kimstik commented 2 years ago

Well... it should be tested a bit first ;) I like this kind of time-critical optimization. I expect to end up with leaner runtime code. I will need some time to prepare a PR.

One more point: use saturating arithmetic where possible. The function "int16_t Amplify(int16_t s)" is very hot. The Xtensa "CLAMPS" instruction could cut the function's footprint in half.
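
To make the saturation idea concrete, here is a small portable sketch of a gain stage with a saturating conversion to 16 bits. The names `saturate16` and `amplifySat` and the gain format (fixed point with 6 fractional bits, so 64 means 1.0) are assumptions for illustration, not the library's actual `Amplify`. Whether the compiler turns the clamp into a single CLAMPS instruction, and whether a given core implements that instruction, would have to be checked in the generated assembly.

```cpp
#include <cassert>
#include <cstdint>

// Saturating conversion: clamp a 32-bit value to the int16_t range.
static inline int16_t saturate16(int32_t v)
{
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return static_cast<int16_t>(v);
}

// Hypothetical gain stage. gainQ6 is an assumed fixed-point gain with
// 6 fractional bits (64 == 1.0); it must not be negative.
static inline int16_t amplifySat(int16_t s, int32_t gainQ6)
{
    assert(gainQ6 >= 0);
    int32_t v = static_cast<int32_t>(s) * gainQ6 / 64;  // scale back to Q0
    return saturate16(v);
}
```

With a gain of 2.0 (`gainQ6 == 128`), an input of 20000 would overflow 16 bits and is pinned to `INT16_MAX` instead of wrapping around.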