pschatzmann / arduino-audio-tools

Arduino Audio Tools (a powerful Audio library not only for Arduino)
GNU General Public License v3.0

Incorrect conversion of uint8 WAV file in playback #1793

Closed vladkorotnev closed 7 hours ago

vladkorotnev commented 8 hours ago

Problem Description

Thanks for the nice project, it saved me in a pinch :-) (quickly upcycling an ESP32 with a fried LNA output into an audio toy for a friend's kid)

However, I've encountered an issue with the resampling code: when an 8-bit PCM WAV file is played through a FormatConverterStream, the output is very noisy and distorted.

The reason seems to be that the samples are interpreted as signed 8-bit integers, whereas the WAV format stores 8-bit PCM as unsigned bytes centered on 128.
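To illustrate the mismatch described above, here is a minimal standalone sketch of the conversion an unsigned 8-bit WAV sample needs before it can be treated as signed 16-bit PCM. The helper names `u8_to_s16` and `convert_u8_to_s16` are hypothetical and are not part of arduino-audio-tools; the only assumption is that the input is unsigned 8-bit PCM with silence at 128.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not a library function): map one unsigned 8-bit
// PCM sample (0..255, silence at 128) to a signed 16-bit sample.
constexpr int16_t u8_to_s16(uint8_t sample) {
    // Re-center around zero (-128..127), then scale by 256 so the value
    // occupies the upper byte of the 16-bit word.
    return static_cast<int16_t>((static_cast<int>(sample) - 128) * 256);
}

// Hypothetical helper: convert a whole buffer of samples.
inline void convert_u8_to_s16(const uint8_t* src, int16_t* dst,
                              std::size_t count) {
    assert(src != nullptr && dst != nullptr);
    for (std::size_t i = 0; i < count; ++i) {
        dst[i] = u8_to_s16(src[i]);
    }
}
```

With this mapping, the bytes 0, 128 and 255 become -32768, 0 and 32512. If the same bytes are instead reinterpreted as int8_t, silence (128) turns into -128 and every sample at or above the midpoint wraps to a negative value, which would explain the noise.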

Device Description

ESP32 WROVER, but I think it's not device specific

Sketch

AnalogAudioStream out;
WAVDecoder decoder;

void init_wav() {
    AudioToolsLogger.begin(Serial, AudioToolsLogLevel::Error);
    auto config = out.defaultConfig();
    config.sample_rate = 44100; // <- for some reason ESP32 cannot start with any other sampling rate... I have this problem in my own projects too
    config.channels = 1;
    config.bits_per_sample = 16;
    out.begin(config);
}

void play_wav(const char * name) {
    File f = LittleFS.open(name);
    if(!f) return;

    EncodedAudioStream *s = new EncodedAudioStream(&f, &decoder);
    FormatConverterStream  *rss = new FormatConverterStream(*s);
    StreamCopy * copier = new StreamCopy(out, *rss);
    const AudioInfo from(16000, 1, 8);
    const AudioInfo to(44100, 1, 16);

    s->begin();
    rss->begin(from, to);
    copier->copyAll(1);

    f.close();
    delete rss;
    delete s;
    delete copier;
}

void setup() {
    if(!LittleFS.begin(false, "/disk", 128)) ESP_LOGE(LOG_TAG, "Mount fail");
    init_wav();
    play_wav("/test_8bit_16khz_mono.wav");
    vTaskDelete(NULL);
}

Other Steps to Reproduce

The following patch to AudioStreamsConverter.h fixes 8-bit playback, but because it replaces the generic conversion with an unsigned 8-bit loop, it breaks all other formats without proper integration:

378,379c379,386
<       NumberConverter::convertArray<TFrom, TTo>(
<           data_source, (TTo *)buffer.data(), samples, gain);
---
>       TTo* dest = (TTo *)buffer.data();
>       uint8_t *src = (uint8_t *)data;
>       for(int i = 0; i < samples; i++) {
>         int8_t sgn = ((int16_t)src[i]) - 127;
>         dest[i] = NumberConverter::clipT<int16_t>( ((int32_t)sgn * 256 - 128) * 3 );
>       }
>       // NumberConverter::convertArray<TFrom, TTo>(
>       //     data_source, (TTo *)buffer.data(), samples, gain);

What is your development environment

ESP32 + PlatformIO (espressif32 + arduino latest)

I have checked existing issues, discussions and online documentation

pschatzmann commented 7 hours ago

This is correct: the FormatConverter only supports signed data types. The output of 8-bit data via the ESP32 I2S is even messier, because it expects an int16_t in which the signed audio byte is left-shifted by 1 byte.

You can chain another EncodedAudioStream with a DecoderL8, which supports both signed and unsigned bytes, to convert it into a proper stream of int16_t.

vladkorotnev commented 7 hours ago

Oh, that's good to know. I have to admit this was done on a whim, and somehow I didn't come across this decoder via Google. So I went with the good ol' peek and poke approach :-) Thanks for the pointer!

The esp was already decapitated and submerged into epoxy but it's good to know for later! Thanks so much!

P.S.

"it expects an int16_t which is left shifted by 1 byte."

The deeper I dive into this platform, the more I am reassured it is cursed by design...