espressif / esp-adf-libs

Use opus codec with downmix #16

Closed Yohannfra closed 3 years ago

Yohannfra commented 3 years ago

Hello, I am trying to mix opus files with the downmix API.

When the downmixer is in ESP_DOWNMIX_WORK_MODE_BYPASS, the sound plays fine. But as soon as I add a second sound and start mixing (so the mixer is in ESP_DOWNMIX_WORK_MODE_SWITCH_ON), it behaves strangely: the first sound continues to play well, but playback of the second sound is really buggy/jerky.

Before switching to opus I tried with .wav files and it worked great. Is downmixing even possible with opus?

Here is the code I use to init the mixer (C++):

esp_err_t MixerV2::init()
    // Get handle to audio codec chip
    _board_handle = audio_board_get_handle();

    // Create _pipeline_mix pipeline
    _pipeline_cfg = DEFAULT_AUDIO_PIPELINE_CONFIG();
    _pipeline_mix = audio_pipeline_init(&_pipeline_cfg);

    // Create down-mixer element
    downmix_cfg_t downmix_cfg = MY_DEFAULT_DOWNMIX_CONFIG();
    downmix_cfg.downmix_info.source_num = MAX_FILES_IN_DOWNMIX;
    _downmixer = downmix_init(&downmix_cfg);

    for (int i = 0; i < MAX_FILES_IN_DOWNMIX; i++) {
        downmix_set_input_rb_timeout(_downmixer, 0, i);
    }

    esp_downmix_input_info_t source_information[MAX_FILES_IN_DOWNMIX];

    esp_downmix_input_info_t source_info_base = {
        .samplerate = SAMPLERATE,
        .channel = 2,
        .bits_num = 16,
        .gain = {0, 0},
        .transit_time = 100,
    };

    float gains[MAX_FILES_IN_DOWNMIX][2] = {
        {0, -10},
        {-10, 0},
        {0, 0},
        {0, 0},
    };

    for (int i = 0; i < MAX_FILES_IN_DOWNMIX; i++) {
        source_info_base.gain[0] = gains[i][0];
        source_info_base.gain[1] = gains[i][1];
        source_information[i] = source_info_base;
    }
    source_info_init(_downmixer, source_information);

    // Create i2s stream to write audio data to the codec chip
    _i2s_cfg = MY_I2S_STREAM_CFG_DEFAULT();
    _i2s_cfg.i2s_config.sample_rate = SAMPLERATE;
    _i2s_writer = i2s_stream_init(&_i2s_cfg);

    // Register the _downmixer and _i2s_writer elements with the mix pipeline
    audio_pipeline_register(_pipeline_mix, _downmixer, "mixer");
    audio_pipeline_register(_pipeline_mix, _i2s_writer, "i2s");

    // Link elements together _downmixer-->i2s_stream-->[codec_chip]
    const char *link_mix[2] = {"mixer", "i2s"};
    audio_pipeline_link(_pipeline_mix, &link_mix[0], 2);

    // Create resample element
    rsp_filter_cfg_t rsp_sdcard_cfg = DEFAULT_RESAMPLE_FILTER_CONFIG();
    rsp_sdcard_cfg.src_rate = SAMPLERATE;
    rsp_sdcard_cfg.dest_rate = SAMPLERATE;

    // Create Fatfs stream to read input data
    _fatfs_cfg = FATFS_STREAM_CFG_DEFAULT();
    _fatfs_cfg.type = AUDIO_STREAM_READER;

    // Create opus decoder to decode opus file
    _opus_cfg = DEFAULT_OPUS_DECODER_CONFIG();
    _opus_cfg.task_core = 1;

    // Create raw stream writer config; the raw stream receives the decoded opus data
    raw_stream_cfg_t raw_cfg = RAW_STREAM_CFG_DEFAULT();
    raw_cfg.type = AUDIO_STREAM_WRITER;

    for (auto &sound : _sounds) {
        sound.rsp_filter_el = rsp_filter_init(&rsp_sdcard_cfg);
        sound.fatfs_reader_el = fatfs_stream_init(&_fatfs_cfg);
        sound.opus_decoder_el = decoder_opus_init(&_opus_cfg);
        sound.raw_write_el = raw_stream_init(&raw_cfg);
    }

    // Set up event listener
    audio_event_iface_cfg_t evt_cfg = AUDIO_EVENT_IFACE_DEFAULT_CFG();
    _evt = audio_event_iface_init(&evt_cfg);

    for (auto &sound : _sounds) {
        sound.stream_pipeline = audio_pipeline_init(&_pipeline_cfg);
        mem_assert(sound.stream_pipeline);
    }

    // link all pipelines
    for (auto &sound : _sounds) {
        auto numAsStr = std::to_string(sound.index);
        const std::array<std::string, 4> link_tags_tmp = {
            "file_" + numAsStr,
            "opus_" + numAsStr,
            "filter_" + numAsStr,
            "raw_" + numAsStr,
        };
        const char *link_tags[4] = {
            link_tags_tmp.at(0).c_str(),
            link_tags_tmp.at(1).c_str(),
            link_tags_tmp.at(2).c_str(),
            link_tags_tmp.at(3).c_str(),
        };

        audio_pipeline_register(sound.stream_pipeline, sound.fatfs_reader_el, link_tags[0]);
        audio_pipeline_register(sound.stream_pipeline, sound.opus_decoder_el, link_tags[1]);
        audio_pipeline_register(sound.stream_pipeline, sound.rsp_filter_el, link_tags[2]);
        audio_pipeline_register(sound.stream_pipeline, sound.raw_write_el, link_tags[3]);

        audio_pipeline_link(sound.stream_pipeline, &link_tags[0], 4);
        sound.rb = audio_element_get_input_ringbuf(sound.raw_write_el);
        downmix_set_input_rb(_downmixer, sound.rb, sound.index);
        audio_pipeline_set_listener(sound.stream_pipeline, _evt);

    }

    return ESP_OK;

It works well when playing only one sound, but not once I start a second one, and I don't know where the issue could be.

Thanks,

Yohann

Yohannfra commented 3 years ago

SAMPLERATE is defined as:

#define SAMPLERATE 48000

Yohannfra commented 3 years ago

The issue was that we used a sample rate of 48000 Hz with stereo sounds, which is apparently too much for the downmix.

We fixed it by using mono sounds and switching the mixer to ESP_DOWNMIX_OUTPUT_TYPE_ONE_CHANNEL.
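
To give a rough sense of why going mono helps, here is a small back-of-the-envelope sketch of the PCM data rate each decoded source pushes into the mixer. It only does arithmetic on the values from the post (48000 Hz, 16 bits). The helper name pcm_bytes_per_second and the source count of 4 (inferred from the four rows of the gains table) are assumptions for illustration, not part of the ESP-ADF API, and this does not claim to pinpoint the exact bottleneck.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: bytes of decoded PCM per second for one input source.
constexpr std::uint32_t pcm_bytes_per_second(std::uint32_t sample_rate_hz,
                                             std::uint32_t channels,
                                             std::uint32_t bits_per_sample) {
    return sample_rate_hz * channels * (bits_per_sample / 8);
}

constexpr std::uint32_t kSampleRate = 48000;  // SAMPLERATE from the post
constexpr std::uint32_t kBits = 16;           // bits_num from the post
constexpr std::uint32_t kSources = 4;         // assumed value of MAX_FILES_IN_DOWNMIX

// Per-source rate in the original (stereo) and fixed (mono) setups.
constexpr std::uint32_t kStereoRate = pcm_bytes_per_second(kSampleRate, 2, kBits);
constexpr std::uint32_t kMonoRate = pcm_bytes_per_second(kSampleRate, 1, kBits);

// Compile-time checks of the arithmetic.
static_assert(kStereoRate == 192000, "48 kHz stereo 16-bit is 192000 B/s");
static_assert(kMonoRate == 96000, "48 kHz mono 16-bit is 96000 B/s");
static_assert(kStereoRate * kSources == 768000, "worst case with 4 stereo sources");
static_assert(kMonoRate * kSources == 384000, "worst case with 4 mono sources");
```

Under these assumptions, switching every input to mono halves the amount of data the opus decoders, ring buffers, and mixer have to move per second, which is consistent with the fix described above.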