espressif / esp-idf

Espressif IoT Development Framework. Official development framework for Espressif SoCs.
Apache License 2.0

I2S PDM micro on ESP32-S3 has only very low amplitude (IDFGH-7043) #8660

Closed BitSalat closed 2 years ago

BitSalat commented 2 years ago

We are trying to run the I2S recorder example on an ESP32-S3 connected to a Vesper VM3011 PDM microphone.

The issue is that we only get a VERY small amplitude from the I2S port, although the mic is set to max gain.

Changing I2S parameters such as the sampling rate does not improve the behaviour.

We suspect a lack of S3 support for this specific I2S PDM functionality.

L-KAYA commented 2 years ago

Hi @BitSalat, could you please provide more information, such as the IDF version, the I2S configuration, and how the audio is recorded and played back? Does the same issue appear on ESP32?

BitSalat commented 2 years ago

Hi @L-KAYA,

We are using IDF version 4.4.

The code is derived from examples/peripherals/i2s/i2s_audio_recorder_sdcard/. We've changed it to SDMMC mode instead of SPI mode.

i2s_config_t i2s_config = {
    .mode = I2S_MODE_MASTER | I2S_MODE_RX | I2S_MODE_PDM,
    .sample_rate = 96000,
    .bits_per_sample = I2S_BITS_PER_SAMPLE_16BIT,
    .channel_format = I2S_CHANNEL_FMT_ONLY_RIGHT,
    .communication_format = I2S_COMM_FORMAT_STAND_MSB,
    .intr_alloc_flags = ESP_INTR_FLAG_LEVEL1,
    .dma_buf_count = 8,
    .dma_buf_len = 200,
};

The same issue appears on ESP32. The sample rate has no influence on the amplitude.

L-KAYA commented 2 years ago

The configuration looks good.

There are indeed some PDM configurations that are not public in v4.4 due to driver limitations, but those configurations relate to the filters of PDM TX mode.

The only configurable parameter for PDM RX is the down-sampling rate. You can set it to I2S_PDM_DSR_16S with i2s_set_pdm_rx_down_sample, but I'm not sure whether it affects the amplitude.

BitSalat commented 2 years ago

Thanks for investigating. However, varying the down-sampling rate does not change the amplitude at all.

As we intend to use the Vesper VM3011, which is the only MEMS mic with a wake-up trigger, we need the PDM interface working.

Can you propose any further tests we could do?

L-KAYA commented 2 years ago

Another feature on ESP32-S3 is PCM compress mode. Since the input PDM signal is converted to PCM format in hardware, choosing a PCM compress mode may help to amplify the small signal; a-law and u-law can be selected with i2s_pcm_config. If that does not work, the only way might be to apply gain to the data that I2S reads.

BitSalat commented 2 years ago

OK, we tried this and got quite a different result. Our setup is:

i2s_config_t i2s_config = {
    .mode = I2S_MODE_MASTER | I2S_MODE_RX | I2S_MODE_PDM,
    .sample_rate = 16000,
    .bits_per_sample = I2S_BITS_PER_SAMPLE_16BIT,
    .channel_format = I2S_CHANNEL_FMT_ONLY_RIGHT,
    .communication_format = I2S_COMM_FORMAT_STAND_PCM_SHORT,
    .intr_alloc_flags = ESP_INTR_FLAG_LEVEL1,
    .dma_buf_count = 8,
    .dma_buf_len = 200,
    .fixed_mclk = 768000,
};

i2s_pcm_cfg_t pcm_config = {
    .pcm_type = I2S_PCM_A_COMPRESS,
};

The signal inside the PCM file is now much higher. Indeed, it's clipping and extremely distorted, and it is hard to understand what was said ("test, test ...").

image

BitSalat commented 2 years ago

Accidentally closed

L-KAYA commented 2 years ago

There is no need to set communication_format to PCM_SHORT; an MSB format should be OK.

Setting a fixed_mclk is also not recommended, since PDM has its own rule for generating the clock.

BitSalat commented 2 years ago

@L-KAYA thanks again for your investigation. We finally found an issue with the file header, which led to a wrong sample rate when opening the file with Audacity. For some reason we had to set the I2S sample rate to 32 kHz while writing 16 kHz into the file header. However, it's working now for our needs and the VM3011 PDM microphone is OK.

For those running into similar issues, here's our setup:

const char set_wav_header[] = {
        'R','I','F','F', // ChunkID
        file_size, file_size >> 8, file_size >> 16, file_size >> 24, // ChunkSize
        'W','A','V','E', // Format
        'f','m','t',' ', // Subchunk1ID
        0x10, 0x00, 0x00, 0x00, // Subchunk1Size (16 for PCM)
        0x01, 0x00, // AudioFormat (1 for PCM)
        0x01, 0x00, // NumChannels (1 channel)
        16000, 16000 >> 8, 16000 >> 16, 16000 >> 24, // SampleRate
        byte_rate, byte_rate >> 8, byte_rate >> 16, byte_rate >> 24, // ByteRate
        0x02, 0x00, // BlockAlign
        0x10, 0x00, // BitsPerSample (16 bits)
        'd','a','t','a', // Subchunk2ID
        wav_size, wav_size >> 8, wav_size >> 16, wav_size >> 24, // Subchunk2Size
    };

  i2s_config_t i2s_config = {
        .mode = I2S_MODE_MASTER | I2S_MODE_RX | I2S_MODE_PDM,
        .sample_rate = 32000,
        .bits_per_sample = I2S_BITS_PER_SAMPLE_16BIT,
        .channel_format = I2S_CHANNEL_FMT_ONLY_RIGHT,
        .communication_format = I2S_COMM_FORMAT_STAND_MSB,
        .intr_alloc_flags = ESP_INTR_FLAG_LEVEL1,
        .dma_buf_count = 8,
        .dma_buf_len = 200,
        //.fixed_mclk = 768000,
    };

remyhx commented 2 years ago

Hi, after an update (to ESP-IDF 5) I had the problem that my recordings with the VM3011 were 2x. I had to add i2s_set_pdm_rx_down_sample(I2S_NUM_0, I2S_PDM_DSR_8S);

When I analyze the recordings, the gain looks very small. As I understand it, I can't use PCM compress mode on a regular ESP32.

BitSalat commented 2 years ago

Hi @remyhx - we are still struggling with the low gain, probably 4x too low.

Indeed we have set .pcm_type = I2S_PCM_A_COMPRESS - what have you chosen instead?

BTW we are running on an ESP32-S3

remyhx commented 2 years ago

@BitSalat the ESP32 isn't able to use these modes. Even when I shout loudly in front of the VM3011, it does not show enough gain. I also have a +/- 11000 kHz noise peak.

Yesterday I installed the updated (beta) version of the I2S lib (IDF master branch, i2s_pdm) and I noticed a better noise peak, but maybe an even (slightly) lower amplitude. I haven't measured it yet, though.

remyhx commented 2 years ago

@BitSalat I found a solution for the gain: just multiply the I2S stream values by a fixed number. You can use WOS_PGA_GAIN to adapt it to a suitable constant (depending on the ambient dB level).

BitSalat commented 2 years ago

@remyhx - many thanks for your support. I could not find the constant WOS_PGA_GAIN; can you please link me to the right driver? We came to a similar conclusion and did the multiplication this way:

// Start recording
while (flash_wr_size < flash_rec_time) {
    // Read the RAW samples from the microphone
    i2s_read(CONFIG_EXAMPLE_I2S_CH, i2s_readraw_buff, SAMPLE_SIZE, &bytes_read, 100);
    // Scale the data (otherwise the sound is too quiet)
    for (int x = 0; x < SAMPLE_SIZE/2; x++) {
        i2s_readraw_buff[x] = i2s_readraw_buff[x] * 8;
    }
    // Write the samples to the WAV file
    fwrite(i2s_readraw_buff, 1, bytes_read, f);
    flash_wr_size += bytes_read;
}
// ESP_LOGI(TAG, "Recording done!");
fclose(f);
ESP_LOGI(TAG, "File written on SDCard");

This works fairly well, although I think we could increase the multiplier to 10 or 12.
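
One caveat with a plain multiplication is that loud samples can overflow the 16-bit range and wrap around. A minimal sketch of a saturating variant, assuming the buffer holds signed 16-bit samples (`apply_gain_saturating` is a hypothetical helper, not part of the recorder example):

```c
#include <stddef.h>
#include <stdint.h>

/* Multiply each 16-bit sample by an integer gain, clamping to the
 * int16_t range instead of wrapping around on overflow.
 * Hypothetical helper, not part of the recorder example. */
static void apply_gain_saturating(int16_t *samples, size_t count, int gain)
{
    for (size_t i = 0; i < count; i++) {
        /* Widen to 32 bits so the product cannot overflow. */
        int32_t scaled = (int32_t)samples[i] * gain;
        if (scaled > INT16_MAX) scaled = INT16_MAX;
        if (scaled < INT16_MIN) scaled = INT16_MIN;
        samples[i] = (int16_t)scaled;
    }
}
```

In the loop above it would be called with `bytes_read / sizeof(int16_t)` as the sample count, so that only samples actually returned by the read are scaled.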

remyhx commented 2 years ago

@BitSalat no problem, and thanks from my side too, because experience with the VM3011 is scarce at the moment.

I use this function:

void varyGain(uint8_t * buf, int16_t gain)
{
    int16_t tempIS, tempUS;
    float value;

    for (int i = 0; i<(READ_BUFF_SIZE/2); i++) {
        // Reassemble the little-endian signed 16-bit sample
        tempIS = (int16_t)(buf[i*2+1] << 8 | buf[i*2]);
        value = tempIS/32768.0f;
        value = value*gain;
        // Clip to the int16_t range (+1.0 would overflow to -32768)
        if (value>32767.0f/32768.0f) value = 32767.0f/32768.0f;
        if (value<-1) value = -1;
        tempUS = (int16_t)(value*32768);

        buf[i*2] = 0x00FF&tempUS;
        buf[i*2+1] = tempUS>>8;
    }
}

The chosen gain can easily be anywhere from 1 to 40, depending very much on the background noise level. That's where WOS_PGA_GAIN comes in. Currently I am working on a y = ax + b style function to adapt the gain to that WOS_PGA_GAIN number. This number can be found in the VM3011 datasheet, page 12, in the table at address 0x1, bits B0-B4. (You can also determine which levels it may use via address 0x2.) The only drawback is that the I2C communication is audible as a short tick, so you want to use it as much as possible without disturbing the audio too much.
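
The linear mapping described above could be sketched as follows. The register layout (WOS_PGA_GAIN in bits B0-B4 of address 0x1) is taken from the comment, not verified against the datasheet, and both helper names as well as the coefficients `a` and `b` are hypothetical placeholders that would have to be fitted empirically.

```c
#include <stdint.h>

/* Extract the 5-bit WOS_PGA_GAIN field (bits B0-B4) from the raw value
 * of register 0x1, following the layout described in the comment above. */
static uint8_t wos_pga_gain_field(uint8_t reg_value)
{
    return reg_value & 0x1F;
}

/* Map the PGA field to a software gain with y = a*x + b, clamped to the
 * 1..40 range mentioned above. a and b are placeholders to be fitted
 * by measurement; they do not come from the datasheet. */
static float software_gain_from_pga(uint8_t pga, float a, float b)
{
    float gain = a * (float)pga + b;
    if (gain < 1.0f) gain = 1.0f;
    if (gain > 40.0f) gain = 40.0f;
    return gain;
}
```

The result could then be passed as the `gain` argument of a function such as varyGain.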

BitSalat commented 2 years ago

@remyhx Although we can achieve a nice amplitude by multiplying samples, I am still wondering why it isn't possible to see a full amplitude without that "trick". We will now first check whether the PDM mics (we are using Adafruit as well as VM3011) really reach full PDM pulse widths, to see if something is wrong there or at the point of digitizing inside the I2S interface. We need sound samples for inference against a TFLite model and therefore need every bit of the sound with as low a noise floor as possible. I am afraid that just pushing up the gain will also push up the noise, which would not be ideal. BTW we are still on IDF version 4.4.1, as 5.x still has some issues.

remyhx commented 2 years ago

I just sent Vesper a message; hopefully they will reply to the small user :) (although I think this mic could become very popular for Adafruit etc. development boards / ESP32 boards)

In my case the noise is OK, but it feels like repairing a brand new product in software. Seeing how they developed this product, I think improvement should be possible and I am just doing something wrong.

I have to say sound quality for me is better with the new master branch, but temporarily I have to record in stereo to access the left mic channel (known bug at this moment, only right channel for mono). But because it's in the development stage, it's a small issue for now.

remyhx commented 2 years ago

Update: I asked Vesper a second question; they appear not to care or not to read their mail...

BitSalat commented 2 years ago

@remyhx - that does not sound very responsive of Vesper, but anyway we have also tested the Adafruit PDM microphone, which yields similar results. So most likely the issue is inside the PDM converter on the ESP32, not on the mic side. We are considering using an external PDM to I2S converter like the ADAU7002, which also has a power-down mode, so we can use it in our planned battery-powered device.

davidallenmann commented 2 years ago

We are using a Vesper VM3000 MEMS microphone with ESP32-S3.

We find the same issue with the clock being 2x too slow with the v4.4 I2S. It does work properly in the v5.0 IDF. But, there is clearly a bug in the I2S driver with the sample rate using the ESP32-S3 and PDM microphones.

The signal levels we see with the VM3000 are adequate for speech level sounds, and I don't think what you are seeing is an ESP32 issue or even a microphone issue. It is just the sensitivity of the microphone. The spec sheet for the VM3000 shows -26dB full scale for a 94 dB signal, and a 120 dB acoustic overload point. 120 dB is quite loud, and with a 16-bit microphone you theoretically can record sounds 96dB below that. But, you start running into electrical noise issues if you want to record sounds around 40-50 dB.

If you record in stereo you can also do things like average the two channels together to reduce the noise floor a bit.
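
A minimal sketch of that averaging step, assuming two microphones (one per channel) and interleaved 16-bit stereo frames; `stereo_to_mono_average` is a hypothetical helper name:

```c
#include <stddef.h>
#include <stdint.h>

/* Average interleaved stereo frames (L, R, L, R, ...) into mono.
 * stereo must hold 2 * frames samples, mono must hold frames samples. */
static void stereo_to_mono_average(const int16_t *stereo, int16_t *mono, size_t frames)
{
    for (size_t i = 0; i < frames; i++) {
        /* Sum in 32 bits so two full-scale samples cannot overflow. */
        int32_t sum = (int32_t)stereo[2 * i] + (int32_t)stereo[2 * i + 1];
        mono[i] = (int16_t)(sum / 2);
    }
}
```

If both channels carry the same signal and their noise is uncorrelated, averaging can improve the signal-to-noise ratio by up to about 3 dB.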

remyhx commented 2 years ago

@david thank you! Just curious: what is the sampling problem you encounter in v5.0?

BitSalat commented 2 years ago

@davidallenmann Thanks a lot for your input; it helps me to understand the amplitude issue. In fact I completely forgot to check the sensitivity, but now it's clear that with -26 dBFS at 94 dB SPL we will never get a higher amplitude with a normal speech signal. As we need more amplitude, I am now considering using a mic array or an electret mic to achieve this.

davidallenmann commented 2 years ago

I2S in v4.4 with a PDM microphone has an exact 2x pitch shift in the recording. @BitSalat mentioned this: they had to set the WAV file sample rate to 16 kHz when they told I2S to use 32 kHz.
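
Playback pitch is determined by the SampleRate and ByteRate fields of the WAV header, so the header has to declare the rate at which samples were actually captured. A minimal host-side sketch of a 44-byte PCM header builder for 16-bit audio (the helper names are hypothetical and not taken from the recorder example):

```c
#include <stdint.h>
#include <string.h>

#define WAV_HEADER_SIZE 44

/* Store 32-bit and 16-bit values in little-endian byte order. */
static void put_le32(uint8_t *dst, uint32_t v)
{
    dst[0] = (uint8_t)(v & 0xFF);
    dst[1] = (uint8_t)((v >> 8) & 0xFF);
    dst[2] = (uint8_t)((v >> 16) & 0xFF);
    dst[3] = (uint8_t)((v >> 24) & 0xFF);
}

static void put_le16(uint8_t *dst, uint16_t v)
{
    dst[0] = (uint8_t)(v & 0xFF);
    dst[1] = (uint8_t)((v >> 8) & 0xFF);
}

/* Fill a canonical 44-byte PCM WAV header for 16-bit audio.
 * sample_rate should be the rate at which samples were really captured. */
static void build_wav_header(uint8_t header[WAV_HEADER_SIZE], uint32_t sample_rate,
                             uint16_t channels, uint32_t data_bytes)
{
    const uint16_t bits_per_sample = 16;
    const uint16_t block_align = (uint16_t)(channels * (bits_per_sample / 8));
    const uint32_t byte_rate = sample_rate * block_align;

    memcpy(header, "RIFF", 4);                               /* ChunkID */
    put_le32(header + 4, data_bytes + WAV_HEADER_SIZE - 8);  /* ChunkSize */
    memcpy(header + 8, "WAVE", 4);                           /* Format */
    memcpy(header + 12, "fmt ", 4);                          /* Subchunk1ID */
    put_le32(header + 16, 16);                               /* Subchunk1Size */
    put_le16(header + 20, 1);                                /* AudioFormat: PCM */
    put_le16(header + 22, channels);                         /* NumChannels */
    put_le32(header + 24, sample_rate);                      /* SampleRate */
    put_le32(header + 28, byte_rate);                        /* ByteRate */
    put_le16(header + 32, block_align);                      /* BlockAlign */
    put_le16(header + 34, bits_per_sample);                  /* BitsPerSample */
    memcpy(header + 36, "data", 4);                          /* Subchunk2ID */
    put_le32(header + 40, data_bytes);                       /* Subchunk2Size */
}
```

For the situation described in this thread, the header would be built with 16000 even though I2S was configured for 32000, because the affected driver delivered samples at half the requested rate.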

L-KAYA commented 2 years ago

Found the bug: rx_conf.rx_mono and rx_conf.rx_mono_fst_vld should be 0, but they are 1 on v4.4. A fix is in progress.

L-KAYA commented 2 years ago

The bug has been fixed on release/v4.4 in commit 53a5d51a

Alvin1Zhang commented 2 years ago

Thanks for reporting, feel free to reopen.

igniterDJ commented 1 year ago

@BitSalat, could you please share the full code?