LIFsCode / ELOC-3.0

Firmware for ELOC 3.0 Hardware
MIT License

44KHz recordings adds noise with ICS-43434 #30

Open EDsteve opened 1 year ago

EDsteve commented 1 year ago

I did some recordings at different sample rates and found that there is noise at 44 kHz. The sound recordings are in 44vs32KHz.zip. The spectrogram shows the noise at 44 kHz as well: the curve on the right is not smooth and additional noise is introduced. The curve also seems to be shorter: 44K_prob

This happens with the ICS-43434 and did not happen with the older microphone (ICS-43432). Hope there is an easy fix.

OOHehir commented 1 year ago

@EDsteve I have very little knowledge when it comes to sound stuff, and I imagine you've already considered this, but do you think some of the noise shown in the 44 kHz diagram is harmonics? Or perhaps some sort of noise originating from the board or device itself?

Additionally, what frequencies are typically generated by elephant calls? As you're no doubt aware, the sampling frequency needs to be at least twice the highest frequency that needs to be captured (i.e. the Nyquist rate).
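To make the sampling rule above concrete, here is a minimal C sketch. The function names are made up for this illustration and are not part of the ELOC firmware; the sketch only encodes the factor-of-two bound and ignores the extra margin a real anti-aliasing filter needs.

```c
#include <assert.h>
#include <stdint.h>

/* Lower bound on the sample rate (Hz) needed to capture content up to
 * max_signal_hz. This is only the 2x bound from the sampling theorem;
 * a real recorder would add margin on top of it. */
uint32_t min_sample_rate_hz(uint32_t max_signal_hz)
{
    assert(max_signal_hz <= UINT32_MAX / 2u);  /* avoid overflow */
    return 2u * max_signal_hz;
}

/* Highest frequency (Hz) that a given sample rate can represent. */
uint32_t highest_capturable_hz(uint32_t sample_rate_hz)
{
    return sample_rate_hz / 2u;
}
```

Under these assumptions, a 16 kHz recording covers content up to 8 kHz, and a 44.1 kHz recording covers content up to 22.05 kHz.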

EDsteve commented 1 year ago

@OOHehir Okay. Thanks for checking it. I am pretty certain it is not noise from the PCB. And for our elephant project the 44 kHz recording is not needed. But the ELOC will be used for other animals later on, which need 44 kHz.

Maybe @LIFsCode can take a look when you have time. This is not high priority now, but sooner or later this issue needs to be solved. The I2S timings of the ICS-43432 and ICS-43434 are almost the same, but not 100% identical, if I interpreted the datasheet correctly. Maybe that is the cause?

LIFsCode commented 12 months ago

@EDsteve Could you provide some additional info for reproducing the issue:

  1. Which FW version did you use?
  2. Which config did you use? (full config please) What is the setting of "MicUseTimingFix"?
  3. What was the ELOC 2.x HW? Which mic did Tom develop the code for?
  4. Which power supply was used? Battery? Solar? USB?
LIFsCode commented 12 months ago

Looking at the 44kHz .wav file you posted I notice the following:

[image] [image]

I wouldn't assume an issue with the I2S timing at this point, because that would be completely random throughout the entire recording. But the noise is added in concentrated blocks and there are several sections without any noise. I know this is no proof, but it makes an I2S timing issue less likely as the root cause.

However, when I look at the ELOC 3.2 HW layout I notice that the I2S lanes are crossed by the SDIO lanes. This could cause some EMI. Also, SD_DAT0 and WS_I2S_MIC are placed next to each other, unterminated, on the pin headers; this acts as an antenna and could couple the 2 signals. This is new on ELOC 3.2 and differs from the ELOC 3.0 HW, which could explain the different behavior of the 2 mics. I'm still highly convinced that the 2 mics should not make any difference (even though their datasheets are very imprecise). This could be a possible root cause; at least it would explain the block-like behavior of the noise, as the SD card is written in blocks.

[image]

I can try to measure those signals, but this won't be easy, so I wouldn't expect too much. However, I could try changing the SD card write behavior to see if the recording characteristic changes.

So my next steps will be:

  1. Try to reproduce the original issue with my HW
  2. Measuring the I2S signal quality
  3. Adjusting the SD card write behavior to check for influences on the noise
EDsteve commented 12 months ago

@LIFsCode

> @EDsteve Could you provide some additional info for reproducing the issue :
>
>   1. Which FW version did you use
>   2. Which config did you use? (full config please) what is the setting of "MicUseTimingFix"?
>   3. What was ELOC 2.x HW? which mic did Tom developed the code for?
>   4. Which power supply was used? Battery? Solar? USB?

  1. Too long ago. I assume the firmware which was on github on 10th October :)
  2. Can't say any more. Sorry.
  3. It was developed for the INMP441
  4. LiFePo4 Battery

After some tests: I connected the ICS-43432 externally to the pin headers as the RIGHT channel and get the same "strange" sound recordings with both microphones (ICS-43434 and ICS-43432) on the ELOC 3.2 at 44 kHz. I also uploaded the latest firmware to the ELOC 3.0 HW, which uses the ICS-43432, and I get the same results.

I am surprised, because I could swear that I had proper recordings with the ICS-43432 before. But I can't remember the circumstances or the HW I used. I will do more tests and report back with more results.

OOHehir commented 12 months ago

@LIFsCode @EDsteve While testing the automatic gain feature I also noticed a rhythmic distortion. After ruling out some other options, I traced it to the moments when SD writes were happening. It should be possible to confirm this by changing the size (in seconds) of the wav buffer here. By increasing it from its current 2 secs to, say, 3 or 4, you should notice that the period of the noise/distortion changes, occurring whenever the SD is written to.

From memory, the distortion generally started approx. 7 or 8 secs into a file, and usually after an interval of low noise. Once loud noises occurred it disappeared, but it resumed again after an interval of 7 or 8 secs.

OOHehir commented 12 months ago

@EDsteve @LIFsCode Also I was using a sample rate of 16kHz

EDsteve commented 12 months ago

@LIFsCode Regarding point 3: I am not sure if 44 kHz recordings have ever worked with the INMP441, because it was never of interest to us until now.

I have tested the ICS-43434 and the ICS-43432 on my breadboard setup (only the ESP32 with an SD card, no sensors or anything else connected) and I am getting the same results. Both microphones produce noise at 44 kHz. So it might not be the routing on the PCB? I also noticed that the noise always starts after 5 seconds. Within the first 5 seconds the sound is clean.

Here are the recorded files from the breadboard, including the config (heavy rain all night, so don't worry about the rain noise :): ICS_43434 and ICS-43432 on breadboard.zip

LIFsCode commented 12 months ago

@EDsteve: Sorry, but I fear your breadboard setup is even worse from an EMI point of view ;-)

I don't have my setup available today and tomorrow; I could start tests on Thursday.

@OOHehir Could you increase the buffer time to the file size, e.g. 20 seconds? If the theory of the SD card interfering with the I2S is correct, we would not see any distortion within the first file.

If not, I could run the tests myself on Thursday. I will also try storing the WAV files on the internal SPIFFS and only moving them to the SD card after recording, to get rid of any SD card transactions.

LIFsCode commented 12 months ago

@EDsteve I just realized the timing fix option was enabled during your last tests. Could you try with "MicUseTimingFix": false?

Just in case it's not the SD card interference, I would expect the following:

  • Change in volume -50%
  • effect on the noise behavior if it's not caused by the SD card writing

I must apologize for the "MicUseTimingFix" option; it is the most confusing and worst documented option. I'm still not sure what the original intention was, but I will try to clarify it for the future and will replace it with more detailed, better explained parameters.

EDsteve commented 12 months ago

@LIFsCode Morning :) After some more tests I think we are getting closer to the root of the problem. It seems the APLL clock makes a big difference. All tests were made on the ELOC 3.2 using the onboard ICS-43434. The recordings with the config files are in the attachments below. APLL true gives us problems with sound quality in 44K recordings, but reduces the power consumption a lot with 16K recordings. Hopefully these tests will help to narrow down the problem. Let me know if I should do different tests.

| Nr. | Sample rate (Hz) | APLL | TimingFix | Sound quality | Current draw (mA) | Notes |
|---|---|---|---|---|---|---|
| 1 | 44000 | TRUE | TRUE | distorted | 23 | |
| 2 | 44000 | TRUE | FALSE | distorted, high pitched | 24 | My voice sounds like on helium |
| 3 | 44000 | FALSE | TRUE | Perfect | 25 | |
| 4 | 44000 | FALSE | FALSE | Perfect | 25 | |
| 5 | 16000 | TRUE | TRUE | Perfect | 18 | |
| 6 | 16000 | TRUE | FALSE | Perfect | 18 | |
| 7 | 16000 | FALSE | TRUE | Perfect-ish | 27 | Spectrogram more blurry compared to tests 5 and 6 |
| 8 | 16000 | FALSE | FALSE | Perfect-ish | 26 | Spectrogram more blurry compared to tests 5 and 6 |
| 9 | 44100 | TRUE | TRUE | distorted, high pitched | n/a | Much higher pitch and more distortion compared to test 2 |
| 10 | 44100 | TRUE | FALSE | distorted, high pitched | n/a | Much higher pitch compared to test 2 |

APLLvsTimingFix.zip

EDsteve commented 12 months ago

> @EDsteve I just realized the timing fix option was enabled during you last tests. Could you try with "MicUseTimingFix": false?
>
> Just in case it's not the SD card interference I would expect the following:
>
>   • Change in volume -50%
>   • effect on the noise behavior if it's not caused by the SD card writing
>
> I must apologize for the "MicUseTimingFix" option this is the most confusing and worst documented option. I'm still not sure about what has been the original intention, but I will try to clarify it for the future and will replace it with more detailed better explained parameters

MicUseTimingFix: Our code started with atomic14's code from two years ago. A freelancer modified the code to make it work with SDIO SD cards. This can be found here. Tom used that code, implemented it into our firmware and modified it further, which is what you worked with. I can't find MicUseTimingFix in the idf-was-sdcard code. So Tom must have implemented that either to reduce power consumption or to fix other problems. Or maybe to break things unintentionally ;)

LIFsCode commented 12 months ago

Cool, many thanks. That's good data and will help a lot.

EDsteve commented 12 months ago

atomic14 (GitHub) was kind of the first guy who made I2S microphones work on the ESP32 for the maker scene. But he says that his code is outdated and points to this repository, which should be much "better"? Not sure if that helps anybody, but maybe it can save you some time.

LIFsCode commented 12 months ago

I just realized that this must have been a known issue for quite some time. At least there has been a code segment which forces apll=false if the sampling rate is > 32 kHz, but unfortunately without much explanation or details.

With your test results, @EDsteve, this seems to be exactly the problem addressed.

I will dig into it a bit more.

LIFsCode commented 12 months ago

Actually, the so-called "MicUseTimingFix" is more of an "SPH0645_fix"; that is also its name in atomic14's code.

It actually has 2 effects:

  1. Setting Philips format instead of MSB format for I2S, which is also set by i2s_mic_Config.communication_format = I2S_COMM_FORMAT_STAND_I2S or the equivalent i2s_mic_Config.communication_format = I2S_COMM_FORMAT_I2S. [image]

  2. Shifting the receive window of the I2S core by 2 clock cycles (it is not clear which clock is meant; I have started a request to clarify this). NOTE: This clock shift is only required for the SPH0645, which outputs data on the rising edge of CLK (I2S), the same edge on which the ESP samples. However, it has different effects if the mic outputs data on the falling edge (which the ICS-43434, the ICS-43432 and also the INMP441 do). For these mics, using this option changes the receiver characteristics in a way that is not understood and adds potential for errors.

I will remove this option and replace it with optional config parameters that directly set the respective ESP timing registers.

LIFsCode commented 12 months ago

@EDsteve Just a question about tests 7 & 8: isn't the spectrogram of 7 & 8 better than that of 5 & 6? In 5 & 6 I can see some periodic blocks of noise, even though they are much less significant than at higher sample rates, while for 7 & 8 (without APLL) I don't see them.

16 kHz APLL = true

[image]

16 kHz APLL = false

[image]

EDsteve commented 12 months ago

@LIFsCode I should have mentioned that I did not make these recordings under perfect conditions. I just recorded on my desk, and these periodic blocks are actually hammering sounds from my neighbour. Coincidentally only during the 440 Hz tone :) After taking a closer look at the same two spectrograms: the blurry horizontal line around 6K tricked me into thinking that the recording is more "blurry" with APLL OFF. But it seems not. The 6K line is just interference which came from the buzzer (I am pretty certain), and it just looks different when another clock is used, it seems. But both recordings look fine to me. I can do better tests in my fridge if you wish?

About the APLL "problem": Is it possible to toggle between APLL and non-APLL while the ELOC is running? In that case we can just use APLL for everything under 32 kHz and APLL OFF for everything above, because at 44 kHz the power draw seems to make no difference between APLL ON and OFF.

LIFsCode commented 12 months ago

@EDsteve Yes, that has been the workaround Tom used: switching to APLL=false when using sample rates > 32 kHz. So we could stick with that workaround and simply override the APLL setting from the config in the meantime.
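A minimal sketch of such an override, assuming the 32 kHz limit mentioned above. The function and macro names are hypothetical and not taken from the ELOC firmware; the returned value would then be passed on to the I2S driver configuration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed limit, taken from the ">32 kHz" rule discussed in this thread. */
#define APLL_MAX_SAMPLE_RATE_HZ 32000u

/* Returns the APLL setting to actually use: the configured value is
 * honoured up to the limit and forced to false above it. */
bool effective_use_apll(uint32_t sample_rate_hz, bool configured_use_apll)
{
    if (sample_rate_hz > APLL_MAX_SAMPLE_RATE_HZ) {
        return false;
    }
    return configured_use_apll;
}
```

With this rule, a config that asks for APLL keeps it at 16 kHz and at exactly 32 kHz, but loses it at 44.1 kHz.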

However, based on your recording I would suspect a deeper issue. I'm pretty certain that the blocks are not your neighbor; I can see them in the 16K_APLL-TRUE_TFIX-FALSE recording as well, without hammering ;-)

But we are already on the right track :) so I can sum it up.

Summary

Potential root causes:

  1. A timing issue within the receive I2S core which depends on the I2S core master clock: this can be evaluated by trying out different combinations for the I2S_TIMING_REG registers.
  2. A Dynamic Frequency Scaling (DFS) issue. I2S locks the DFS if used without the APLL clock. This could be tested by setting {"config" : {"cpuMinFrequencyMHZ":80, "cpuEnableLightSleep": false}}, which should disable DFS.
  3. A jitter issue with APLL. The result looks exactly like how a jittering clock would affect a sigma-delta ADC, which is what the I2S mic uses. APLL is supposed to be highly accurate, but this must be verified by measurements on the I2S clock & WS signals.
EDsteve commented 12 months ago

@LIFsCode I am sorry to say, but I am still not so sure about the hammering. I remember that my neighbour was hammering a lot during that day, and when I listen to the sound with earphones, that's exactly how it sounds. To find out, I will just make new recordings. Then we know and can hopefully eliminate this "noise" :) Will report back with new recordings soon.

Just to confirm: Does the ESP32 need to restart in order to change between APLL OFF and ON? So when I select 44 kHz recording, can the ELOC switch to APLL OFF while it is running?

EDsteve commented 12 months ago

@LIFsCode APLL_16K_no-hammer

No hammering any more at night time :) It's a bit more effort, but if it helps you in any way to make sound tests in my fridge. I am ready :)

16K_APLL_now_hammer.zip

LIFsCode commented 12 months ago

No restart of the ESP is needed. But recording has to be off, when the config is changed.

OK, the new recordings seem better. Switching off APLL usage at higher sample rates is definitely a good workaround (Tom suggested > 32 kHz). I can add that.

Anyway, I would like to understand what is happening here. I don't feel good having a workaround for something which is not understood.

OOHehir commented 12 months ago

@EDsteve @LIFsCode Further to my last comment above the noise I was getting on the recordings only happens when the 'automatic gain' feature is enabled. Perhaps there is some distortion occurring when there is an abrupt change in sound level as the gain is changed. I'll investigate implementing the gain change at 'zero crossing', i.e. when the sound level is at, or very close to zero. Perhaps that could solve the issue.
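As a rough illustration of the zero-crossing idea, here is a minimal C sketch. It is not the ELOC automatic gain code: the function names, the integer gain factors and the 16-bit sample type are all assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Clamp a 32-bit intermediate result to the 16-bit sample range. */
static int16_t saturate16(int32_t v)
{
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return (int16_t)v;
}

/* Applies old_gain up to the first zero crossing in the block (a sign
 * change between neighbours, or an exact zero after the first sample)
 * and new_gain from there on. Returns the index where the new gain took
 * effect, or n if no crossing was found and old_gain was kept throughout. */
size_t apply_gain_at_zero_crossing(int16_t *samples, size_t n,
                                   int32_t old_gain, int32_t new_gain)
{
    size_t switch_at = n;

    /* Locate the crossing on the unmodified input first. */
    for (size_t i = 1; i < n; i++) {
        if (samples[i] == 0 || (samples[i - 1] < 0) != (samples[i] < 0)) {
            switch_at = i;
            break;
        }
    }
    for (size_t i = 0; i < n; i++) {
        int32_t gain = (i < switch_at) ? old_gain : new_gain;
        samples[i] = saturate16((int32_t)samples[i] * gain);
    }
    return switch_at;
}
```

For the block {100, 50, -50, -100} with the gain going from 1 to 2, the switch happens at index 2 and the block becomes {100, 50, -100, -200}. A caller that gets n back would carry the pending gain change over to the next block.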

LIFsCode commented 12 months ago

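@OOHehir Yes, that's clear: modifying the gain dynamically will distort the frequency spectrum. This will happen even if you limit the gain change to zero crossings. Limiting gain changes to zero crossings keeps the signal cleaner in the time domain, but the effect on the frequency domain stays the same.

I'm not sure about your signal processing background, so sorry if the next points are completely obvious to you, but just as a note about the effect of dynamic gain adjustment on the spectrum of the signal:

* Applying gain changes during runtime can be seen as a multiplication in the time domain with a step or square wave function. E.g. changing the gain from 0.25 to 1 is a multiplication with a square function [0.25; 1]
  ![grafik](https://user-images.githubusercontent.com/109753539/283141579-789bfdee-ff60-4a87-9736-083a771f350c.png)

* In the frequency domain a square function will look something like [this](http://www.dspguide.com/ch11/2.htm)
  ![grafik](https://user-images.githubusercontent.com/109753539/283141959-c5aae100-7c7a-463c-8f98-05f3703b666c.png)

* When you multiply your signal in the time domain, this results in a [convolution](http://www.dspguide.com/ch10/5.htm) in the frequency domain

* This frequency domain convolution is an inherent part of gain adjustment during recording and is inevitable.

The question is whether this is a problem for our application or not. I think this should be discussed separately as part of the advantages and drawbacks of dynamic gain adjustment.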

I would assume this is independent of the noise problems at high sample rates
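The spectral effect of a gain step can be checked numerically with a small C sketch. Everything here is an assumption made for illustration: a 16-point DFT, a test tone at bin 2, and a gain step in the middle of the block, which happens to fall on a zero crossing of the tone. The twiddle factors are built by repeated rotation so the sketch does not depend on libm.

```c
#include <assert.h>
#include <stddef.h>

#define N 16        /* DFT length (illustrative) */
#define TONE_BIN 2  /* test tone: 2 cycles per N samples */

/* cos/sin of 2*pi*k/N for k = 0..N-1, built by repeated rotation. */
static void make_twiddles(double c[N], double s[N])
{
    const double c1 = 0.9238795325112867; /* cos(2*pi/16) */
    const double s1 = 0.3826834323650898; /* sin(2*pi/16) */
    c[0] = 1.0;
    s[0] = 0.0;
    for (int k = 1; k < N; k++) {
        c[k] = c[k - 1] * c1 - s[k - 1] * s1;
        s[k] = s[k - 1] * c1 + c[k - 1] * s1;
    }
}

/* Fraction of the signal energy that lands outside the tone's own DFT
 * bins when gain_a is applied to the first half of the block and gain_b
 * to the second half. */
double leakage_fraction(double gain_a, double gain_b)
{
    double c[N], s[N], x[N];
    double total = 0.0;

    make_twiddles(c, s);
    for (int n = 0; n < N; n++) {
        double gain = (n < N / 2) ? gain_a : gain_b;
        x[n] = gain * s[(TONE_BIN * n) % N];  /* sine tone times gain */
        total += x[n] * x[n];
    }
    assert(total > 0.0);  /* at least one gain must be non-zero */

    /* DFT at TONE_BIN; bin N - TONE_BIN mirrors it for a real signal. */
    double re = 0.0, im = 0.0;
    for (int n = 0; n < N; n++) {
        re += x[n] * c[(TONE_BIN * n) % N];
        im -= x[n] * s[(TONE_BIN * n) % N];
    }
    double tone_energy = 2.0 * (re * re + im * im) / N;  /* Parseval scaling */

    return 1.0 - tone_energy / total;
}
```

With a constant gain, essentially all energy stays in the tone's bins. With the step from 0.25 to 1 used as the example above, roughly a quarter of the energy is spread into other bins, even though the step sits on a zero crossing of the tone.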

LIFsCode commented 12 months ago

Update on this one. Sorry, only a short one; I will come up with the details tomorrow.

I'm pretty certain I found a trace to track down the issue. I compared what I captured on the I2S bus with what is in the WAV file. I found samples which were sent by the mic but are missing in the WAV file.

This is a valuable trace, as random dropped samples would exactly cause the kind of noise in the spectrum.

I'm still not confident enough on this to call it a breakthrough, but it's a start. This could point to the DMA of the I2S engine in the ESP as a root cause.

The transfer from mic to esp seems fine. I have not found any corrupted samples, only missing samples.

I will do more tests and add a detailed result tomorrow

OOHehir commented 11 months ago

@LIFsCode Probably unrelated, but the buffer setup used in the code seems, well, random:

.dma_buf_count = I2S_DMA_BUFFER_COUNT, // so 2000 sample buffer at 16khz sr gives us 125ms to do our writing
.dma_buf_len = I2S_DMA_BUFFER_LEN, // 8 buffers gives us half second

On the branch I'm on this evaluates to: dma_buf_count = 18, dma_buf_len = 1000

This is the setup I've seen elsewhere: .dma_buf_count = 8, .dma_buf_len = 512,

Also .intr_alloc_flags = I2S_INTR_PIRO, evaluates to 1 << 2

The edge impulse code examples use .intr_alloc_flags = 0,

I haven't investigated the impact of the difference.
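To compare the two buffer setups, the amount of audio they can hold can be worked out with a small C helper. This is a sketch under the assumption that dma_buf_len counts samples per channel; the function name is made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Milliseconds of audio held by dma_buf_count buffers of dma_buf_len
 * samples each at the given sample rate (rounded down). */
uint32_t dma_buffer_time_ms(uint32_t dma_buf_count, uint32_t dma_buf_len,
                            uint32_t sample_rate_hz)
{
    assert(sample_rate_hz > 0);
    uint64_t total_samples = (uint64_t)dma_buf_count * dma_buf_len;
    return (uint32_t)(total_samples * 1000u / sample_rate_hz);
}
```

With the values quoted above, 18 buffers of 1000 samples hold 1125 ms at 16 kHz but only 408 ms at 44.1 kHz, while 8 buffers of 512 samples hold 256 ms at 16 kHz.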

OOHehir commented 11 months ago

> > @EDsteve @LIFsCode Further to my last comment above the noise I was getting on the recordings only happens when the 'automatic gain' feature is enabled. Perhaps there is some distortion occurring when there is an abrupt change in sound level as the gain is changed. I'll investigate implementing the gain change at 'zero crossing', i.e. when the sound level is at, or very close to zero. Perhaps that could solve the issue.
>
> @OOHehir Yes, that's clear: modifying the gain dynamically will distort the frequency spectrum. This will happen even if you limit the gain change to zero crossings. Limiting gain changes to zero crossings keeps the signal cleaner in the time domain, but the effect on the frequency domain stays the same.
>
> I'm not sure about your signal processing background, so sorry if the next points are completely obvious to you, but just as a note about the effect of dynamic gain adjustment on the spectrum of the signal:
>
> * Applying gain changes during runtime can be seen as a multiplication in the time domain with a step or square wave function. E.g. changing the gain from 0.25 to 1 is a multiplication with a square function [0.25; 1]
>   ![grafik](https://user-images.githubusercontent.com/109753539/283141579-789bfdee-ff60-4a87-9736-083a771f350c.png)
>
> * In the frequency domain a square function will look something like [this](http://www.dspguide.com/ch11/2.htm)
>   ![grafik](https://user-images.githubusercontent.com/109753539/283141959-c5aae100-7c7a-463c-8f98-05f3703b666c.png)
>
> * When you multiply your signal in the time domain, this results in a [convolution](http://www.dspguide.com/ch10/5.htm) in the frequency domain
>
> * This frequency domain convolution is an inherent part of gain adjustment during recording and is inevitable.
>
> The question is whether this is a problem for our application or not. I think this should be discussed separately as part of the advantages and drawbacks of dynamic gain adjustment.
>
> I would assume this is independent of the noise problems at high sample rates

It's been quite a while since I looked at this sort of stuff, thanks for the catch!

LIFsCode commented 11 months ago

Status Update & Solution

I will make this one a bit more extensive, in the hope that it gives some background to clarify the issue.

1. Experiments and observations

  1. "Noise" effects occur at 41 kHz. They were reportedly seen even at rates > 32 kHz (Tom's code), but no data is available for those sample rates.
  2. Issue only observed with setting "APLL = true"
  3. "Timing fix" has no provable effect on this issue, discussed in this topic.
  4. Effects of Power Management (PM)
    • Issue only observed with Power Management (PM) enabled
    • Light sleep has no influence
    • Changing the min CPU frequency to >= 20 MHz --> no issues seen anymore
  5. I2S bus measurements:
    • I2S signals all look good (analog & digital) --> It is not a problem of the output of the ESP, nor of the I2S mic
    • APLL = true results in significantly lower jitter on WS & SCK --> This is generally good for sound quality. Jitter on sigma-delta ADCs, as used in an I2S mic, results in noise within the signal
    • Data of the I2S mic changes on the correct CLK edge --> no need for "Timing fix"
    • I cannot add the measurements here, as they are too big (~1 GB)
  6. I2S timing registers do not affect the issue (this matches 5, as I2S looks good)
  7. Comparing the raw WAV file content with the capture from the I2S bus
    • No bit flips have been observed throughout the whole WAV file
    • Some samples which have been seen on the I2S bus are missing within the WAV file.
    • These are single samples (~6 single samples missing within a 128 sample block); before and after, everything looks good
    • This shows a screenshot of a comparison (left: I2S capture, right: WAV file in hex viewer) [image]
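The comparison in the last item can be sketched in C as follows. This is not the tool used for the measurements above: the function name and the 32-bit sample type are assumptions, and the sketch only works if the WAV data is the bus capture with some samples removed and nothing altered or inserted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Counts samples present in `bus` (capture from the I2S lines) but absent
 * from `wav` (data read back from the file). Walks both streams in order
 * and treats every bus sample that does not match the next WAV sample as
 * dropped. Returns -1 if the WAV data cannot be explained by drops alone. */
long count_dropped_samples(const int32_t *bus, size_t bus_len,
                           const int32_t *wav, size_t wav_len)
{
    size_t w = 0;
    long dropped = 0;

    for (size_t b = 0; b < bus_len; b++) {
        if (w < wav_len && bus[b] == wav[w]) {
            w++;        /* sample made it into the file */
        } else {
            dropped++;  /* sample seen on the bus only */
        }
    }
    return (w == wav_len) ? dropped : -1;
}
```

For a bus capture {1, 2, 3, 4, 5, 6} and a file containing {1, 2, 4, 6}, the function reports 2 dropped samples. A bit flip in the file would make the function return -1 instead, which matches the distinction drawn above between corrupted and missing samples.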

2. Excluded error causes

  1. "Timing fix" is most likely not related
  2. PCB EMI effects can be excluded (otherwise it wouldn't work without APLL)
  3. Jitter or clock quality can be excluded (see 1.5)
  4. No problem within the I2S input stage of the ESP (no bit flips, see 1.7). A timing or EMI issue would result in bit flips, not in whole samples missing
  5. APLL itself is not the cause of the issue: APLL without PM works fine (see 1.4). However, APLL usage affects the power mode of the ESP32. Without APLL, the I2S driver takes the ESP_PM_APB_FREQ_MAX lock, which keeps the CPU from dropping to min_freq_mhz; with APLL that lock is not taken, so the CPU can run at min_freq_mhz
  6. The WAV writer can be excluded, because it would cause larger consecutive blocks of missing data (at least 1 DMA frame), not single samples

3. Explanation

4. Solution

Best guess so far is to override the min_freq_mhz config setting based on the chosen sample frequency.

Since I don't have a better guess for the limit, I suggest the 32 kHz from Tom's code.

I wouldn't change the APLL setting and would keep it at true, as it results in much better jitter stability (see 1.5).
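A minimal sketch of that override in C. The names are hypothetical and not taken from the ELOC firmware, and the 20 MHz floor is an assumption based on the power management observation in section 1; the actual fix may use a different value.

```c
#include <assert.h>
#include <stdint.h>

#define HIGH_RATE_THRESHOLD_HZ 32000u /* limit suggested above (Tom's code) */
#define HIGH_RATE_MIN_CPU_MHZ  20u    /* assumption: lowest value reported as issue-free */

/* Returns the minimum CPU frequency (MHz) to hand to power management:
 * the configured value, raised to the floor for high sample rates. */
uint32_t effective_min_cpu_freq_mhz(uint32_t sample_rate_hz,
                                    uint32_t configured_min_mhz)
{
    if (sample_rate_hz > HIGH_RATE_THRESHOLD_HZ &&
        configured_min_mhz < HIGH_RATE_MIN_CPU_MHZ) {
        return HIGH_RATE_MIN_CPU_MHZ;
    }
    return configured_min_mhz;
}
```

Under these assumptions, a configured minimum of 10 MHz stays at 10 MHz for 16 kHz recordings and is raised to 20 MHz for 44.1 kHz recordings, while a configured 80 MHz is left alone in both cases.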

LIFsCode commented 8 months ago

I just saw that this fix was removed during the AI merge (sorry, I hadn't noticed it earlier).

I will reopen this issue and fix it during the implementation of https://github.com/LIFsCode/ELOC-3.0/issues/85

@EDsteve @OOHehir I don't think this is related to the missing files in https://github.com/LIFsCode/ELOC-3.0/issues/84, as the sample rate there is too low. But @EDsteve, it is important for you to know that version 1.0 still has this bug in it.