Open EDsteve opened 1 year ago
@EDsteve I have very little knowledge when it comes to sound stuff, and I imagine you've already considered this, but do you think some of the noise shown in the 44kHz diagram is harmonics? Or perhaps some sort of noise originating from the board or device itself?
Additionally, what frequencies are typically generated by elephant calls? As you're no doubt aware, the sampling frequency needs to be at least twice the highest frequency that needs to be captured (the Nyquist criterion).
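As a rough illustration of the Nyquist relationship mentioned above, here is a minimal sketch. The function names are made up for this example and are not part of the ELOC firmware.

```c
#include <assert.h>
#include <stdint.h>

/* Highest frequency (Hz) that can be represented at a given sample rate. */
uint32_t nyquist_hz(uint32_t sample_rate_hz)
{
    return sample_rate_hz / 2u;
}

/* Smallest sample rate (Hz) able to capture content up to max_signal_hz. */
uint32_t min_sample_rate_hz(uint32_t max_signal_hz)
{
    assert(max_signal_hz <= UINT32_MAX / 2u); /* guard against overflow */
    return 2u * max_signal_hz;
}
```

With these helpers, a 16 kHz recording covers content up to 8 kHz, and a 44.1 kHz recording covers content up to 22.05 kHz.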
@OOHehir Okay. Thanks for checking it. I am pretty certain it is not noise from the PCB. And for our elephant project the 44kHz recording is not needed. But the ELOC will be used for other animals later on, which need 44kHz.
Maybe @LIFsCode can take a look when you have time. This is not high priority now. But sooner or later this issue needs to be solved. The I2S timings of the ICS-43432 and ICS-43434 are almost the same, but not 100%, if I interpreted the datasheet correctly. Maybe that is the cause?
@EDsteve Could you provide some additional info for reproducing the issue :
Looking at the 44kHz .wav file you posted I notice the following:
I wouldn't assume an issue with the I2S timing at this point, because that would be completely random throughout the entire recording. But the noise is added in concentrated blocks and there are several sections without any noise. I know this is no proof, but it makes an I2S timing issue less likely as the root cause.
However, when I looked at the ELOC 3.2 HW layout I noticed that the I2S lanes are crossed by the SDIO lanes. This could cause some EMI. Also, SD_DAT0 and WS_I2S_MIC are placed next to each other, unterminated, on the pin headers; this acts as an antenna and could couple the two signals. This is new on ELOC 3.2 and differs from the ELOC 3.0 HW, which could explain the different behavior of the two mics. I'm still highly convinced that the two mics should not make any difference (even though their datasheets are very imprecise). This could be a possible root cause; at least it would explain the block-like behavior of the noise, as the SD card is written in blocks.
I can try to measure those signals, but this won't be easy, so I wouldn't expect too much. However, I could try changing the behavior of the SD card writing to see if the recording characteristic changes.
So my next steps will be:
@LIFsCode
@EDsteve Could you provide some additional info for reproducing the issue :
- Which FW version did you use
- Which config did you use? (full config please) what is the setting of "MicUseTimingFix"?
- What was ELOC 2.x HW? Which mic did Tom develop the code for?
- Which power supply was used? Battery? Solar? USB?
After some tests: I connected the ICS-43432 externally to the pin headers as the RIGHT channel and get the same "strange" sound recordings with both microphones (ICS-43434 and ICS-43432) on the ELOC 3.2 at 44kHz. I also uploaded the latest firmware to the ELOC 3.0 HW, which uses the ICS-43432, and I get the same results.
I am surprised because I could swear that I had proper recordings with the ICS-43432 before. But I can't remember the circumstances or the HW I used. I will do more tests and report back with more results.
@LIFsCode @EDsteve While testing the automatic gain feature I also noticed a rhythmic distortion. After ruling out some other options I traced it to occurring when SD writes were happening. It should be possible to confirm this by changing the size (in seconds) of the wav buffer here. By increasing it from its current 2 secs to, say, 3 or 4, you should notice that the period of the noise/distortion changes, occurring when the SD is written to.
From memory, the distortion generally started approx. 7 or 8 secs into a file, and usually after an interval of low noise. Once loud noises occurred it disappeared, but resumed again after an interval of 7 or 8 secs.
@EDsteve @LIFsCode Also I was using a sample rate of 16kHz
@LIFsCode To point 3: I am not sure if the 44kHz recordings have ever worked before with the INMP441, because it has never been of interest to us until now.
I have tested the ICS-43434 and the ICS-43432 on my breadboard setup (only the ESP32 with an SD card; no sensors or anything else connected) and I am getting the same results. Both microphones make noise at 44kHz. So it might not be the routing on the PCB? I also noticed that the noise always starts after 5 seconds. Within the first 5 seconds the sound is clean.
Here are the recorded files from the breadboard, including config (heavy rain all night, so don't worry about the rain noise :): ICS_43434 and ICS-43432 on breadboard.zip
@EDsteve: Sorry, but I fear your breadboard setup is even worse from an EMI point of view ;-)
I don't have my setup available today and tomorrow; I could start tests on Thursday.
@OOHehir Could you increase the buffer time to the file size, e.g. 20 seconds? If the theory of the SD card interfering with the I2S is correct, we would not see any distortion within the first file.
If not, I could run the tests myself on Thursday. I will also try storing the wav files on the internal SPIFFS and only moving them to the SD card after recording, to get rid of any SD card transactions.
@EDsteve I just realized the timing fix option was enabled during your last tests. Could you try with "MicUseTimingFix": false?
Just in case it's not the SD card interference I would expect the following:
I must apologize for the "MicUseTimingFix" option; this is the most confusing and worst documented option. I'm still not sure what the original intention was, but I will try to clarify it for the future and will replace it with more detailed, better explained parameters.
@LIFsCode Morning :) After some more tests I think we are getting closer to the root of the problem. It seems the APLL clock makes a big difference. All tests were made on the ELOC 3.2 using the onboard ICS-43434. The recordings with the config files are in the attachments below. APLL true gives us problems with sound quality in 44K recordings, but reduces the power consumption a lot with 16K recordings. Hopefully these tests will help to narrow down the problem. Let me know if I should do different tests.
| Nr. | Sample Rate (Hz) | APLL | TimeFix | Sound quality | Current Draw (mA) | Notes |
|---|---|---|---|---|---|---|
| 1 | 44000 | TRUE | TRUE | distorted | 23 | |
| 2 | 44000 | TRUE | FALSE | distorted, high pitched | 24 | My voice sounds like on Helium |
| 3 | 44000 | FALSE | TRUE | Perfect | 25 | |
| 4 | 44000 | FALSE | FALSE | Perfect | 25 | |
| 5 | 16000 | TRUE | TRUE | Perfect | 18 | |
| 6 | 16000 | TRUE | FALSE | Perfect | 18 | |
| 7 | 16000 | FALSE | TRUE | Perfect-ish | 27 | Spectrogram more blurry compared to tests 5 and 6 |
| 8 | 16000 | FALSE | FALSE | Perfect-ish | 26 | Spectrogram more blurry compared to tests 5 and 6 |
| 9 | 44100 | TRUE | TRUE | distorted, high pitched | | Much higher pitch and more distortion compared to test 2 |
| 10 | 44100 | TRUE | FALSE | distorted, high pitched | | Much higher pitch compared to test 2 |
@EDsteve I just realized the timing fix option was enabled during your last tests. Could you try with "MicUseTimingFix": false?
Just in case it's not the SD card interference I would expect the following:
- Change in volume -50%
- Effect on the noise behavior if it's not caused by the SD card writing

I must apologize for the "MicUseTimingFix" option; this is the most confusing and worst documented option. I'm still not sure what the original intention was, but I will try to clarify it for the future and will replace it with more detailed, better explained parameters.
MicUseTimingFix: Our code started with atomic14's code from two years ago. A freelancer modified the code to make it work with SDIO SD cards. This can be found here. Tom used that code, implemented it in our firmware and modified it further, which is what you worked with. I can't find MicUseTimingFix in the idf-was-sdcard code. So Tom must have implemented that either to reduce power consumption or to fix other problems. Or maybe to break things unintentionally ;)
Cool, many thanks, that's good data and will help a lot.
atomic14 on GitHub was kind of the first person to make I2S microphones work on the ESP32 for the maker scene. But he says that his code is outdated and points to this repository, which should be much "better"? Not sure if that helps anybody, but maybe it can save you some time.
I just realized that this must have been a known issue for quite some time. At least there has been a code segment which forces apll=false if the sampling rate is > 32 kHz, but unfortunately without much explanation or detail.
With your test results @EDsteve, this seems to be exactly the problem addressed.
I will dig into it a bit more.
Actually the so-called "MicUseTimingFix" is more of an "SPH0645_fix"; that is also the name in atomic14's code.
It actually has 2 effects:
1. Setting Philips format instead of MSB format for I2S, which is also set by i2s_mic_Config.communication_format = I2S_COMM_FORMAT_STAND_I2S or the equivalent i2s_mic_Config.communication_format = I2S_COMM_FORMAT_I2S
2. Shifting the receive window of the I2S core by 2 clock samples (it is not clear which clock is meant; I have started a request to clarify this).

NOTE: This clock shift is only required for the SPH0645, which outputs data on the rising edge of CLK (I2S), the same edge on which the ESP samples. However, it has different effects if the mic outputs the data on the falling edge (which the ICS-43434, ICS-43432 and INMP441 do). For these mics, using this option changes the receiver characteristic in a way that is not understood and adds potential for errors.
I will remove this option and replace it with optional config parameters which directly set the respective ESP timing registers.
@EDsteve Just a question about tests 7 & 8: Isn't the spectrogram of 7 & 8 better than that of 5 & 6? In 5 & 6 I can see some periodic blocks of noise, even though they are much less significant than with higher sample rates, while for 7 & 8 (without APLL) I don't see them.
@LIFsCode I should have mentioned that I did not make these recordings under perfect conditions. I just recorded on my desk, and these periodic blocks are actually hammering sounds from my neighbour. Coincidentally only during the 440Hz tone :) After taking a closer look at the same two spectrograms: the blurry horizontal line around 6K tricked me into thinking that the recording is more "blurry" with APLL OFF. But it seems not. The 6K line is just interference which came from the buzzer (I am pretty certain), and it just looks different when another clock is used, it seems. But both recordings look fine to me. I can do better tests in my fridge if you wish?
About the APLL "problem": Is it possible to toggle between APLL and non-APLL while the ELOC is running? In that case we can just use APLL for everything under 32kHz and APLL OFF for everything above, because at 44kHz the power draw seems to make no difference between APLL ON and OFF.
@EDsteve Yes, that has been the workaround Tom used: switch to APLL=false when using sample rates > 32 kHz. So we could stick with that workaround and simply overwrite the APLL setting from the config in the meantime.
However, based on your recording I would suspect a deeper issue. I'm pretty certain that the blocks are not your neighbor; I can see them in the 16K_APLL-TRUE_TFIX-FALSE recording as well, without hammering ;-)
But we are already on a good way :) so I can sum it up:
{"config" : {"cpuMinFrequencyMHZ":80, "cpuEnableLightSleep": false}}
which should disable DFS.
@LIFsCode I am sorry to say, but I am still not so sure about the hammering. I remember that my neighbour was hammering a lot that day, and when I listen to the sound with earphones, that's exactly how it sounded. To find out, I will just make new recordings. Then we will know and can hopefully eliminate this "noise" :) Will report back with new recordings soon.
Just to confirm: Does the ESP32 need to restart in order to change between APLL OFF and ON? So when I select 44kHz recording, can the ELOC switch to APLL OFF while the ELOC is running?
@LIFsCode
No hammering any more at night time :) It's a bit more effort, but if it helps you in any way to make sound tests in my fridge. I am ready :)
No restart of the ESP is needed. But recording has to be off, when the config is changed.
OK, the new recordings seem better. Switching off APLL usage at higher sample rates is definitely a good workaround (Tom suggested >32 kHz). I can add that.
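For reference, the workaround could look roughly like the sketch below. The helper name and the macro are hypothetical (they are not taken from the ELOC code base); only the 32 kHz threshold comes from this thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Threshold from the thread (Tom's original workaround); name is hypothetical. */
#define APLL_MAX_SAMPLE_RATE_HZ 32000u

/* Returns the APLL setting to actually apply: the configured value is
 * overridden to false for sample rates above the threshold. */
bool effective_use_apll(bool configured_use_apll, uint32_t sample_rate_hz)
{
    if (sample_rate_hz > APLL_MAX_SAMPLE_RATE_HZ) {
        return false; /* workaround: APLL caused distortion at 44 kHz */
    }
    return configured_use_apll;
}
```

The result would be evaluated while recording is stopped, since the config can only be changed then.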
Anyway I would like to understand what is happening here. I don't feel good having a workaround of something which is not understood
@EDsteve @LIFsCode Further to my last comment above the noise I was getting on the recordings only happens when the 'automatic gain' feature is enabled. Perhaps there is some distortion occurring when there is an abrupt change in sound level as the gain is changed. I'll investigate implementing the gain change at 'zero crossing', i.e. when the sound level is at, or very close to zero. Perhaps that could solve the issue.
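A minimal sketch of the zero-crossing idea, assuming 16-bit samples in a single buffer. The function name and signature are invented for illustration; the actual automatic gain code may be structured quite differently.

```c
#include <stddef.h>
#include <stdint.h>

/* Applies old_gain up to the first zero crossing at or after request_idx,
 * and new_gain from there on. Returns the index where the switch happened,
 * or n if no crossing was found (old_gain is then used for the whole buffer). */
size_t apply_gain_at_zero_crossing(int16_t *buf, size_t n, size_t request_idx,
                                   float old_gain, float new_gain)
{
    size_t switch_idx = n;

    /* Find the crossing on the unmodified samples first. */
    for (size_t i = request_idx; i < n; ++i) {
        int sign_changed = (i > 0) && ((buf[i - 1] < 0) != (buf[i] < 0));
        if (buf[i] == 0 || sign_changed) {
            switch_idx = i;
            break;
        }
    }

    /* Scale every sample, clamping to the int16_t range. */
    for (size_t i = 0; i < n; ++i) {
        float v = (float)buf[i] * ((i < switch_idx) ? old_gain : new_gain);
        if (v > 32767.0f)  v = 32767.0f;
        if (v < -32768.0f) v = -32768.0f;
        buf[i] = (int16_t)v;
    }
    return switch_idx;
}
```

This only moves the gain step to a point where the waveform is close to zero; it does not remove the step itself.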
@OOHehir Yes, that's clear: modifying the gain dynamically will distort the frequency spectrum. This will happen even if you limit the gain change to zero crossings, even though limiting gain changes to zero crossings will keep the signal cleaner in the time domain. But the effect on the frequency domain will stay the same.
I'm not sure about your signal processing background, so sorry if the next things are completely obvious to you, but just as a note about the effect of dynamic gain adjustment on the spectrum of the signal.
The question is whether this is a problem for our application or not. I think this should be discussed separately as part of the advantages and drawbacks of dynamic gain adjustment.
I would assume this is independent of the noise problems at high sample rates
Update on this one; sorry, only a short one, I'll come up with the details tomorrow.
I'm pretty certain I found a trace to track down the issue. I compared what I captured on the I2S bus with what is in the wave file. I found samples which were sent by the mic but are missing in the wave file.
This is a valuable trace, as randomly dropped samples would cause exactly this kind of noise in the spectrum.
I'm still not confident enough in this to call it a breakthrough, but it's a start. This could point to the DMA of the I2S engine in the ESP as a root cause.
The transfer from mic to ESP seems fine. I have not found any corrupted samples, only missing samples.
I will do more tests and add a detailed result tomorrow.
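One way such a comparison could be automated is sketched below. This is not the tooling used for the measurement above; it is a hypothetical helper that assumes the bus capture and the wave file start at the same sample and hold 32-bit values.

```c
#include <stddef.h>
#include <stdint.h>

/* Aligns the recorded samples (wav) against the bus capture (bus), treating
 * the recording as the capture with some samples removed.
 * Returns the number of bus samples skipped during alignment, or -1 if a
 * recorded sample cannot be found in the remaining capture, which would
 * indicate corruption rather than plain drops. Bus samples after the last
 * recorded sample are not counted. */
long count_dropped_samples(const int32_t *bus, size_t bus_len,
                           const int32_t *wav, size_t wav_len)
{
    size_t b = 0;
    long dropped = 0;

    for (size_t w = 0; w < wav_len; ++w) {
        while (b < bus_len && bus[b] != wav[w]) {
            ++b;       /* this bus sample never made it into the file */
            ++dropped;
        }
        if (b == bus_len) {
            return -1; /* no match left: not explained by drops alone */
        }
        ++b;           /* matched, move past it */
    }
    return dropped;
}
```

Because the matching is greedy, runs of identical sample values can shift where a drop is attributed, but the total count stays the same as long as only drops occurred.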
@LIFsCode Probably unrelated, but the buffer setup used in the code seems, well, random:

`.dma_buf_count = I2S_DMA_BUFFER_COUNT, // so 2000 sample buffer at 16khz sr gives us 125ms to do our writing`
`.dma_buf_len = I2S_DMA_BUFFER_LEN, // 8 buffers gives us half second`

From the branch I'm on this evaluates to:

`dma_buf_count = 18`
`dma_buf_len = 1000`

This is the setup I've seen elsewhere:

`.dma_buf_count = 8,`
`.dma_buf_len = 512,`

Also `.intr_alloc_flags = I2S_INTR_PIRO,` evaluates to `1 << 2`

The Edge Impulse code examples use `.intr_alloc_flags = 0,`
I haven't investigated the impact of the difference.
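To put the two setups side by side, here is a small sketch that estimates how much audio the DMA buffers can hold before samples are lost. It assumes `dma_buf_len` counts frames (samples per channel); the function name is made up for this example.

```c
#include <stdint.h>

/* Total audio time in milliseconds that fits into the I2S DMA buffers.
 * Assumes dma_buf_len is given in frames (samples per channel). */
uint32_t dma_capacity_ms(uint32_t dma_buf_count, uint32_t dma_buf_len,
                         uint32_t sample_rate_hz)
{
    if (sample_rate_hz == 0u) {
        return 0u; /* avoid division by zero */
    }
    uint64_t frames = (uint64_t)dma_buf_count * dma_buf_len;
    return (uint32_t)(frames * 1000u / sample_rate_hz);
}
```

Under that assumption, 18 x 1000 gives about 1125 ms at 16 kHz but only about 408 ms at 44.1 kHz, while 8 x 512 gives about 256 ms and 92 ms respectively. Neither matches the 125 ms and half second mentioned in the inline comments.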
@OOHehir Yes, that's clear: modifying the gain dynamically will distort the frequency spectrum. This will happen even if you limit the gain change to zero crossings, even though limiting gain changes to zero crossings will keep the signal cleaner in the time domain. But the effect on the frequency domain will stay the same.
I'm not sure about your signal processing background, so sorry if the next things are completely obvious to you, but just as a note about the effect of dynamic gain adjustment on the spectrum of the signal.

* Applying gain changes during runtime can be seen as a multiplication in the time domain with a step or square wave function. E.g. changing the gain from 0.25 to 1 is a multiplication with a square function [0.25; 1] ![grafik](https://user-images.githubusercontent.com/109753539/283141579-789bfdee-ff60-4a87-9736-083a771f350c.png)
* In the frequency domain a square function will look something like [this](http://www.dspguide.com/ch11/2.htm) ![grafik](https://user-images.githubusercontent.com/109753539/283141959-c5aae100-7c7a-463c-8f98-05f3703b666c.png)
* When you multiply your signal in the time domain, this results in a [convolution](http://www.dspguide.com/ch10/5.htm) in the frequency domain.
* This frequency domain convolution is an inherent part of the gain adjustment during recording and is inevitable.

The question is whether this is a problem for our application or not. I think this should be discussed separately as part of the advantages and drawbacks of dynamic gain adjustment.
I would assume this is independent of the noise problems at high sample rates
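To make the multiplication/convolution point concrete, here is a sketch for a single gain step from $g_0$ to $g_1$ at time $t_0$. The symbols are introduced here for illustration only; $u$ is the unit step, $*$ denotes convolution, and the $1/f$ term is understood in the principal value sense.

$$
y(t) = g(t)\,x(t), \qquad g(t) = g_0 + (g_1 - g_0)\,u(t - t_0)
$$

$$
G(f) = g_0\,\delta(f) + (g_1 - g_0)\,e^{-j 2\pi f t_0}\left[\tfrac{1}{2}\,\delta(f) + \frac{1}{j 2\pi f}\right]
$$

$$
Y(f) = (G * X)(f) = \frac{g_0 + g_1}{2}\,X(f) + (g_1 - g_0)\,(H_{t_0} * X)(f), \qquad H_{t_0}(f) = \frac{e^{-j 2\pi f t_0}}{j 2\pi f}
$$

The magnitude $|H_{t_0}(f)| = 1/(2\pi |f|)$ does not depend on $t_0$, so moving the step to another instant changes only the phase of the kernel that is convolved with the signal spectrum.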
It's been quite a while since I looked at this sort of stuff, thanks for the catch!
I will make this one a bit more extensive in the hope that it provides some background to clarify the issue.
Best guess so far is to overwrite the config setting of min_freq_mhz based on the chosen sample frequency.
Since I don't have a better guess for the limit, I suggest the 32 kHz from Tom's code.
I wouldn't change the APLL setting and keep this to true, as it results in much better jitter stability (see 1.6)
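A possible shape for that override is sketched below. The helper name and macro are hypothetical, and pinning the minimum CPU frequency to the maximum is an assumption about how frequency scaling would be prevented; only the 32 kHz limit comes from this thread.

```c
#include <stdint.h>

/* Limit taken from Tom's code as suggested above; macro name is hypothetical. */
#define DFS_MAX_SAMPLE_RATE_HZ 32000u

/* Returns the min_freq_mhz value to apply. Above the limit the minimum is
 * raised to the configured maximum so the CPU clock cannot scale down
 * (assumption: min == max leaves no room for dynamic frequency scaling). */
uint32_t effective_min_freq_mhz(uint32_t cfg_min_mhz, uint32_t cfg_max_mhz,
                                uint32_t sample_rate_hz)
{
    if (sample_rate_hz > DFS_MAX_SAMPLE_RATE_HZ) {
        return cfg_max_mhz;
    }
    return cfg_min_mhz;
}
```

The APLL setting is deliberately left untouched here, in line with the comment above about keeping it enabled.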
I just saw that this fix was removed during the AI merge (sorry, I hadn't noticed it earlier).
I'll reopen this issue and fix it during the implementation of https://github.com/LIFsCode/ELOC-3.0/issues/85
@EDsteve @OOHehir I don't think this is related to the missing files in https://github.com/LIFsCode/ELOC-3.0/issues/84 as the sample rate is too low. But @EDsteve, it is important for you to know that Version 1.0 still has this bug in it.
I did some recordings at different sample rates and found that there is noise at 44kHz. Sound recordings are in 44vs32KHz.zip. The spectrogram shows the noise at 44kHz as well: the curve on the right is not smooth and additional noise is introduced. The curve seems to be shorter as well:
This happens with the ICS-43434 and did not happen with the older microphone (ICS-43432). Hope there is an easy fix.