Remove simulateSound() in favour of getAudioData()

blazoncek commented 1 month ago

This is supposed to simplify getting either simulated sound or actual audio data for effects. Builds on the work by @netmindz .

netmindz commented 1 month ago

I'll try and take a look over in the next few days.

Please can you write a quick summary of what you were trying to do so I'm not having to guess that from the code

blazoncek commented 1 month ago

Please can you write a quick summary of what you were trying to do so I'm not having to guess that from the code

Just condense 2 functions into one.

willmmiles commented 1 month ago

Just condense 2 functions into one.

Heh, I was going to suggest keeping the two parts as separate functions in the source! It makes more sense to me to keep the code for "selecting the sound source" isolated from the code for "computing the audio data simulation". (That said: moving getAudioData() to utils.cpp, and marking simulateSound() static, should allow the compiler to inline simulateSound and reduce the total number of function calls.)

DedeHai commented 1 month ago

From an FX view I like the consolidation approach better, it should not be up to an effect to choose a 'system setting'

willmmiles commented 1 month ago

From an FX view I like the consolidation approach better, it should not be up to an effect to choose a 'system setting'

Was this meant to be a reply to me?

To clarify what I mean, I think that this code:

// utils.cpp
static um_data_t* simulateSound(uint8_t simulationId) {
  // .. body of simulateSound
}

um_data_t* getAudioData(uint8_t simulationId) {
  um_data_t *um_data;
  if (UsermodManager::getUMData(&um_data, USERMOD_ID_AUDIOREACTIVE)) {
    return um_data;
  }
  return simulateSound(simulationId);
}

... is a better organization than this code:

// utils.cpp
um_data_t* getAudioData(uint8_t simulationId) {
  um_data_t *um_data;
  if (UsermodManager::getUMData(&um_data, USERMOD_ID_AUDIOREACTIVE)) {
    return um_data;
  }

  // .. body of simulateSound
}

While the API to the FX layer and the compiled results should be identical, I think there's some practical value in arranging the source code to keep individual tasks ("how do I decide where to get some audio data?" and "how do I compute audio simulation?") as separate functions. IMO it can make things easier to follow and maintain, and clearly scopes which local variables are relevant to what part of the overall task.

That's all I was trying to suggest! Bikeshedding at its finest. ;)

DedeHai commented 1 month ago

That makes a lot of sense actually, I misunderstood.

netmindz commented 1 month ago

I'm happy with the basic principle of the change, a single method with "fallback" of simulation where there is no data, I don't have an objections of such a method per say.

What I'm not really clear on is why the change - what are we expecting to achieve by it? My original change was to remove a load of copy and paste code into a common method. Given we are still calling a single method from the effects and the implementation is basically the same.

If there is some subtle difference that means passing the simulationId is more performant or reduces code size, then that should be a change in its own right really, explained as the reasoning in the PR with evidence of the improvement. Without the context for the change I'm a little lost to understand how this change improve the code

blazoncek commented 1 month ago

a single method with "fallback" of simulation

That was the whole idea from the start as there is really no reason to have 2 functions performing the same. It is ok with me to abandon this PR if there is no visible/perceivable benefit from it.

netmindz commented 1 month ago

I don't feel it really helps improve the code readability and if it's not performance related then it's just decreasing code quality by blending expected operations with graceful fallback

softhack007 commented 1 month ago

I agree with @netmindz - plus the code is not time critical because it's running once per effect.

DedeHai commented 4 weeks ago

@softhack007 regarding AR on C3 & S2 (abusing this PR for an AR question): I see there is plans to reduce sampling rate for C3/S2, is there work in progress somewhere?

I just ran a few tests on a C3 with AR and there is great potential:

reducing to 16kHz AND reducing sample size to 256 brings the FFTtime down to 19ms
FFT_SPEED_OVER_PRECISION and FFT_SQRT_APPROXIMATION make it slightly slower (maybe 0.5ms) so there is no point in using those
Frequency range with 16kHz and 256 samples is 62.5Hz - 8kHz (vs. 43Hz - 11 kHz with 22kHz sampling and 512 samples)

I think that is a reasonable trade-off and works quite well. S2 should use the same as it also does not have an FPU.

Are there any plans to support analog mic on S3, S2, C3? Especially the C3 would give users on a tight budget some options, or for projects where you want a lot of distributed cheap AR lights. I did not look at the code in detail yet but ADC sampling is quite easy to implement using DMA (in case you did not look into that already: here is some example code https://github.com/ClaudeMarais/ContinousAnalogRead_ESP32-C3/blob/main/ContinousAnalogRead_ESP32-C3.ino ) I am not (yet) familiar with the ESP DMA controller, but apparently it is quite flexible, I am quite sure it can do ring-buffer or dual buffer or one-shot, whichever works best for AR. I did use DMA on a STM32: I assume ESP has the same basic functionalities.

Edit: just for reference, these are the code lines used for 19ms FFTtime on C3:

constexpr SRate_t SAMPLE_RATE = 16000;       
#define FFT_MIN_CYCLE 30           // Use with 16Khz sampling
constexpr uint16_t samplesFFT = 256;  
constexpr uint16_t samplesFFT_2 = 128;

softhack007 commented 4 weeks ago

@softhack007 regarding AR on C3 & S2 (abusing this PR for an AR question): I see there is plans to reduce sampling rate for C3/S2, is there work in progress somewhere?

Hi @DedeHai yes there is a lot in progress, over at the MoonModules fork. The code is somewhat cluttered, but now that license topics are solved, I'm planning to bring improvements back to upstream gradually.

Your proposal to reduce sampling rates for C3 (and maybe for S2, too) makes sense. In fact S2 is still two times faster on FFT than C3, so its "edgy" but reducing the sampling rate to 18khz should be enough for the chip.

Reducing FFT size is something I was also considering - but this reduces frequency resolution a lot, especially in the lower range. For -C3, it looks like another trick works well, while keeping the frequency resolution as it is: in MM we are skipping every second FFT run on C3. This has only minor impact on the result, but it allows C3 to take about 30-40ms for one FFT. Works well.

For the future, the plan is to replace ArduinoFFT with esp-dsp. This requires to lift "classic esp32" to arduino-esp32 version 2.0.x, but it will be a major step forward. It will also allow us to replace my hand-crafted IIR filters with standard biquad filters from esp-dsp. @troyhacks has made some experiments and it looks very promising. FFT on S2 and C3 will also be faster, even when using float without FPU.

Update: there is another option for speeding up audioreactive - a revolution instead of evolution. If we want 16 GEQ "bands" and not more, then we could drop the complete FFT part, and run a bank of 16 well-tuned frequency bandpass filters instead. Computationally the effort is similar to FFT (16 filters x 512 samples each 20ms). We could use biquad filters written for fixed-point. So we'd exactly get the frequency bands we want, plus it's more efficient due to fixed point. Right now this is just an idea, but it might be worth to explore.

softhack007 commented 4 weeks ago

About analog mic support for -S3/-S2/-C3: For me this is very low priority, because we currently use the I2S driver and making an "alternate" driver that's using continuous analog sampling mode means to dig into a completely different tech. I have no experience with the continuous ADC driver.

Also I don't expect good results from esp ADC sampling, because the ADC unit is well-known for low accuracy and hight noise levels. So the FFT spectrum will contain lots of artefacts and "ghost frequencies". Plus most analog mics are very sensitive to noise on the Vcc line -basicially any wifi access can cause a noise burst on analog reading. We have been through many troubles in the old Atuline soundreactive - and honestly speaking, most user should really invest into a cheap I2S microphone (like INMP441) or even a cheaper PDM microphone - the headaches and troubles with analog input aren't worth it.

There are good line-in to I2S boards that we support - availeable for cheap. So line-in audio is no reason for analog imho.

Actually if you want to create a continuous ADC input driver, be my guest ;-) You'd basicially create a new audioSource instance derived from the AudioSourceclass and create your own initialize and getSamples methods.

Simply to keep ends together, my only wish is that - in case you find something - please make a PR in the MoonModules fork first. Because MM is my "master" for new audioreactive stuff.

DedeHai commented 4 weeks ago

thanks, you make valid points. how does the "skipping every other frame" on C3 affect FPS? I see you re-did the binning in MM, in AC even the lowest bins are summed, so switching to 256 samples and changing binning would not result in much resolution loss (which is what I was thinking). I am not sure which option is better though, skipping audio frames or losing resolution. I would not gear the C3 towards "highest audio quality" because if users want that, ESP32 or S3 are the only good options here. I do not have the gear nor the experience you guys have to test any difference. I only bring some theoretical background to the table (acoustics at university but that was quite some time ago).

Regarding analog mic: I am well aware of the down sides, I do precision ultra low noise analog electronics for a living, not audio though ;) You also need to take into account that we have users where a 0.3$ analog mic vs a 1.5$ digital one makes a difference (had this exact example on discord), so yes, digital mic over analog anytime but I do not see that as a reason to drop support for analog. I may run a test when I get around to it to see if I can make the ADC-DMA sampling work on a C3 but I have a lot of other things on my list right now

softhack007 commented 4 weeks ago

Hi @DedeHai thanks for sharing your thoughts

how does the "skipping every other frame" on C3 affect FPS?

In fact it does not affects FPS directly, because FFT is running in a separate task and effects do not wait for a new batch of results. Indirectly the CPU load for FFT is actually impacting effects, but I've never measured it in detail on C3. I was more concerned to keep FFT within the time limits of 1-2 batches of samples, because lagging behind audio, and glitches due to buffer saturation, is very noticeable for user.

The FFT task is doing some post-processing and smoothing, so skipping every other FFT did not lead to any jumping in effects.

I see you re-did the binning in MM, in AC even the lowest bins are summed, so switching to 256 samples and changing binning would not result in much resolution loss (which is what I was thinking)

The problem comes in the low frequencies - with 256 samples you lose a lot of detail in the low "GEQ bands", so I tried to avoid that.

I only bring some theoretical background to the table (acoustics at university but that was quite some time ago).

Me-too 😉 I've had acoustics as a secondary subject (3 Semesters) when I studied computer science (called "Informatik" or "Kybernetik" in Germany in the 1990's). I think that's a great basis for understanding what's needed.

softhack007 commented 1 week ago

Frequency range with 16kHz and 256 samples is 62.5Hz - 8kHz (vs. 43Hz - 11 kHz with 22kHz sampling and 512 samples)

@DedeHai about this side-topic on making audioreactive faster on -C3 ....

I had some time to let your idea "sink in" - you're right it's a good "quick win" improvement for -C3. The final solution of replacing ArduinoFFT is still some time away. But changing FFT width from 512 to 256 could be done easily - just a few constants plus a different mapping from FFT bins to fftResult[] channels. Maybe this "low quality" mode could even be an option (build flag) for -S2.

Its seems that people tend to ignore any warnings about -C3 being slow (too slow for audioreactive) - they just attach a mic and recognise their board is running hot while effects get slow. So having low resolution FFT might be an interesting way to make -C3 a bit faster on running effects.

If I find time this week, i can make a PR and would be happy to have your review and testing on -C3.

DedeHai commented 1 week ago

I would be happy to test it. AR on C3 has it's right to exist IMHO: not every one uses it for party-displays where ms timing and best resolution matters. Some just want it to react to loudness, there are probably hundreds of examples where the quality of the signal is really not that important. What I would also like to check: is it better to have more base resolution (i.e. using every second sample at 512 FFT width) or having better reaction time but loose base resolution (i.e. reduce to 256 samples but process every frame). Or could even be ok to do both, reduce and only do every second frame. What I would like to test here is record some music and compare the FFT bins generated for a short sequence.

Aircoookie / WLED

Remove simulateSound() in favour of getAudioData() #4209