sheaivey / ESP32-AudioInI2S

A simple MEMS I2S microphone and audio processing library for ESP32.
MIT License

Feature Request: Ability to set frequency range #3

Closed chinswain closed 1 month ago

chinswain commented 3 months ago

Thanks for creating this library. I am using it to gather frequency and amplitude data from my beehive, which I then display in Grafana as a heat map.

Would it be possible to add a function to set the frequency range? I wanted to have 50Hz - 1000Hz across 16 buckets.

sheaivey commented 3 months ago

First off I love how you are using this library!

For the feature request, I would need to look into how to map FFT buckets to their actual corresponding frequencies. The library was originally made just for visual approximations and to work with a variety of MEMS microphones.

It looks like it is not too complicated to calculate, based on this Stack Overflow question: https://stackoverflow.com/questions/4364823/how-do-i-obtain-the-frequencies-of-each-value-in-an-fft
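The relationship described in that answer is short enough to sketch. For an N-point FFT taken at sample rate Fs, bin k is centred on k * Fs / N, and only bins 0 to N/2 carry useful information for real input. The helper names below are hypothetical and are not part of this library.

```cpp
#include <cstddef>

// Centre frequency in Hz of FFT bin `bin`, for an FFT of `fftSize` points
// taken at `sampleRate` Hz. (Hypothetical helper, not a library function.)
double binToFrequency(std::size_t bin, std::size_t fftSize, double sampleRate) {
    return static_cast<double>(bin) * sampleRate / static_cast<double>(fftSize);
}

// Nearest bin index for a frequency, clamped to the usable range [0, fftSize / 2].
std::size_t frequencyToBin(double hz, std::size_t fftSize, double sampleRate) {
    if (hz <= 0.0) return 0;
    std::size_t bin = static_cast<std::size_t>(hz * fftSize / sampleRate + 0.5);
    std::size_t maxBin = fftSize / 2;
    return bin > maxBin ? maxBin : bin;
}
```

As an illustration, with 1024 samples at 44100 Hz each bin is about 43 Hz wide, so a 50 Hz to 1000 Hz range would cover roughly bins 1 to 23. A lower sample rate or a larger FFT gives finer resolution at low frequencies.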

chinswain commented 3 months ago

The plan is to wake up every 15 minutes, listen to the sound, FFT it into 16 buckets, then send the values over LoRa back home for graphing, so knowing the actual ranges will be needed to make use of the data (and maybe some ML/AI type stuff one day...).

It's amazing how much information could be potentially retrieved from audio inside a beehive:

"Swarming is indicated by an increase in the power spectral density at about 110 Hz; approaching to swarm the sound augmented in amplitude and frequency to 300 Hz, occasionally a rapid change occurred from 150 Hz to 500 Hz."

3000 Hz means defensive reaction, Queen piping between 340 Hz, 450 Hz swarming, Fanning 225 Hz–285 Hz

There's a commercial device called BEEP that uses a range of 71 to 583 Hz divided over 10 bins: 71-122, 122-173, 173-224, 224-276, 276-327, 327-378, 378-429, 429-480, 480-532, 532-583 Hz.
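Equal-width bins like these are easy to generate rather than type by hand. The sketch below is a hypothetical helper (not part of the library, and not BEEP's actual code) that returns the edges of `count` equal-width bands between two frequencies.

```cpp
#include <vector>

// Split [lowHz, highHz] into `count` equal-width bands and return the
// count + 1 band edges. (Hypothetical helper for illustration.)
std::vector<double> linearBandEdges(double lowHz, double highHz, int count) {
    std::vector<double> edges;
    if (count <= 0 || highHz <= lowHz) return edges;  // nothing sensible to return
    const double width = (highHz - lowHz) / count;
    for (int i = 0; i <= count; ++i) {
        edges.push_back(lowHz + width * i);
    }
    return edges;
}
```

Calling `linearBandEdges(71, 583, 10)` gives bands 51.2 Hz wide (71, 122.2, 173.4, ...), close to the bins quoted above. `linearBandEdges(50, 1000, 16)` would give the 16 buckets from the original request, each 59.375 Hz wide.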

I think something like:

#include "Arduino.h"

// Include your arduinoFFT library or other FFT implementation
#include <arduinoFFT.h>

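#ifndef SAMPLE_SIZE
#define SAMPLE_SIZE 1024
#endif

#ifndef BAND_SIZE
#define BAND_SIZE 16 // assumed default number of output bands
#endif

class AudioAnalysis {
public:
    AudioAnalysis();

    void computeFFT(int32_t *samples, int sampleSize, int sampleRate);
    void computeFrequencies(float minFreq, float maxFreq, uint8_t bandSize = BAND_SIZE);
    float *getBands();

private:
    int32_t *_samples;
    int _sampleSize;
    int _sampleRate;
    float _real[SAMPLE_SIZE];
    float _imag[SAMPLE_SIZE];
    float _bands[BAND_SIZE];
    ArduinoFFT<float> *_FFT; // expected to be created over _real/_imag
};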

void AudioAnalysis::computeFFT(int32_t *samples, int sampleSize, int sampleRate) {
    _samples = samples;
    _sampleSize = sampleSize;
    _sampleRate = sampleRate;

    // Prepare samples for analysis
    for (int i = 0; i < _sampleSize; i++) {
        _real[i] = samples[i];
        _imag[i] = 0;
    }

    _FFT->dcRemoval();
    _FFT->windowing(FFTWindow::Hamming, FFTDirection::Forward, false);
    _FFT->compute(FFTDirection::Forward);
    _FFT->complexToMagnitude();
}

void AudioAnalysis::computeFrequencies(float minFreq, float maxFreq, uint8_t bandSize) {
    // Calculate which bins correspond to the desired frequency range
    float binWidth = (float)_sampleRate / _sampleSize; // Frequency resolution per bin: Fs / N
    int minBin = minFreq / binWidth;
    int maxBin = maxFreq / binWidth;

    // Ensure maxBin does not exceed the maximum available bin index
    maxBin = min(maxBin, _sampleSize / 2 - 1);

    // Number of bins in the specified range
    int binsInRange = maxBin - minBin + 1;

    // Limit the number of bands to the number of bins in the range
    bandSize = min((int)bandSize, binsInRange);

    // Clear previous band data
    for (int i = 0; i < bandSize; i++) {
        _bands[i] = 0;
    }

    // Calculate FFT for the selected frequency range
    int offset = minBin;
    for (int i = 0; i < bandSize; i++) {
        // Calculate average magnitude over the bin range
        float sum = 0;
        for (int j = 0; j < binsInRange / bandSize; j++) {
            sum += _real[offset + j]; // complexToMagnitude() already stored the magnitude in _real
        }
        _bands[i] = sum / (binsInRange / bandSize);
        offset += binsInRange / bandSize;
    }
}

float *AudioAnalysis::getBands() {
    return _bands;
}
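
For anyone who wants to try the band-averaging idea off the board, here is a hardware-independent sketch in plain C++. It is not the library's implementation: the function name and signature are hypothetical, and it assumes `magnitudes` already holds the first fftSize / 2 magnitude bins. Unlike the proposal above, it computes each band's bin range from its Hz edges, so leftover bins are not dropped when the bin count does not divide evenly.

```cpp
#include <cstddef>
#include <vector>

// Average FFT magnitudes into `bandCount` equal-width bands between minHz and
// maxHz. `magnitudes` holds the first fftSize / 2 magnitude bins of an
// fftSize-point FFT sampled at sampleRate Hz.
std::vector<float> averageBands(const std::vector<float>& magnitudes,
                                std::size_t fftSize, float sampleRate,
                                float minHz, float maxHz, std::size_t bandCount) {
    std::vector<float> bands(bandCount, 0.0f);
    if (magnitudes.empty() || bandCount == 0 || fftSize == 0 ||
        minHz < 0.0f || maxHz <= minHz) {
        return bands;  // invalid input: all bands stay at zero
    }

    const float binWidth = sampleRate / static_cast<float>(fftSize);  // Hz per bin
    const float bandWidth = (maxHz - minHz) / static_cast<float>(bandCount);

    for (std::size_t b = 0; b < bandCount; ++b) {
        // First bin and one-past-last bin covered by this band.
        std::size_t first = static_cast<std::size_t>((minHz + bandWidth * b) / binWidth);
        std::size_t last = static_cast<std::size_t>((minHz + bandWidth * (b + 1)) / binWidth);
        if (last <= first) last = first + 1;  // always try to use at least one bin
        if (last > magnitudes.size()) last = magnitudes.size();

        float sum = 0.0f;
        std::size_t n = 0;
        for (std::size_t k = first; k < last; ++k, ++n) sum += magnitudes[k];
        if (n > 0) bands[b] = sum / static_cast<float>(n);
    }
    return bands;
}
```

Bands that fall entirely above the available bins are left at zero rather than reading past the end of the buffer.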

I guess this is well outside of the scope of your library - maybe a good excuse for a new one! Single shot mode - returns X number of bands (With known ranges) as a single output for X number of seconds?

I had to make some small changes to support the SPH0645 (If anyone else is trying to use this mic).

AudioInI2S.h

#include "soc/i2s_reg.h"
...
AFTER: i2s_driver_install(_i2s_port_number, &_i2s_config, 0, NULL);
  REG_SET_BIT(I2S_TIMING_REG(_i2s_port_number), BIT(9));
  REG_SET_BIT(I2S_CONF_REG(_i2s_port_number), I2S_RX_MSB_SHIFT);
BEFORE: i2s_set_pin(_i2s_port_number, &_i2s_mic_pins);

I've set up a CO2 sensor and I2S mic to sit directly inside the beehive; it's just sending amplitude back at the moment. [image: 20240609_121954]

I've set up 4 hives with 20 W of solar each, so hopefully plenty of power. [image]

A bonus queen bee pic!

[image]

chinswain commented 3 months ago

Not sure if it helps but this sketch prints out the peak frequency and amplitude:

#include <arduinoFFT.h>
#include <driver/i2s.h> // i2s_config_t, i2s_driver_install(), i2s_read()

#define SAMPLE_BUFFER_SIZE 512
#define SAMPLE_RATE 8000
#define SAMPLING_DURATION_SECONDS 1 // Adjust sampling duration as needed

i2s_config_t i2s_config = {
    .mode = (i2s_mode_t)(I2S_MODE_MASTER | I2S_MODE_RX),
    .sample_rate = SAMPLE_RATE,
    .bits_per_sample = I2S_BITS_PER_SAMPLE_32BIT,
    .channel_format = I2S_CHANNEL_FMT_ONLY_RIGHT,
    .communication_format = I2S_COMM_FORMAT_I2S,
    .intr_alloc_flags = ESP_INTR_FLAG_LEVEL1,
    .dma_buf_count = 4,
    .dma_buf_len = 1024,
    .use_apll = false,
    .tx_desc_auto_clear = false,
    .fixed_mclk = 0};

i2s_pin_config_t i2s_mic_pins = {
    .bck_io_num = 14,
    .ws_io_num = 32,
    .data_out_num = I2S_PIN_NO_CHANGE,
    .data_in_num = 15};

#define FFT_SAMPLES SAMPLE_BUFFER_SIZE
double vReal[FFT_SAMPLES];
double vImag[FFT_SAMPLES];

ArduinoFFT<double> FFT = ArduinoFFT<double>(vReal, vImag, FFT_SAMPLES, SAMPLE_RATE);

int32_t raw_samples[SAMPLE_BUFFER_SIZE];

void setup() {
  Serial.begin(115200);
  i2s_driver_install(I2S_NUM_0, &i2s_config, 0, NULL);
  i2s_set_pin(I2S_NUM_0, &i2s_mic_pins);
}

void loop() {
  // Initialize variables
  double maxAmplitude = 0.0;
  double maxPeakFrequency = 0.0;

  // Record data for SAMPLING_DURATION_SECONDS seconds
  unsigned long startTime = millis();
  while (millis() - startTime < SAMPLING_DURATION_SECONDS * 1000) {
    size_t bytes_read = 0;
    i2s_read(I2S_NUM_0, raw_samples, sizeof(int32_t) * SAMPLE_BUFFER_SIZE, &bytes_read, portMAX_DELAY);
    int samples_read = bytes_read / sizeof(int32_t);

    // Convert the samples to double for FFT
    for (int i = 0; i < samples_read; i++) {
      vReal[i] = (double)raw_samples[i] / 2147483648.0; // Normalize the 32-bit sample to -1.0 to 1.0 range
      vImag[i] = 0.0; // FFT requires imaginary part to be 0
    }

    FFT.windowing(FFT_WIN_TYP_HAMMING, FFT_FORWARD);
    FFT.compute(FFT_FORWARD);
    FFT.complexToMagnitude();

    // Calculate the amplitude (volume)
    double amplitude = 0.0;
    for (int i = 0; i < FFT_SAMPLES; i++) {
      if (vReal[i] > amplitude) {
        amplitude = vReal[i];
      }
    }

    // Find the peak frequency
    double peakFrequency = FFT.majorPeak();

    // Update max values if current values are higher
    if (amplitude > maxAmplitude) {
      maxAmplitude = amplitude;
    }
    if (peakFrequency > maxPeakFrequency) {
      maxPeakFrequency = peakFrequency;
    }
  }

  // Print the maximum values recorded during the sampling duration
  Serial.printf("Max Peak Frequency: %.2f Hz\t Max Amplitude: %.4f\n", maxPeakFrequency, maxAmplitude);

}

sheaivey commented 3 months ago

Thank you so much for sharing your project! I'm super impressed!

I think that I'm going to add a function to the AudioAnalysis class called getFrequencyRange(minHz, maxHz) which will return the normalized amplitude set from normalize(bool normalize = true, float min = 0, float max = 1). This would allow you to create your own frequency buckets.

I'll also add a function getBandFrequency(uint8_t index) to get the Hz value associated with the bucket index created by computeFrequencies(uint8_t bandSize). That way someone could display the Hz name under the visualization bars or create a name for the bucket.

chinswain commented 3 months ago

That would be amazing - I'll send you some honey as a thank you :)

sheaivey commented 2 months ago

šŸÆ Just to add some update. I started down the path of modifying the existing AudioAnalysis and realized its kinda a monster to shoehorn this feature in. SO I wound up creating two new classes and that make it much easier to work with for creating frequency range buckets with normalization, peaks, and auto leveling.

The new analysis classes AudioFrequencyAnalysis and FrequencyRange (still deciding on the name) will look something like below.

// this class creates a bucket for frequencies between a range of Hz to reside in. 
// Example: FrequencyRange bass(20, 199); // get all the low end frequencies
// Example: FrequencyRange mid(200, 1999); // get all the mid range frequencies
// Example: FrequencyRange high(2000, 20000); // get all the high frequencies
// Example: FrequencyRange vu(20, 20000); // get the full human audible frequency range
// It takes care of normalizing and storing peaks and auto leveling for that range of frequencies.
class FrequencyRange {
  // ...
  FrequencyRange(lowHz, highHz);
  float getValue(); // gets normalized value
  float getPeak(); // gets normalized peak value
  uint16_t getMaxFrequency();  // gets the highest amplitude Hz from within frequency range.
  void loop(); // called from AudioFrequencyAnalysis, calculates the frequency range amplitude from FFT
  //...
}

// this is the primary class that calculates the FFT and then manages all the FrequencyRanges added to it.
class AudioFrequencyAnalysis {
  //...
  void loop(samples, sampleSize, sampleRate); // computes the FFT and loops over all the FrequencyRanges that have been added to it.
  void addFrequencyRange(FrequencyRange *_frequencyRange); // add FrequencyRange to manage (add up to 64 FrequencyRange)
  uint16_t getMaxFrequency(); // gets the highest amplitude Hz from all the samples.
  FrequencyRange * getMaxFrequencyRange(); // returns the FrequencyRange with the highest amplitude.
  //...
}

I probably have a few days left until the new classes are complete and ready for use but my initial tests are yielding good results and it is a lot easier to work with the Frequencies.

chinswain commented 2 months ago

Thanks so much for doing this! If you are interested in the results I'll update here once I've started collecting the data. I intend to capture the frequency during various events, such as swarming and queenless.

sheaivey commented 2 months ago

OK, I have updated the develop branch to have the new FrequencyRange, AudioFrequencyAnalysis classes mentioned above.

Here are some examples.

examples/FrequencyRange/FrequencyRange.ino Simple serial output of some FrequencyRanges

examples/TTGO-T-Display/FrequencyRange-Visuals/FrequencyRange-Visuals.ino Playing with lots of FrequencyRanges

I have not had time to document the new classes but hopefully the examples are of some help. They seem to be working well.

Some things to note:

FrequencyRange.getValue() // returns raw amplitude value (You might want to use this one for your project)
FrequencyRange.getValue(0, 255) // returns normalized amplitude value based on a rolling min/max (It's constantly adjusting the gain)
FrequencyRange.getPeak() vs getPeak(0,255) // raw or normalized functionality

FrequencyRange.getMax() // returns the raw max amplitude ever seen
FrequencyRange.getMin() // returns the raw min amplitude ever seen

FrequencyRange.getMaxFrequency() // returns the highest amplitude's frequency in Hz within the frequency range.

FrequencyRange(50,200); // my calculation: all amplitudes returned from the FFT in this range are summed together. This means that ranges are not equal to each other and should be interpreted on their own.
FrequencyRange(200,2000); // this range spans more FFT buckets so its max value could be much larger.
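
To make the point about summed ranges concrete, here is a small stand-alone sketch. It is not the library's code: `sumRange` and `RangeTotal` are hypothetical names, and the choice to select bins by their centre frequency is an assumption for illustration.

```cpp
#include <cstddef>
#include <vector>

struct RangeTotal {
    float sum;         // summed magnitude of every bin in the range
    std::size_t bins;  // how many FFT bins the range covered
};

// Sum the magnitude bins whose centre frequency lies in [lowHz, highHz].
RangeTotal sumRange(const std::vector<float>& magnitudes, std::size_t fftSize,
                    float sampleRate, float lowHz, float highHz) {
    RangeTotal total{0.0f, 0};
    if (fftSize == 0) return total;
    const float binWidth = sampleRate / static_cast<float>(fftSize);  // Hz per bin
    for (std::size_t k = 0; k < magnitudes.size(); ++k) {
        const float hz = binWidth * static_cast<float>(k);
        if (hz >= lowHz && hz <= highHz) {
            total.sum += magnitudes[k];
            ++total.bins;
        }
    }
    return total;
}
```

With a perfectly flat spectrum of 1.0 per bin and 8 Hz bins (1024 points at 8192 Hz), the 50 to 200 Hz range covers 19 bins and sums to 19, while 200 to 2000 Hz covers 226 bins and sums to 226. Dividing `sum` by `bins` is one way to make ranges of different widths roughly comparable, if that suits the application.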

However, the audible spectrum is logarithmic, so higher frequencies need more FFT buckets because they produce smaller amplitudes. You can create a spreadsheet to help create frequency ranges that compensate for this. I attempted to add a _highFrequencyRollOffCompensation for this reason but it's not perfect.
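
As an alternative to a spreadsheet, logarithmically spaced edges can be generated in code. This is a minimal sketch with a hypothetical helper name. It only produces the edges and does not attempt any roll-off compensation.

```cpp
#include <cmath>
#include <vector>

// Return count + 1 band edges spaced logarithmically between lowHz and highHz,
// so that every band spans the same frequency ratio.
std::vector<double> logBandEdges(double lowHz, double highHz, int count) {
    std::vector<double> edges;
    if (count <= 0 || lowHz <= 0.0 || highHz <= lowHz) return edges;
    const double span = highHz / lowHz;
    for (int i = 0; i <= count; ++i) {
        edges.push_back(lowHz * std::pow(span, static_cast<double>(i) / count));
    }
    return edges;
}
```

For example, `logBandEdges(20, 20480, 10)` doubles at every step (20, 40, 80, ... 20480). Each adjacent pair of edges could then be passed as the low and high Hz of a FrequencyRange.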

Hopefully these classes will get you what you were after. It's not perfect, but it seems to work well enough that I can hum out notes, extract the frequency, and then convert that into a MIDI note to use with my synthesizer.
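
The frequency to MIDI step can be sketched with the standard equal-temperament formula, note = 69 + 12 * log2(f / 440), where A4 = 440 Hz is note 69. The function below is a hypothetical illustration and not necessarily how it was done in the project mentioned above.

```cpp
#include <cmath>

// Nearest MIDI note number for a frequency, using A4 = 440 Hz = note 69.
// Returns -1 for non-positive input. The result is not clamped to 0..127.
int frequencyToMidiNote(double hz) {
    if (hz <= 0.0) return -1;
    return static_cast<int>(std::lround(69.0 + 12.0 * std::log2(hz / 440.0)));
}
```

A value such as the one returned by FrequencyRange.getMaxFrequency() could be passed in directly. 440 Hz maps to note 69 and 261.63 Hz (middle C) maps to note 60.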

chinswain commented 2 months ago

I'm on holiday for a few weeks - I'll have a play when I'm back :)

sheaivey commented 2 months ago

Looking forward to hearing how testing goes when you are back from holiday.

Also just a heads up I have merged these new classes into master and added some documentation.

chinswain commented 1 month ago

Hey!

This is working great! I've had some ICS-43434 PCBs made up; it seems a much better mic than the old INMP441.

Do you have any recommendations for a "one shot" gathering of data? I was thinking something like just calling the loop example as needed:

void getSound(){

  mic.read(samples); // Stores the current I2S port buffer into samples.
  audioInfo.computeFFT(samples, SAMPLE_SIZE, SAMPLE_RATE);
  audioInfo.computeFrequencies(BAND_SIZE);

  float *bands = audioInfo.getBands();
  float *peaks = audioInfo.getPeaks();
  float vuMeter = audioInfo.getVolumeUnit();
  float vuMeterPeak = audioInfo.getVolumeUnitPeak();

  // Send data to serial plotter
  for (int i = 0; i < BAND_SIZE; i++)
  {
    Serial.printf("%dHz:%.1f,", audioInfo.getBandName(i), peaks[i]);
  }

  // also send the vu meter data
  Serial.printf("vuValue:%.1f,vuPeak:%.2f", vuMeter, vuMeterPeak);

}

I need to create a better mic enclosure to go inside the beehive now, the bees seem to dislike any gaps so keep sealing up the holes.

sheaivey commented 1 month ago

That's the old way of getting the full ~50 Hz to 20,000 Hz range. It will work, but it does not give you fine control over exact frequency ranges.

Here is an example of the new way which is a little more lengthy but also explicit in what each bucket represents. You can make your own logarithmic frequency ranges and compensation. https://github.com/sheaivey/ESP32-AudioInI2S/blob/develop/examples/TTGO-T-Display/FrequencyRange-Visuals/FrequencyRange-Visuals.ino#L50-L113