TomasLenc / acf_tools

Utilities for working with autocorrelation to capture periodicity in a signal.
GNU General Public License v3.0
2 stars 0 forks source link

irasa #11

Closed TomasLenc closed 9 months ago

TomasLenc commented 1 year ago

Adding an option to use IRASA to estimate the 1/f component, instead of parametrically fitting it using the original "fooof-like" approach.

TomasLenc commented 1 year ago

The original version of acf_tools was using a model-fitting approach to estimate the 1/f noise (inspired by the fooof package).

But this approach may be problematic, as it relies on fitting a predefined function to the data. As discussed in this paper by Gerster et al. 2022, the function may not be a correct model, especially for physiological data such as EEG. This would result in a wrong fit, eventually biasing the ACF after the estimated 1/f component is subtracted from the spectrum.

An alternative method is IRASA, which consists of two steps:

  1. estimating the 1/f component using resampling
  2. fitting parameters to the estimated 1/f component

Good news is that we only need step 1 for our purposes, which doesn't require fitting an explicit function to the data. Rather, the 1/f is estimated based on resampling, under the assumption that the 1/f fractal noise is invariant after resampling the data, while this operation suppresses any periodic components.

The only parameters that need to be set are the resampling factors. As described in the quote from Gerster et al. below, the choice of the highest resampling factor limits the bandwidth of the signal: image In other words, let's say we choose the highest resampling factor to be 2. In that case, there will be a resampling iteration that will upsample the data by a factor of 2. This means that a frequency that was nyquist/2 Hz under the original sampling rate will become nyquist Hz. Consequently, any frequency higher than nyquist/max_resampling_factor Hz in the original sampling rate will be out of the valid frequency range and should be ignored.

However, we cannot simply ignore the frequenies above nyquist/max_resampling_factor Hz because we need the whole frequency range in order to eventually calculate the autocorrelation function.

So what I did is the following. I chose the highest resampling factor to be 2. This gives us a valid 1/f estimate up to nyquist/2 Hz. For the remaining frequencies up to nyquist, I simply pad the 1/f estimate with zeroes. The rationale is that the power of 1/f will be very small in this higher frequency range anyway.

Here is an example fit to EEG resting-state noise mixed with a simulated periodic signal.

image

Zooming on the low-frequency region shows that irasa may fit the data better, as it is more flexible compared to the 2-parameter fooof model. image

TomasLenc commented 1 year ago

I have also run another simulation, explicitly comparing how well the two methods (fooof and irasa) perform under different snr conditions.

The procedure was as follows:

  1. simulate periodic signal based on a seamless repetition of a rhythmic pattern
  2. calculate the ground truth acf of the clean signal
  3. mix the signal with resting-state EEG noise to achieve a target SNR
  4. calculate acf after subtracting 1/f noise using fooof and correlate it with the ground truth acf
  5. calculate acf after subtracting 1/f noise using irasa and correlate it with the ground truth acf
  6. do this 100 times, separately for a range of SNRs

I did this simulation twice:

  1. only taking the acf obtained after 1/f was subtracted from the spectra
  2. taking the acf after 1/f was subtracted AND only harmonics of signal repetition frequency were retained

I've Fisher-transformed the correlation coefficients to make them normal.

The results are shown in the figure below: fooof_vs_irasa

As we can see, fooof seems to outperform irasa when the 1/f estimate is simply subtracted from the data. This is confirmed by fitting a linear model, which compares the Fisher-transformed correlation between the two methods. image

However, once only harmonics of signal repetition frequency are retained, the two methods perform equally well. image