Copied over from fooof-tools/fooof#124, original post by @rdgao with comments by @parenthetical-e :
Currently, peaks are identified based on a combination of user-defined threshold and comparison to the residual variance of the PSD after peak fitting.
A more statistically grounded alternative compares the peak height to a theoretical null under a colored-noise model, in which the power at each frequency follows an exponential distribution. Given the number of windows used to compute the PSD (which sets the degrees of freedom), one can compute a p-value and compare it to a pre-defined alpha to judge whether the detected "peak" could have occurred by chance.
Since fooof already fits an aperiodic component, that fit would provide the power under the null model to compare against.
Note that this will require additional inputs, namely the number of windows the user averaged over to compute the PSD (if using Welch's method).
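As a rough illustration of the test described above, here is a minimal sketch that computes a p-value for a single peak. It assumes the PSD is an average over `n_windows` independent, non-overlapping periodograms, so that `2 * n_windows * observed / null` follows a chi-square distribution with `2 * n_windows` degrees of freedom. The function name `peak_p_value` and its arguments are hypothetical and not part of fooof. Overlapping Welch windows would give fewer effective degrees of freedom than assumed here.

```python
import math


def peak_p_value(observed_power, null_power, n_windows):
    """Probability that averaged noise power reaches `observed_power`.

    Hypothetical helper, not part of fooof. Powers are in linear units
    (not log10). `null_power` is the power expected under the aperiodic
    (colored-noise) model at the peak frequency.
    """
    if n_windows < 1:
        raise ValueError("n_windows must be at least 1")
    if null_power <= 0 or observed_power < 0:
        raise ValueError("null_power must be > 0 and observed_power >= 0")

    # Half of the chi-square statistic 2 * n_windows * observed / null.
    x = n_windows * observed_power / null_power
    if x == 0:
        return 1.0

    # For even degrees of freedom (2 * n_windows), the chi-square survival
    # function has a closed form: exp(-x) * sum_{j < n_windows} x**j / j!.
    # Each term is evaluated in log space to avoid overflow.
    log_x = math.log(x)
    total = sum(
        math.exp(j * log_x - x - math.lgamma(j + 1))
        for j in range(n_windows)
    )
    return min(1.0, total)


# Example: a peak at 3x the aperiodic power, PSD averaged over 8 windows.
alpha = 0.05
p = peak_p_value(observed_power=3.0, null_power=1.0, n_windows=8)
is_significant = p < alpha
```

With a single window the expression reduces to `exp(-observed / null)`, the exponential tail mentioned above; averaging over more windows makes the same power ratio less likely under the null.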
Comment / Suggestion:
Define a separate stats submodule whose functions take fooof output (slope, amp, etc.) and any other needed arguments (e.g. window_n). This leaves the fooof API untouched.
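To make the suggestion concrete, a function in such a submodule might look like the sketch below. It takes only the fitted aperiodic parameters and returns the null-model power at the frequencies of interest. The name `aperiodic_null_power` is hypothetical, and the sketch assumes the aperiodic fit has the form `log10(power) = offset - exponent * log10(freq)` with no knee; a different aperiodic mode would need a different formula.

```python
import math


def aperiodic_null_power(freqs, offset, exponent):
    """Linear-scale power predicted by the aperiodic fit at each frequency.

    Hypothetical stats helper, not part of fooof. Assumes the fit is
    log10(power) = offset - exponent * log10(freq), i.e. no knee term.
    """
    return [10.0 ** (offset - exponent * math.log10(f)) for f in freqs]


# Example: null power at two candidate peak frequencies (in Hz).
null_at_peaks = aperiodic_null_power([10.0, 20.0], offset=2.0, exponent=1.0)
```

Because the function only consumes values that a fit already produces, it can live outside the fitting object, which is the point of keeping the stats in a separate submodule.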
Reply:
I agree with this 90% of the way. The one integrated use case I can think of is to let the significance drive peak detection, i.e. iteratively toss out insignificant peaks and refit the slope, though I'm not sure how you would then constrain the fitting so that it doesn't find the same peak again.
Reply:
If you want to toss or down-weight small peaks, isn’t it better to use an explicit regularizer instead? Say: min L(x,y) + lambda*|n|, where |n| is the number of peaks?
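One way to read the regularizer idea is as a penalized choice of the number of peaks. The sketch below assumes the model has already been fit with 0, 1, 2, ... peaks and that the fit error for each is available. `select_n_peaks`, `losses`, and `lam` are hypothetical names, and the numbers are made up for illustration.

```python
def select_n_peaks(losses, lam):
    """Pick the peak count n that minimizes losses[n] + lam * n.

    Hypothetical helper. `losses[n]` is the fit error of a model with
    n peaks; `lam` is the penalty paid for each additional peak.
    """
    penalized = [loss + lam * n for n, loss in enumerate(losses)]
    return min(range(len(penalized)), key=penalized.__getitem__)


# Made-up fit errors for models with 0, 1, 2 and 3 peaks.
losses = [10.0, 4.0, 3.5, 3.4]
n_strong_penalty = select_n_peaks(losses, lam=1.0)  # small gains don't pay off
n_no_penalty = select_n_peaks(losses, lam=0.0)      # always takes the most peaks
```

A larger `lam` drops peaks that only marginally improve the fit, which has a similar effect to tossing insignificant peaks, without a separate prune-and-refit loop.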
Adding a note here - a relevant question / request came up on the main repo asking about determining the goodness-of-fit / the error bar, per peak: https://github.com/fooof-tools/fooof/issues/238
One possible way to ground peak detection more firmly in statistics is the significance test for spectral peaks described here: https://atmos.washington.edu/~dennis/552_Notes_6b.pdf (pg 167, statistical significance of spectral peaks).