[ENH] - Add getter functions for data & model components

TomDonoghue commented 1 year ago

This PR is an exploration related to our broader topic / discussion regarding the 'spacing' of the data, relating to whether we want, for example, parameter values in log or linear spaced values, and whether we want to have a compositional model which adds the components in log or linear space. This PR deals with the isolated data & model components.

The FOOOF object stores the data, resulting model, as well as internal attributes of isolated data & model components.

Specifically:

"data" attributes:
- 'full' data: FOOOF.power_spectrum
- 'aperiodic only' data: FOOOF._spectrum_peak_rm
- 'periodic only' data: FOOOF._spectrum_flat
"model" attributes:
- 'full' model: FOOOF.fooofed_spectrum_
- 'aperiodic only' data: FOOOF._ap_fit
- 'periodic only' data: FOOOF._peak_fit

These are all available, but some are indicated to be private, such that there is not a "proper" way to access them. More importantly, these components, which are computed and stored as an additive model in log space are not necessarily what one wants if one wants an additive linear model. Without this update, getting component values in linear space, that follow a linearly additive model is otherwise not particularly obvious - and yet when one does an analysis such as "computing aperiodic removed power", using the linear representation is arguably the way to go.

This adds getter functions to access data & model components, officially supporting access to all of these attributes. Additionally, in getting these components, it allows for specifying a 'space' argument of 'log' or 'linear'. If set to 'log', you get the same as you would accessing the attributes directly (values in log10 space, and following the "additive in log space" combination). If set to 'linear', you get recomputed components in linear space. Notably, this is not just the "unlogged" component, but rather the transformed component such that the values are in linear space, and it follows the "additive in linear space" model - so linear AP + PE = linear power_spectrum (or model). For some, perhaps many use cases, when accessing these components, this linearly additive model is arguably what one wants.

Notes

This PR is only about data & model components, and does not handle computing different spacings / additive models in parameter values - an update which is a bigger change, and is in progress for 2.0. Since this update is not a breaking change, it's currently targeted for 1.1 (might as well).

Reviewing

Note - at this stage I want to check in on this idea (review comments on the idea rather than details of the implementation are welcome). There are some details to double check if we like this idea, including things like naming (I'm not totally sold on the method / argument names - thoughts welcome).

Example

The notable example(s) from this update are really about getting the linear values.

Getting linear data components:

data_full_lin = fm.get_data('full', 'linear')
data_ap_lin = fm.get_data('aperiodic', 'linear')
data_pe_lin = fm.get_data('peak', 'linear')
plot_spectra(fm.freqs, [unlog(fm.power_spectrum), data_full_lin, data_ap_lin + data_pe_lin], 
             linestyle=['-', 'dotted', 'dashed'], colors=['black', 'red', 'green'],
             alpha=[0.3, 0.75, 0.75], figsize=(6, 5))

This is gives data components in linear-spaced values - that also add up in linear space, as seen from the output plot:

We can do the same for model components:

model_full_lin = fm.get_model('full', 'linear')
model_ap_lin = fm.get_model('aperiodic', 'linear')
model_pe_lin = fm.get_model('peak', 'linear')
plot_spectra(fm.freqs, [powers, model_full_lin, model_ap_lin + model_pe_lin], 
             linestyle=['-', 'dotted', 'dashed'], colors=['black', 'red', 'green'],
             alpha=[0.3, 0.75, 0.75], figsize=(6, 5))

Full code for exploring examples

from fooof import FOOOF
from fooof.core.utils import unlog
from fooof.plts import plot_spectra
from fooof.sim import gen_power_spectrum
from fooof.utils.download import load_fooof_data

# Load example spectra - using real data here
freqs = load_fooof_data('freqs.npy', folder='data')
powers = load_fooof_data('spectrum.npy', folder='data')

fm = FOOOF(verbose=False)
fm.fit(freqs, powers)

# Data - log
data_full_log = fm.get_data_component('full', 'log')
data_ap_log = fm.get_data_component('aperiodic', 'log')
data_pe_log = fm.get_data_component('peak', 'log')
plot_spectra(fm.freqs, [fm.power_spectrum, data_full_log, data_ap_log + data_pe_log], 
             linestyle=['-', 'dotted', 'dashed'], colors=['black', 'red', 'green'],
             alpha=[0.3, 0.75, 0.75], figsize=(6, 5))

# Model - log
model_full_log = fm.get_model_component('full', 'log')
model_ap_log = fm.get_model_component('aperiodic', 'log')
model_pe_log = fm.get_model_component('peak', 'log')
plot_spectra(fm.freqs, [fm.power_spectrum, model_full_log, model_ap_log + model_pe_log],
             linestyle=['-', 'dotted', 'dashed'], colors=['black', 'red', 'green'],
             alpha=[0.3, 0.75, 0.75], figsize=(6, 5))

# Data - linear
data_full_lin = fm.get_data_component('full', 'linear')
data_ap_lin = fm.get_data_component('aperiodic', 'linear')
data_pe_lin = fm.get_data_component('peak', 'linear')
plot_spectra(fm.freqs, [unlog(fm.power_spectrum), data_full_lin, data_ap_lin + data_pe_lin], 
             linestyle=['-', 'dotted', 'dashed'], colors=['black', 'red', 'green'],
             alpha=[0.3, 0.75, 0.75], figsize=(6, 5))

# Model - linear
model_full_lin = fm.get_model_component('full', 'linear')
model_ap_lin = fm.get_model_component('aperiodic', 'linear')
model_pe_lin = fm.get_model_component('peak', 'linear')
plot_spectra(fm.freqs, [powers, model_full_lin, model_ap_lin + model_pe_lin], 
             linestyle=['-', 'dotted', 'dashed'], colors=['black', 'red', 'green'],
             alpha=[0.3, 0.75, 0.75], figsize=(6, 5))

ryanhammonds commented 1 year ago

This approach makes sense to me and will be particularly useful for people not as familiar with the private attributes.

Maybe direct attribute access could be added in addition to the get methods? I like accessing the model with fm._ap_fit and it's less verbose than fm.get_model_component('aperiodic', 'log'), so maybe just an attribute with fm.ap_fit so it's less hidden? Un-logging could be handled with an @property method for something like fm.ap_fit_lin.

voytek commented 1 year ago

Yeah, this makes sense to me, too. Do we have a note somewhere to put this in the tutorials / examples, to show how log(x-y) isn't the same as log(x) - log(y), using a practical, simulated, example spectrum?

TomDonoghue commented 1 year ago

Yeh, to @ryanhammonds's point, this does add a bit of verboseness to accessing this stuff - I think the balancing act is that I hope this approach is more explicit, even if it's a bit longer (and for the power user, accessing _ap_fit directly or similar will still work.

The alternative is to update to have more explicitly accessible component attributes, and use properties, so model access could be something like this (example for the periodic model component - we would then do similar across all components):

@property
def periodic_fit_log(self):
    return self._peak_fit

@property
def periodic_fit_lin(self):
    return  unlog(self.fooofed_spectrum_) - unlog(self._ap_fit)

The above does have the benefit of somewhat easier / more direct access. However, it does also start adding more public attributes to the objects, in a way that is a bit more "spread out", and in a way that I feel would be sorta less documented (since it might be less clear, for example, that the periodic_fit_lin is not just the unlogged peak fit, for example). We also already have a convention of having get / set / add function - so in that sense I think the current approach might be more consistent with that. At the end of the day - it could go either way - anyone have strong feelings on getters vs properties?

Somewhat related - I do think if keep getters we could trim the names, maybe to get_model and get_data? Then accessing the log AP fit would look like fm.get_model('aperiodic') which doesn't feel too far off accessing _ap_fit.

To @voytek's point - yeh, associated with this PR should be some updates to the new plot_data_components example (examples/models/plot_data_components), and then perhaps check through for anywhere else that shows these attributes. For the bigger discussion of linear vs log, the broader set of documentation is still targeted to 2.0 (in my mind at least).

ryanhammonds commented 1 year ago

Sounds good! I'm okay with either option you/others would prefer. One thing I like about the attribute method is being able to tab to autocomplete or list possible attr options so I don't have to read the docs when I forget the specs of a gettr func.

One way to prevent a ton of attribute clutter could be to have a separate data class that pulls the private attributes from fm so it would be something like fm.data.ap_fit_log or fm.model.ap_fit_log or similar.

TomDonoghue commented 1 year ago

@ryanhammonds - oohhh, that's an interesting idea to do something like object.data... / object.model.... Maybe hold that thought - I've already been exploring separating out data & model sub-objects in #291 - but hadn't thought of linking them like that! That would be a bigger level change that should be targeted to 2.0 I think.

Related to which - merging this in here at 1.1 doesn't necessarily mean we have to keep the same thing in 2.0 - so for now perhaps we just merge this in as it is for 1.1, and we can update and check through for 2.0 if we want to make a bigger overall change and add / switch to supporting attribute access. Does that sound good? If so, I made some quick fixes / updates here to the code, and updated the example - let me know if you want to do a code check & have any comments, and then I can merge this in to push it through to 2.0 and keep editing from there!

TomDonoghue commented 1 year ago

Awesome! Will merge this in now for 1.1, and then we can explore more options in 2.0!

fooof-tools / fooof