neurodsp-tools / neurodsp

Digital signal processing for neural time series.
https://neurodsp-tools.github.io/
Apache License 2.0
281 stars 61 forks source link

[MNT] - Update filter type checks & inference #322

Closed TomDonoghue closed 7 months ago

TomDonoghue commented 1 year ago

Responds to #321

Previously, one could end up with an inconsistent set of input parameters to the general filter_signal function - for example, passing in butterworth_order without explicitly setting IIR filter_type. This update switches filter_type to default to None, an if not explicitly set, it is inferred from the given parameters, with additional updates to check that the given set of parameters are consistent.

TomDonoghue commented 1 year ago

Forgot to tag people who might be interested in this update: @QuirinevEngen, @mwprestonjr, & @voytek - does this make sense / look good in terms of an update related to the filter type discussion that started on slack and described in #321

voytek commented 1 year ago

This looks totally reasonable to me. If Q and MJ don't seen any issue, then please merge.

TomDonoghue commented 7 months ago

@QuirinevEngen & @mwprestonjr - just following up here on whether you have any thoughts / suggestions on this suggested update about filter specification? if it looks good to y'all, I think we can merge this in

TomDonoghue commented 7 months ago

Hey @mwprestonjr - thanks for checking in, and for the suggestions!

I do like both suggestions, but they are both complicated by the current default definition of n_cycles as 3 (in filter_signal and in filter_signal_fir). Because this makes n_cycles always default to 3, it can't be checked against None, nor obviously inferred if the user has changed it (we could check if the value is not 3, but that's not super ideal).

Because of this, I think the main options for better management of n_cycles would be:

Also cc'ing @ryanhammonds for any thoughts on this

ryanhammonds commented 7 months ago

It does feel odd that n_cycles defaults to 3, but n_seconds can be passed to implicitly overwrite. Defaulting to n_cyles=None and then defaulting to n_cycles=3 within the func seems logical without breaking much.

The mixing of these fir and iir kwargs seems tricky. Next time a breaking release is made, it may be worth considering something like:

def filter_signal(sig, fs, pass_type, f_range, fir_params=None, iir_params=None,
                  print_transitions=False, plot_properties=False, return_filter=False):

This would explicitly separate the two sets of kwargs and maybe simplify the checks a bit, but comes at the cost of breaking things.

TomDonoghue commented 7 months ago

Yeh, I totally agree with Ryan - and missed this option in thinking through this before: by defaulting n_cycles to None in the function call, but checking and setting it to 3 if nothing else is defined lets us do the checks we want, without changing any current behavior. I pushed a commit to do this! Note that we do now check that not both n_cycles and n_seconds are defined (and fail if so), but this is done in the FIR code, not the checks related to filter_signal, since it is related to calling filter_signal_fir directly as well.

TomDonoghue commented 7 months ago

I was thinking about @ryanhammonds other point about separating the arguments in separate dictionaries, and I agree - and I think we can do a version of this without it (really) being breaking. In thinking of this, I was thinking that filter_signal should really match compute_spectrum - they are both "wrapper" function providing access to multiple version of the thing they do. I think we don't necessarily need fir_params & iir_params, but could instead have filter_params, where the contents should match the filter type (which is what compute_spectrum does with **kwargs. Even better, we can update to use filter_params right now, without really changing how anything functions.

In the latest commit, I dropped all arguments in filter_signal that are filter type specific, and added **filter_kwargs. Doing so leads to changing the _filtertype_checks functions into a more general _filter_input_checks which checks that the provided arguments are all consistent with the specified (or inferred) filter type.

This all works in a way that is almost non-breaking. Notably, I didn't have to change any tests to work, and because of how the inputs / defaults are managed in the sub-functions, none of the default values for non-explicitly passed in arguments should change.

The almost part of the non-breaking part is that the API for filter_signal does necessarily change a bit - it still accepts all the same inputs, but the order switches a bit, so there are things that would work before, that will now error - but because of the order of the previous version, I actually think they are constructions that are pretty unlikely (it's only function calls with a large numbers on unnamed values passed in).

For example, the following would have previously given valid FIR filter with n_cycles of 5: filter_signal(sig, fs, 'bandpass', (8, 12), 5) This would now fail, because the last input gets processed as filter type instead of n_cycles.

For IIR, it's even more tenious - this is the simplest inputs that would previously work but now fail: filter_signal(sig, fs, 'bandpass', (8, 12), 3, None, True, 7)

I don't think I ever use filter_signal in a way that would actually be broken by this change, and I would tend to think it's not very likely many people do (basically because the previous API had a lot of parameters in a not super obvious order, so it would be more likely to use keyword arguments instead of inputs by order for anything after the first ~3 inputs, I think.

So - what do we think of the updated organization?

Sidenote: if we like it, it's not quite finished - the docs might need cleaning up a bit to render nicely, and we should probably add a test for _filter_input_checks.

If we don't like this - I'll revert the last commit, and we can go back to keeping all the other updates in the PR

mwprestonjr commented 7 months ago

I like the latest changes!

ryanhammonds commented 7 months ago

Looks good! I wouldn’t worry too much about about the almost not breaking part - I’d argue named kwargs should be used instead of inferring args as kwargs based on order.

I like the kwarg checking too - my one complaint with **kwargs is accidentally misspelling a kwarg and the func silently accepting. But since there are checks on them, this avoids the issue.

TomDonoghue commented 7 months ago

Sounds good, this seems to be the way to go then! I did some last clean ups on the docs and adding a test to the checks, and so if this looks good to all, then we should be good to go here!

By the way, the docstring looks slightly odd in plain text, but this way leads to this representation on the sphinx site: Screen Shot 2024-02-23 at 10 33 19 PM

I'm definitely open to suggestions for tweaking this (I'm not aware of a clear standard for how to document exactly what we are trying to document here) - but as far as I can figure out this is the cleanest version that looks ~good in both plaintext and docsite!