neurodsp-tools / neurodsp

Digital signal processing for neural time series.
https://neurodsp-tools.github.io/
Apache License 2.0
283 stars 61 forks source link

Tech-Audit #1: Code & Analysis #193

Open TomDonoghue opened 4 years ago

TomDonoghue commented 4 years ago

This issue is part of a 'technical audit' of neurodsp, with @elybrand.

The first task / goal is to check through the code and in particular, all the analysis functions. For the relevant sub-modules, we primarily want to check that everything, as implemented, is technically sound, and that the settings and implementations are appropriate for our purposes. It's also an opportunity to identify any places we could add extensions to functionality, and/or update and replace current approaches, if there appear to be better options available.

In the case of filt, timefrequency, and spectral, these sub-modules implement basic and standard DSP procedures for the application to neuroscientific signals. The goal here is to check that these functions perform as expected, and look through to check for any possible issues with the current implementations - doing a check & audit for correctness.

In the case of burst and rhythm, these sub-modules implement particular algorithms that have been proposed for the analysis of neuroscientific data. In this case, I think it makes sense to perhaps have a look at these functions and, in so far as you can, check that the implementations seem technically valid (that there are no mistakes in the calculations). This doesn't need to address whether these algorithms are ideal / well designed for there stated tasks (I think that's somewhat out of scope for this review).

How you go about doing this is largely up to you! As you mentioned, I think using some simulations and checking that outputs are all as expected makes sense (these can be collected and later used to augment our code tests). For anything that comes up, feel free to comment it here, start a chat on Slack, and/or open a new issue for anything we should address.

elybrand commented 4 years ago

So I finished reading through the filter tutorial and submodule. I have a few comments, which I'll list in decreasing urgency.

Your method of implementing an IIR filter risks running into some serious numerical instability. In iir.py, apply_iir_filter(), you call the filtfilt method. For high-order butterworth filters, this could lead to disaster. Instead, you should use sosfiltfilt(). This requires only minor modifications. Namely, when calling scipy.signal.butter(), use output='sos'. Feed this, instead of the b values and a values, into sosfiltfilt.

Here's a minimal working example, inspired from the filter tutorial, which shows how these two methods diverge when the butterworth order is 5.

import numpy as np
from neurodsp.filt import filter_signal
from neurodsp.sim import sim_combined
from neurodsp.utils import create_times
from neurodsp.plts.time_series import plot_time_series
from scipy.signal import butter, sosfiltfilt

# Set the random seed, for consistency simulating data
np.random.seed(0)

# General setting for simulations
fs = 1000
n_seconds = 5

# Generate a times vector, for plotting
times = create_times(n_seconds, fs)

# Set the frequency in our simulated signal
freq = 6

# Set up simulation for a signal with an oscillaton + noise
components = {"sim_powerlaw": {"exponent": 0}, "sim_oscillation": {"freq": 6}}
variances = [0.1, 1]

# Simulate our signal
sig = sim_combined(n_seconds, fs, components, variances)

# Define a frequency range to filter the data
f_range = (4, 8)
butterworth_order = 5

# Bandpass filter the data, across the band of interest
sig_filt = filter_signal(
    sig, fs, "bandpass", f_range, filter_type="iir", butterworth_order=butterworth_order
)

# Now compare to using sosfilt.
sos = butter(butterworth_order, f_range, btype="bandpass", output="sos", fs=fs)
sig_filt_sos = sosfiltfilt(sos, sig)

plot_time_series(times, [sig, sig_filt_sos], ["Raw", "SOS Filtered"])
plot_time_series(times, [sig, sig_filt], ["Raw", "Filtered"])

I'm happy to incorporate these changes if you'd like. Just let me know if so.

If you're curious as to why this helps the stability, the reason is because sos stands for "second order series". It factors the transfer function into a product of rational functions where the numerators and denominators are quadratics, rather than higher order polynomials. These factors can now be iteratively applied to get the result.

elybrand commented 4 years ago

In fir.py, the compute_filter_length() function forces the filter length to be odd. In effect, this means you'll always be using a Type I filter. Is this intentional?

elybrand commented 4 years ago

In both fir.py and iir.py, when you call scipy.signal.firwin (resp. scipy.signal.butter), you feed in the normalized frequencies, whereby I mean you've divided by the nyquist frequency.

In the case of firwin, the nyq key word argument is deprecated at least as of scipy 1.50. You can now feed in the sampling rate directly!

This is also the case for butter(). You can optionally feed in the sampling rate as a keyword argument. I did this in my minimal working example in the comment above.

elybrand commented 4 years ago

In iir.py, design_iir_filter() throws a warning if the pass_type is not bandstop. Why are you throwing this warning? Butterworth filters can certainly be used for lowpass and highpass filters, despite that their effective transition band can be quite large.

elybrand commented 4 years ago

Just finished working through the time-frequency submodule. Here are some comments in decreasing order of urgency.

In hilbert.py, your instantaneous phase computation can generate a discontinuous phase-curve. This is mildly problematic for when you compute instantaneous frequency, as you're trying to differentiate a non-differentiable function. It looks like you fudge the instantaneous phase in the freq_by_time() function by adding 2*pi to all locations where the pairwise differences are negative. A more elegant way of addressing this would be to use np.unwrap(np.angle(robust_hilbert()) inside phase_by_time. This will give you a monotone increasing phase. Further, when a user plots instantaneous phase and instantaneous frequency, it will be clearer that one is the derivative of the other.

Here's a minimal working example to show you the difference between what is currently outputted and what using np.unwrap would give you:

import numpy as np
from neurodsp.sim import sim_combined
from neurodsp.utils import create_times
from neurodsp.plts.time_series import plot_time_series
from neurodsp import timefrequency as tf
from matplotlib import pyplot as plt

# Set the random seed, for consistency simulating data
np.random.seed(0)

# General setting for simulations
fs = 1000
n_seconds = 5

# Generate a times vector, for plotting
times = create_times(n_seconds, fs)

# Set the frequency in our simulated signal
freq = 6

# Set up simulation for a signal with an oscillaton + noise
components = {"sim_powerlaw": {"exponent": 0}, "sim_oscillation": {"freq": 6}}
variances = [0, 1]

# Simulate our signal
sig = sim_combined(n_seconds, fs, components, variances)

analytic_sig = tf.robust_hilbert(sig, increase_n=False)
instantaneous_phase = tf.phase_by_time(sig, fs)
instantaneous_phase_cots = np.unwrap(np.angle(analytic_sig))
instantaneous_frequency = tf.freq_by_time(sig, fs)
fig, axes = plt.subplots(2,1, figsize=(8,6))
fig2, axes2 = plt.subplots(2,1, figsize=(8,6))
plot_time_series(times, instantaneous_phase, ax=axes[0])
plot_time_series(times, instantaneous_frequency, ax=axes[1])
fig.suptitle("Misleading Differentiation of Discontinuous Phase", fontsize=16)

plot_time_series(times, instantaneous_phase_cots, ax=axes2[0])
plot_time_series(times, instantaneous_frequency, ax=axes2[1])
fig2.suptitle("Differentiation of Continuous Phase", fontsize=16)

I will incorporate this change with a small PR if you think it's worth including.

elybrand commented 4 years ago

Inside wavelets.py, calling compute_wavelet_transform() returns a time-frequency diagram where time is on the vertical axis and frequency is plotted on the horizontal axis. This goes against the conventions that I typically see when looking at time-frequency diagrams. It's usually the case that time is on the x-axis and frequency is on the y-axis. I think this is probably just an aesthetic point, but users who are accustomed to calling wavelet transforms like scipy.signal.cwt may be confused by this change of convention.

TomDonoghue commented 4 years ago

Thank you @elybrand for great detailed comments! Here I'll do some quick replies to the points for filtering, and tag in lab members as needed to check anything.

Your method of implementing an IIR filter risks running into some serious numerical instability. In iir.py, apply_iir_filter(), you call the filtfilt method. For high-order butterworth filters, this could lead to disaster. Instead, you should use sosfiltfilt(). This requires only minor modifications. Namely, when calling scipy.signal.butter(), use output='sos'. Feed this, instead of the b values and a values, into sosfiltfilt.

Thanks for the working example, very useful to see the problem! And like, woah. To the best of my knowledge, the updates you propose make sense to me. Unless @rdgao has any more thoughts on this, I think we can move forward with your proposed updates, if you can start a PR for this.

The only current implementation detail that comes to mind is that this update seems like it might have some downstream effects of requiring updates to how we manage filter coefficients in sub-functions such as compute_frequency_response (?). These seem like implementation level things, and I think can be dealt with in the PR.

In fir.py, the compute_filter_length() function forces the filter length to be odd. In effect, this means you'll always be using a Type I filter. Is this intentional?

I'm going to pitch this one to @rdgao, because I'm not sure on the answer.

In both fir.py and iir.py, when you call scipy.signal.firwin (resp. scipy.signal.butter), you feed in the normalized frequencies, whereby I mean you've divided by the nyquist frequency.

This, as I understand it, is basically suggesting to update to use newer API elements from scipy, without changing any functionality? Under that understanding, I'd say yes, let's do that! Do you want to add this to a general filter PR? (Or, if you prefer, make it a separate PR).

In iir.py, design_iir_filter() throws a warning if the pass_type is not bandstop. Why are you throwing this warning? Butterworth filters can certainly be used for lowpass and highpass filters, despite that their effective transition band can be quite large.

Yeah, as I understand it, the original warning was to suggest against what might be a somewhat un-ideal (though valid) filter design. Within neuroscience, my understanding is that IIR's are generally only really used for line noise bandstop, and for other use cases FIR have better properties for the task at hand. That said, as you say, other filter types are valid, and since we can't in general know the context or check and warn for overall ideal properties, I agree on we don't need this warning, and can & probably should consider it up to the user to choose filters. We can still mention some recommendations in the tutorials. So, unless anyone disagrees, let's just drop this warning.

TomDonoghue commented 4 years ago

And some comments on timefrequency:

In hilbert.py, your instantaneous phase computation can generate a discontinuous phase-curve. This is mildly problematic for when you compute instantaneous frequency, as you're trying to differentiate a non-differentiable function. It looks like you fudge the instantaneous phase in the freq_by_time() function by adding 2*pi to all locations where the pairwise differences are negative. A more elegant way of addressing this would be to use np.unwrap(np.angle(robust_hilbert()) inside phase_by_time. This will give you a monotone increasing phase. Further, when a user plots instantaneous phase and instantaneous frequency, it will be clearer that one is the derivative of the other.

Thank you again for the example! As I understand this, there are two points here:

If I'm understanding, and we can separate these things, I think my first thoughts are:

If I'm understanding, and we try not to change phase_by_time, then the unwrapping could be done in freq_by_time, with the core computation being something like return np.diff(np.unwrap(phase_by_time(sig, ...))), right?. This replaces or voids needing at least some of the current 'hacks' in freq_by_time, I think?

Does this track? I think keeping a wrapped phase output is consistent with how people might use phase_by_time in neuroscience contexts, and means we don't have a potentially code-breaking change, but this still let's us fix up freq_by_time. Also soliciting any thoughts from @rdgao on this one.

Inside wavelets.py, calling compute_wavelet_transform() returns a time-frequency diagram where time is on the vertical axis and frequency is plotted on the horizontal axis. This goes against the conventions that I typically see when looking at time-frequency diagrams. It's usually the case that time is on the x-axis and frequency is on the y-axis. I think this is probably just an aesthetic point, but users who are accustomed to calling wavelet transforms like scipy.signal.cwt may be confused by this change of convention.

Yes, this is a good point. Although changing this would change the output, I think we can consider it as basically a bug to be returning like this, and I would vote for updating the shape organization of the returned array. I don't think our wavelets are used all that much, and this at least shouldn't break any other internal code to update, so I vote for changing.

elybrand commented 4 years ago

The only current implementation detail that comes to mind is that this update seems like it might have some downstream effects of requiring updates to how we manage filter coefficients in sub-functions such as compute_frequency_response (?). These seem like implementation level things, and I think can be dealt with in the PR.

This is a very good point. I'll have another look over to see if changing the return type from coefficients to sos would wreak havoc downstream. If it does, I'll punt that PR to @ryanhammonds.

If I'm understanding, and we try not to change phase_by_time, then the unwrapping could be done in freq_by_time, with the core computation being something like return np.diff(np.unwrap(phase_by_time(sig, ...))), right?. This replaces or voids needing at least some of the current 'hacks' in freq_by_time, I think?

If the return of phase_by_time is unchanged, then it's a pretty easy fix. The hacks in freq_by_time can be addressed by calling np.wrap. I will defer to your expertise on whether or not phase_by_time should be monotone or should wrap around. As you say, it may be the convention in neuroscience. I also don't know yet what changing the output does downstream for other applications.

rdgao commented 4 years ago

just hopping on with some potential clarifications:

But we should absolutely use unwrap() in the instantaneous freq calculation. I don't know why the hell it's so janky currently, probably Scott's doing.

elybrand commented 4 years ago

yes, I believe it was designed to be always odd. I don't actually know if there is a "technical" answer to this

I actually realized the answer to this as I was writing out this comment. It turns out to be twofold. First, having an odd number of coefficients means that the linear phase shift (due to symmetry) is an integer shift. This, as you say, does in fact let you find a map between time stamps of the filtered and unfiltered data. What's more, type II filters (even number of coefficients) are a poor choice for bandstop and high pass filters because their transfer functions have roots at the nyquist frequency z=-1 = e^{i pi}. Using a type II high pass filter on the signal [1, -1, 1, -1], a signal whose lone frequency is the nyquist frequency, would unfortunately give you [0,0,0,0]. Since NeuroDSP more or less treats all pass types the same, it makes sense to use Type I filters whose transfer functions do not have a zero at z=-1.

every resulting data point (and subsequent transforms, like the phase series) has a corresponding timestamp from the original time series. A lot of applications in neuroscience take these signals and do time-locked epoching or averaging

That makes sense. However, I just did a sanity check and I couldn't find anywhere in NeuroDSP that tries to correct for the phase shift with FIR filters. I spent a good amount of time trying to cook up signals which would break it and I couldn't. I went down this deep rabbit hole to see why the phase shift was visibly non-existent, and the best I can conjecture is that firwin uses scipy's minimum_phase function to design an FIR filter with minimal phase shift. My bet is that the NeuroDSP design choice of forcing the number of coefficients to be odd means that minimum_phase achieves zero phase. Whether firwin calls minimum_phase or not, the filtering is probably good enough for NeuroDSP, given how hard I had to work to try and break it.

in some cases pretend that the filtered signal was not literally just convolved with neighboring data points.

lol

elybrand commented 4 years ago

After reading through the burst submodule, everything seems fine to me. As for rhythm, I only have one concern.

I skimmed the Gips et al paper and it looks like the motifs are suppose to be disjoint, i.e. don't overlap. If this is the case, then I believe there is a problem with _find_new_window_idx. In that method, it randomly samples indices from the signal and checks whether the distance from the starting index of the window is at least spacing_n_samps large. You need to select from indices which are at least spacing_n_samps away from the entire window to ensure they are disjoint. Here's an example that breaks the code.

import numpy as np

np.random.seed(0)

def _find_new_window_idx(window_starts, spacing_n_samps, n_samp, tries_limit=1000):
    """Find a new sample for the starting window."""

    for n_try in range(tries_limit):

        # Generate a random sample & check how close it is to other window starts
        new_samp = np.random.randint(n_samp)
        dists = np.abs(window_starts - new_samp)

        if np.min(dists) > spacing_n_samps:
            break

    else:
        raise RuntimeError('SWM algorithm has difficulty finding a new window. \
                            Try increasing the spacing parameter.')

    return new_samp

# Current windows
# [0          9] 10 11 12 [13            22].
# There are no feasible moves for either window, assuming
# they must be disjoint.

len_sig = 23
window_length = 10
n_samp = len_sig - window_length
window_starts = np.array([0, 13])
spacing_n_samps = 3

idx = _find_new_window_idx(window_starts, spacing_n_samps, n_samp)
print(idx) # == 5

You'll have to feed in the window lengths to make sure you get a feasible index. You'll also have to compute the distance from a putative new window start index to the set of indices of each window.

TomDonoghue commented 4 years ago

Catching up on things here:

Filtering

Time-Frequency

Rhythm

What you point out generally makes sense to me, but overall I'm not too familiar with this algorithm.

elybrand commented 4 years ago

is this something relatively easy to update & fix?

At the very least, we'd have to write a function which would verify whether a proposed starting index is valid. That kind of function wouldn't be too difficult to write. Once we have that, you could use rejection sampling as is done now until you get a valid index or exceed the maximum number of iterations.

rdgao commented 4 years ago

not me, and no idea about the Gips algorithm, probably better to ask Natalie, tbh.