neuropsychology / NeuroKit

NeuroKit2: The Python Toolbox for Neurophysiological Signal Processing
https://neuropsychology.github.io/NeuroKit
MIT License
1.46k stars 397 forks source link

Documentation of rsp_clean() seems inconsistent with implementation #950

Open danibene opened 5 months ago

danibene commented 5 months ago

Hello! I am a bit confused about the default method for cleaning respiration signals.

In the docstring, it is written that there is linear detrending followed by a 5th order lowpass Butterworth filter with a cutoff of 2Hz: https://github.com/neuropsychology/NeuroKit/blob/3d004b4777c3e59b288f46b9f0f812ac1e04b5d4/neurokit2/rsp/rsp_clean.py#L18-L19

But the implementation seems to just be a 2nd order bandpass Butterworth filter with cutoffs of 0.5 Hz and 3 Hz: https://github.com/neuropsychology/NeuroKit/blob/3d004b4777c3e59b288f46b9f0f812ac1e04b5d4/neurokit2/rsp/rsp_clean.py#L121-L143

Also, in the cited paper, Khodadad et al. (2018) wrote "However, further improvement is also obtained by pre-processing the data using a digital high-pass filter to remove the dominating low frequency contents. Here, a second order high-pass Butterworth filter has been used with a cut-off frequency of 15 breaths/min." Wouldn't that correspond to 0.25 Hz?

Should we update the documentation to reflect that it is a 2nd order bandpass Butterworth filter with cutoffs of 0.5 Hz and 3 Hz?

DominiqueMakowski commented 5 months ago

I think we should adjust the method to match the paper if there's a discrepancy no?

danibene commented 5 months ago

Seems like the discrepancy between the docstring and the implementation started here:

https://github.com/neuropsychology/NeuroKit/commit/d9470093b1a4ae8f8fdd5f57b5f801821171c366

Any idea why that was done? Maybe another option could be to have a default "neurokit" method (in case this implementation does work better) as well as the original implementation based on the one from the paper

DominiqueMakowski commented 5 months ago

Any idea why that was done?

indeed that's strange and unfortunately the committer disappeared from GH it seems. I think the idea was to replace the detrending by the highpass, but...

in case this implementation does work better

The problem is how to benchmark that, I ran a quick search for annotated RSP data just to have some empirical evidence and found this dataset:

https://www.physionet.org/content/bidmc/1.0.0/

That means we could maybe try to compare the time of peaks&troughts after various cleaning procedures vs. their annotations?

And then, if the current version is better, we move it to a new neurokit default method. If not, we drop it. What do you think?


Also they describe another method here based on Lu2006 that I don't think we implement:

https://physiodatatoolbox.leidenuniv.nl/docs/user-guide/physioanalyzer-modules/resp-module.html

danibene commented 5 months ago

Making the decision based on its performance on that dataset sounds good! I just don't know when I would be able to do that (though you or any kind stranger lurking here is welcome to), so my proposal would be:

  1. PR changing the docstring to reflect the current implementation, mentioning this issue & that it currently doesn't reflect the implementation in the paper (so that the default functionality isn't changed before we check its performance)

  2. PR downloading and processing the open-access dataset

  3. PR adding study comparing methods

  4. PR changing or adding implementation depending on results

What do you think?

DominiqueMakowski commented 5 months ago

sounds good! 👌