ppsp-team / HyPyP

The Hyperscanning Python Pipeline
BSD 3-Clause "New" or "Revised" License
75 stars 42 forks source link

analyse.compute_freq_bands filtering step "RuntimeWarning: filter_length" #161

Open micmar15sr opened 1 year ago

micmar15sr commented 1 year ago

Hello, thank you for your project, I'm using the getting_started to try to analyze data that I have recorded for my studies. I have prefiltered with a bandpass filter then a notch and removed the EOG of my signal before starting the pipeline, with the MNE library. Then I epocated my 120 seconds signal with 1 second width windows. When I apply the analyse.compute_freq_bands i get this warning.

RuntimeWarning: filter_length (1651) is longer than the signal (251), distortion is likely. Reduce filter length or filter a longer signal.
  filtered = np.array([mne.filter.filter_data(data[participant]  

that is because my 1 second windows has 251 samples due to the 250 Hz samplign. In the original paper I tried to see if it mandatory the filtering step applied by the compute_freq_bands function at line 696

 for freq_band in freq_bands.values():
        filtered = np.array([mne.filter.filter_data(data[participant],
                                                    sampling_rate, l_freq=freq_band[0], h_freq=freq_band[1],
                                                    **filter_options,
                                                    verbose=False)

I am wondering if this step is mandatory even if I have prefiltered my signal, in case which are the default parameter? If it is not mandatory what about adding a Bolean "filter" option to avoid the filtering before the hilbert trasform? thank you so much for you work, best regards Marcello Miceli

deep-introspection commented 1 year ago

Dear Marcello, Thanks for your message and interest in HyPyP! If you have filtered in advance your whole signal, this step is indeed not necessary. Do not hesitate to suggest a way to handle this use case. Cheers!

micmar15sr commented 1 year ago

Sorry if this is not the proper way, it's the first time I suggest a review of the code. I would suggest adding a boolean variable, default True, letting the user to chose whether or not filtering the signal. In case the variable is false the signal is just assigned to the variable filter as it is. Here the modified version of the method.


def compute_freq_bands(data: np.ndarray, sampling_rate: int, , filter_signal: bool = True,freq_bands: dict, **filter_options) -> np.ndarray:
    """
    Computes analytic signal per frequency band using FIR filtering
    and Hilbert transform.

    Arguments:
        data:
            shape is (2, n_epochs, n_channels, n_times)
            real-valued data to compute analytic signal from.
        sampling_rate:
            sampling rate.
        freq_bands:
            a dictionary specifying frequency band labels and corresponding frequency ranges
            e.g. {'alpha':[8,12],'beta':[12,20]} indicates that computations are performed over two frequency bands: 8-12 Hz for the alpha band and 12-20 Hz for the beta band.
     filter_signal
            bool: wheter to apply a filter on the signal,
            default, true
        **filter_options:
            additional arguments for mne.filter.filter_data, such as filter_length, l_trans_bandwidth, h_trans_bandwidth
    Returns:
        complex_signal: array, shape is
            (2, n_epochs, n_channels, n_freq_bands, n_times)
    """
    assert data[0].shape[0] == data[1].shape[0], "Two data streams should have the same number of trials."
    data = np.array(data)

    # filtering and hilbert transform
    complex_signal = []
    for freq_band in freq_bands.values():
        if filter_signal: ######to decide whether to filter the signal   
            filtered = np.array([mne.filter.filter_data(data[participant],
                                                    sampling_rate, l_freq=freq_band[0], h_freq=freq_band[1],
                                                    **filter_options,
                                                    verbose=False)
                             for participant in range(2)
                             # for each participant
                             ])
        else: # in case just pass the signal as it is 
            filtered=np.array([data[partecipant] for partecipant in range(2))
        hilb = signal.hilbert(filtered)
        complex_signal.append(hilb)

    complex_signal = np.moveaxis(np.array(complex_signal), [0], [3])

    return complex_signal

This solution, to me, lets the user choose if or not to apply a filter without modifying the actual behavior of the code for retro compatibility with previous pipelines. I'm not sure about the else: assignment, if the data are handled in a proper way. I hope this would help, best regards.

deep-introspection commented 1 year ago

This looks great! Can you make a pull request please? This will help keep track of the authors of all the edits and give you credit on your contribution.