neuropsychology / NeuroKit

NeuroKit2: The Python Toolbox for Neurophysiological Signal Processing
https://neuropsychology.github.io/NeuroKit
MIT License

New rsp_plot() parameter for desired segment of signal. #315

Closed Mitchellb16 closed 3 years ago

Mitchellb16 commented 4 years ago

As is, rsp_plot(), and perhaps the other plotting functions, will only plot the entire signal. This is fine for relatively short signals, but I think it would also be nice to give users the option to pick out a section of the signal that is of interest to them to plot.

For example: if I had a 1-minute recording with a 1000 Hz sampling rate but was interested in just taking a quick look at what is happening around 30-35 seconds, I might try

nk.rsp_plot(signal.loc[30000:35000], sampling_rate = 1000)

but get an error similar to this in response: KeyError: "None of [Int64Index([5096, 18985, 40736, 58053, 65539, 73859, 90226], dtype='int64')] are in the [index]"

Which I believe occurs because ax.scatter is looking for peaks that are not provided in the segmented part of the signal.

How could we do it? I believe we could make a change to this line: x_axis = np.linspace(0, len(rsp_signals) / sampling_rate, len(rsp_signals))

Which I believe always assumes we want to start at the beginning of the signal. Perhaps we could add a parameter that could change the "0" to the first element in a tuple and len(rsp_signals) to the last? There may be a more elegant solution but I think this would be a helpful feature.
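A minimal sketch of the idea, not the actual rsp_plot() code: the helper name `window_time_axis` and the `window=(start_s, end_s)` tuple are hypothetical, introduced only to illustrate how the x-axis could start somewhere other than 0.

```python
import numpy as np


def window_time_axis(n_samples, sampling_rate, window=None):
    """Return (start_idx, stop_idx, x_axis) for a window given in seconds.

    `window` is a hypothetical (start_s, end_s) tuple; None means the whole signal.
    """
    if window is None:
        start_idx, stop_idx = 0, n_samples
    else:
        # Convert seconds to sample indices and clip to the signal length.
        start_idx = max(0, int(round(window[0] * sampling_rate)))
        stop_idx = min(n_samples, int(round(window[1] * sampling_rate)))
    # One time stamp (in seconds) per sample inside the window.
    x_axis = np.arange(start_idx, stop_idx) / sampling_rate
    return start_idx, stop_idx, x_axis


# 1-minute recording at 1000 Hz, looking at 30-35 s.
start_idx, stop_idx, x_axis = window_time_axis(60_000, 1000, window=(30, 35))
```

The returned sample bounds could then be used to slice the signal positionally, while `x_axis` keeps the original time stamps for the plot.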

DominiqueMakowski commented 4 years ago

Totally agree, I myself stumbled across this issue several times 😬

I think the fix should be made in all the main *_plot() functions (eda, ecg, rsp, emg...) as they all use the same kind of code.

Would you want to open a PR with this?

Mitchellb16 commented 4 years ago

Certainly! I'll start with just rsp_plot() as that's what I'm most familiar with.

Do you think going with the times as a parameter is the way to go? Or making changes that would allow .loc[] to work?

JanCBrammer commented 4 years ago

import neurokit2 as nk

rsp = nk.rsp_simulate(duration=90, respiratory_rate=15)
rsp_signals, info = nk.rsp_process(rsp, sampling_rate=1000)
fig = nk.rsp_plot(rsp_signals.loc[10:10000])

Works for me. @Mitchellb16, @DominiqueMakowski, could you try this on your machines?

JanCBrammer commented 4 years ago

Ok, when doing

fig = nk.rsp_plot(rsp_signals.loc[80000:89000])

I also get

KeyError: "None of [Int64Index([2013, 5610], dtype='int64')] are in the [index]"

JanCBrammer commented 4 years ago

The solution seems to be to call reset_index() on the DataFrame after slicing it:

fig = nk.rsp_plot(rsp_signals.loc[80000:89000].reset_index())

The problem doesn't seem to be the x_axis https://github.com/neuropsychology/NeuroKit/blob/cb37d83ee20d6a13a91c4848aa435f41e979e203/neurokit2/rsp/rsp_plot.py#L49-L54

but the fact that numpy.where() resets the index

https://github.com/neuropsychology/NeuroKit/blob/cb37d83ee20d6a13a91c4848aa435f41e979e203/neurokit2/rsp/rsp_plot.py#L37-L41

which breaks the indexing of the DataFrame https://github.com/neuropsychology/NeuroKit/blob/cb37d83ee20d6a13a91c4848aa435f41e979e203/neurokit2/rsp/rsp_plot.py#L65-L68
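The mechanism can be reproduced without NeuroKit. The snippet below is only a toy illustration: the DataFrame is a stand-in, and the column names `RSP_Clean` and `RSP_Peaks` are assumed to mirror what rsp_process() returns.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the processed DataFrame: a signal column and a 0/1 peak marker.
df = pd.DataFrame({"RSP_Clean": np.sin(np.linspace(0, 20, 100)), "RSP_Peaks": 0})
df.loc[[15, 45, 85], "RSP_Peaks"] = 1

segment = df.loc[60:99]  # label-based slice: the index still runs from 60 to 99

# np.where returns 0-based positions within the segment, not index labels.
peak_positions = np.where(segment["RSP_Peaks"] == 1)[0]

try:
    segment["RSP_Clean"].loc[peak_positions]  # label lookup: 25 is not a label here
    raised = False
except KeyError:
    raised = True

# Either reset the index so that labels and positions coincide again ...
by_label = segment.reset_index(drop=True)["RSP_Clean"].loc[peak_positions]
# ... or keep the original index and look the values up by position.
by_position = segment["RSP_Clean"].iloc[peak_positions]
```

The first slice in this thread (`.loc[10:10000]`) presumably worked only because the positions found there happened to also exist as labels in that slice.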

JanCBrammer commented 4 years ago

So this seems like a matter of documentation rather than something that requires changes to the code?

DominiqueMakowski commented 4 years ago

fig = nk.rsp_plot(rsp_signals.loc[80000:89000])

Yeah, this workaround works, though it would be nice if this edge case could be dealt with internally, mainly to smooth the user experience and more importantly to slice the x-axis accordingly.

Let's say you have 20 s of signal and you want to display the 10-15 s range: if we reset the index, the plot shows an x-axis from 0 to 5, whereas it'd be nicer if it would preserve the original axis and display 10-15.

Without having looked at the code, we could first copy the index of the data when it's passed (to use later as the x-axis) and then reset it so that the rest goes smoothly.
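A rough sketch of that idea, assuming the incoming index still holds the original sample numbers (the default after rsp_process()). The helper name `split_index_for_plot` is hypothetical and not part of NeuroKit.

```python
import numpy as np
import pandas as pd


def split_index_for_plot(signals, sampling_rate=None):
    """Hypothetical helper: keep the original index as the x-axis, then reset it."""
    original_index = signals.index.to_numpy()
    if sampling_rate is not None:
        x_axis = original_index / sampling_rate  # seconds
    else:
        x_axis = original_index  # samples
    # A fresh 0..n-1 index lets positional peak lookups work again.
    return x_axis, signals.reset_index(drop=True)


# 20 s of (toy) signal at 1000 Hz, displaying the 10-15 s range.
signals = pd.DataFrame({"RSP_Clean": np.zeros(20_000)})
x_axis, segment = split_index_for_plot(signals.loc[10_000:14_999], sampling_rate=1000)
```

Here `x_axis` runs from 10 s to just under 15 s while `segment` is indexed from 0, so the plotting code could use the former for the axis and the latter for everything else.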

JanCBrammer commented 4 years ago

it'd be nicer if it would preserve the original axis and display 10-15

Agreed.

we could first copy the index of the data when it's passed (to use later as the x-axis) and then reset it

That seems like a good way to go.

Although then the user loses explicit control (what if they already modified the indices? etc.)

DominiqueMakowski commented 4 years ago

Although then the user loses explicit control (what if they already modified the indices? etc.)

If that creates some edge cases we can see if we can accommodate them, but I think in the majority of cases it will improve user-experience rather than decrease it. And users that will start doing some fancy manipulation themselves are usually able to plot what they want to plot without the need for our cute functions 😁

JanCBrammer commented 4 years ago

in the majority of cases it will improve user-experience rather than decrease it. And users that will start doing some fancy manipulation themselves are usually able to plot what they want to plot without the need for our cute functions 😁

Fair enough.

stale[bot] commented 3 years ago

This issue has been automatically marked as inactive because it has not had recent activity. It will eventually be closed if no further activity occurs.

stale[bot] commented 3 years ago

This issue has been inactive for a long time. We're closing it (but feel free to reopen it if need be).

CarinaFo commented 2 months ago

Hello everyone,

I still run into the mentioned issue:

"None of [Index([398, 1611, 3419, 6787], dtype='int64')] are in the [index]"

    # Clean the respiration signal
    df_resp, _ = nk.rsp_process(raw_resp.get_data().squeeze(), sampling_rate=raw_resp.info['sfreq'])

    # Randomly select 8000 rows from the DataFrame
    sampled_data = df_resp.sample(n=8000)

    nk.rsp_plot(sampled_data, sampling_rate=500)

I get the same error using ecg_plot()

Only visualizing parts of the signal is crucial for me as I have a 50-minute respiration and ECG recording.

I am not sure if this is related, but I also get an error about the sampling_rate keyword, which, going by the docs, looks like a bug:

    TypeError                                 Traceback (most recent call last)
    File z:\expecophysio\code\Carina\respiration_analysis.py:1
    ----> 1 nk.rsp_plot(sampled_data, sampling_rate=500)

    TypeError: rsp_plot() got an unexpected keyword argument 'sampling_rate'

Best,

Carina

Mitchellb16 commented 2 months ago

Hi Carina, I believe the nk.rsp_plot() function is made for plotting continuous timeseries data. When you try to plot the randomly sampled data, the function generates continuous indexes to use as the x axis values (either time or index depending on whether you supply sampling_rate).

If you want to look at a chunk of your signal, I'd suggest the following:

    # Clean the respiration signal
    df_resp, _ = nk.rsp_process(raw_resp.get_data().squeeze(), sampling_rate=raw_resp.info['sfreq'])

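    # Get a chunk of signal from 20 s to 30 s
    # (sfreq is a float, so cast the bounds to integer sample indices)
    sampling_rate = raw_resp.info['sfreq']
    start_sample = int(sampling_rate * 20)
    end_sample = int(sampling_rate * 30)
    # drop=True discards the old index instead of adding it as a column
    signal_chunk = df_resp.loc[start_sample:end_sample, :].reset_index(drop=True)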

    # plot
    nk.rsp_plot(signal_chunk)
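If the goal is a random look at the recording rather than a fixed time range, one option is to pick a random contiguous window instead of randomly sampled rows. The sketch below uses a toy DataFrame as a stand-in for `df_resp`; the 8000-sample window length and the 500 Hz rate are taken from the example above, and everything else is an assumption for illustration.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Toy stand-in for df_resp: 50 minutes at 500 Hz.
df_resp = pd.DataFrame({"RSP_Clean": np.zeros(50 * 60 * 500)})

n_window = 8000  # number of consecutive samples to keep
# Random start position such that the whole window fits inside the recording.
start = int(rng.integers(0, len(df_resp) - n_window + 1))

# Positional slice of consecutive rows, re-indexed from 0 for plotting.
window = df_resp.iloc[start:start + n_window].reset_index(drop=True)
```

Because the rows stay consecutive, `window` remains a continuous time series, which is what the plotting function expects.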