neuropsychology / NeuroKit

NeuroKit2: The Python Toolbox for Neurophysiological Signal Processing
https://neuropsychology.github.io/NeuroKit
MIT License

`ecg_delineate` with `peak` method fails with `ValueError: cannot convert float NaN to integer` #532

Closed timvlaer closed 2 years ago

timvlaer commented 3 years ago

Describe the bug

See the attached dataset: ecg_delineate fails with a ValueError. I don't know exactly what is going on, but two things seem crucial to trigger the bug:

See sample code below

/usr/local/lib/python3.9/site-packages/neurokit2/ecg/ecg_delineate.py in ecg_delineate(ecg_cleaned, rpeaks, sampling_rate, method, show, show_type, check)
    118     method = method.lower()  # remove capitalised letters
    119     if method in ["peak", "peaks", "derivative", "gradient"]:
--> 120         waves = _ecg_delineator_peak(ecg_cleaned, rpeaks=rpeaks, sampling_rate=sampling_rate)
    121     elif method in ["cwt", "continuous wavelet transform"]:
    122         waves = _ecg_delineator_cwt(ecg_cleaned, rpeaks=rpeaks, sampling_rate=sampling_rate)

/usr/local/lib/python3.9/site-packages/neurokit2/ecg/ecg_delineate.py in _ecg_delineator_peak(ecg, rpeaks, sampling_rate)
    685     except ImportError:
    686         raise ImportError(
--> 687             "NeuroKit error: ecg_delineator(): the 'PyWavelets' module is required for this method to run. ",
    688             "Please install it first (`pip install PyWavelets`).",
    689         )

/usr/local/lib/python3.9/site-packages/neurokit2/ecg/ecg_segment.py in ecg_segment(ecg_cleaned, rpeaks, sampling_rate, show)
     56         rpeaks=rpeaks, sampling_rate=sampling_rate, desired_length=len(ecg_cleaned)
     57     )
---> 58     heartbeats = epochs_create(
     59         ecg_cleaned, rpeaks, sampling_rate=sampling_rate, epochs_start=epochs_start, epochs_end=epochs_end
     60     )

/usr/local/lib/python3.9/site-packages/neurokit2/epochs/epochs_create.py in epochs_create(data, events, sampling_rate, epochs_start, epochs_end, event_labels, event_conditions, baseline_correction)
    121     # Find the maximum numbers of samples in an epoch
    122     parameters["duration"] = list(np.array(parameters["end"]) - np.array(parameters["start"]))
--> 123     epoch_max_duration = int(max((i * sampling_rate for i in parameters["duration"])))
    124 
    125     # Extend data by the max samples in epochs * NaN (to prevent non-complete data)

ValueError: cannot convert float NaN to integer
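The last frame of the traceback comes down to calling `int()` on a NaN. A minimal sketch of that failure, independent of NeuroKit and using made-up NaN durations in place of the real `parameters["duration"]` values:

```python
# Hypothetical epoch durations: NaN stands in for what the failing call received.
durations = [float("nan"), float("nan")]
sampling_rate = 250

try:
    # Same expression shape as line 123 of epochs_create.py in the traceback.
    epoch_max_duration = int(max(d * sampling_rate for d in durations))
except ValueError as err:
    message = str(err)

print(message)
```

This only reproduces the final conversion error. It says nothing about why the durations were NaN in the first place.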

To Reproduce

import neurokit2 as nk
import pandas as pd

# Load the attached recording (single column named 'channel')
ecg = pd.read_csv('sample_ecg.csv')['channel'].values
clean_ecg = nk.ecg_clean(ecg, sampling_rate=250, method="engzeemod2012")
rp, rpeaks = nk.ecg_peaks(clean_ecg, sampling_rate=250)
# This call raises the ValueError shown above
wv, waves_peak = nk.ecg_delineate(clean_ecg, rpeaks["ECG_R_Peaks"], sampling_rate=250, show=True, method="peak")

Expected behaviour

I expected an array of detected waves, but the call fails with a ValueError instead.

System Specifications

It's important that you give us some information about the system you are using. For that you can run:

>>> nk.version()
- OS: Darwin ( 64bit) 
- Python: 3.9.6 
- NeuroKit2: 0.1.2 

- NumPy: 1.19.5 
- Pandas: 1.2.3 
- SciPy: 1.6.1 
- sklearn: 0.24.2 
- matplotlib: 3.4.2

sample_ecg.csv

DominiqueMakowski commented 3 years ago

Hi, can you update neurokit to the latest version (1.4.1) and try again? Thanks

Tam-Pham commented 3 years ago

Hi @timvlaer

I was able to replicate your errors with the latest version of NeuroKit. However, a particular problem that I noticed is in the ECG signal you provided.

import matplotlib.pyplot as plt
import neurokit2 as nk
import pandas as pd

ecg = pd.read_csv('sample_ecg.csv')['channel'].values
clean_ecg_engz = nk.ecg_clean(ecg, sampling_rate=250, method="engzeemod2012")
clean_ecg_default = nk.ecg_clean(ecg, sampling_rate=250)

# Overlay the raw signal and both cleaned versions for comparison
fig = plt.figure()
plt.plot(ecg, label='raw')
plt.plot(clean_ecg_engz, label='engzeemod2012 cleaning')
plt.plot(clean_ecg_default, label='neurokit cleaning')
plt.legend()

[Figure: raw ECG overlaid with the engzeemod2012-cleaned and neurokit-cleaned signals]

The ECG signal here is simply too noisy to be cleaned as you can see in the outputs of both engzeemod2012 and neurokit methods.

clean_ecg_engz = nk.ecg_clean(ecg, sampling_rate=250, method="engzeemod2012")
rp, rpeaks_engz = nk.ecg_peaks(clean_ecg_engz, sampling_rate=250)

Note that passing rpeaks_engz makes every delineation method fail, because the problem lies upstream. Only 3 R-peaks were detected, which is too few to estimate a reliable heart rate, so nk.ecg_segment() returns NaNs. Given the state of the signal, the detected R-peaks may not be reliable at all.
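Since the failure starts with too few R-peaks, one option is to check the peak count before calling any delineation method. A minimal sketch follows. The helper `enough_rpeaks` and its `min_peaks=5` threshold are assumptions made for illustration and are not part of NeuroKit.

```python
import math


def enough_rpeaks(rpeak_indices, min_peaks=5):
    """Hypothetical pre-check: True if at least `min_peaks` non-NaN R-peaks exist.

    The default threshold is an arbitrary illustrative value, not a NeuroKit setting.
    """
    valid = [p for p in rpeak_indices if not math.isnan(float(p))]
    return len(valid) >= min_peaks


# Three peaks, as in the engzeemod2012 case above (sample indices are made up).
print(enough_rpeaks([100, 900, 2100]))
```

In the snippet above, this would be called as `enough_rpeaks(rpeaks_engz["ECG_R_Peaks"])` before `nk.ecg_delineate`, skipping delineation when it returns False.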

Even though using the neurokit cleaning method might not cause an error in the delineation, I don't think the output can be used reliably.

clean_ecg_default = nk.ecg_clean(ecg, sampling_rate=250)
rp, rpeaks_default = nk.ecg_peaks(clean_ecg_default, sampling_rate=250)
wv, waves_peak = nk.ecg_delineate(clean_ecg_default, rpeaks_default["ECG_R_Peaks"], sampling_rate=250, show=True, method="peak")

[Figure: delineation plot produced by nk.ecg_delineate with show=True on the neurokit-cleaned signal]

timvlaer commented 3 years ago

Hi @Tam-Pham, thanks for reviewing. I share your thoughts: the signal is indeed complete rubbish. I was surprised to see the code crash on this particular example, since I expected an empty result in this case (no waves at all). Does that reasoning make sense?

I'm reporting this bug to make the internals of the library robust against these weird cases. I'm fine with closing this ticket as 'won't fix' as this is a corner case.

I will definitely check the signals before trying to find peaks with Neurokit.

eliasalzaghrini commented 2 years ago

Hello, the same issue is happening to me.
