cbrnr / sleepecg

Sleep stage detection using ECG
BSD 3-Clause "New" or "Revised" License

ECG peak detection gets stuck when data is very noisy #141

Open · raphaelvallat opened 1 year ago

raphaelvallat commented 1 year ago

Hi,

This is an issue I have encountered a few times already: when the ECG data is very noisy for part or most of the recording, the detect_heartbeats function gets stuck indefinitely. This is problematic when running peak detection across hundreds of participants, where visually checking each recording is not feasible.

To reproduce the error:

Download the EDF file with 8 hours ECG: https://drive.google.com/file/d/1ReYlFPAd3-dYk0C2WF7nDRQS3MxE2lyE/view?usp=share_link

import mne
from sleepecg import detect_heartbeats

# Load ECG
raw = mne.io.read_raw_edf("original_ecg.EDF", preload=True, verbose=False)
ecg = raw.get_data(units="uV")[0]
sf = raw.info["sfreq"]  # sampling frequency

# Peak detection
detect_heartbeats(ecg, sf) # gets stuck

In that case, the ECG data is so bad that honestly it would be better not to attempt the peak detection at all...!

Potential solutions:

  1. Modify the algorithm so that it cannot get stuck when no peak is found.
  2. Pre-screen the ECG signal quality: if the data is very noisy, skip the detection and return an empty array instead. For example, we could use some kind of ECG signal quality metric (see method="zhao2018" in https://neuropsychology.github.io/NeuroKit/functions/ecg.html#ecg-quality); a sketch of this idea follows below.
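
A minimal sketch of what such pre-screening could look like, assuming neurokit2 is installed; the wrapper name detect_heartbeats_safe, the "Unacceptable" cutoff, and returning an empty array are illustrative choices, not part of sleepecg's API:

import numpy as np
import neurokit2 as nk
from sleepecg import detect_heartbeats

def detect_heartbeats_safe(ecg, sf):
    # "zhao2018" rates the whole recording as "Excellent",
    # "Barely acceptable", or "Unacceptable"
    quality = nk.ecg_quality(ecg, sampling_rate=int(sf), method="zhao2018")
    if quality == "Unacceptable":
        return np.array([], dtype=int)  # data too noisy, skip detection
    return detect_heartbeats(ecg, sf)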

Thanks, Raphael

cbrnr commented 1 year ago

Hi @raphaelvallat! This is probably the biggest issue with our R peak detector, and I actually suggest that we implement both solutions you mention.

Re 1, I assume that the algorithm keeps lowering the thresholds when there are no peaks in the data. I thought we limited this to a fixed number of iterations, but maybe I'm wrong, or there is a problem in our implementation. Can you pinpoint where the detector gets stuck when it tries to find peaks in such bad data?

I am also not sure how easy it is to just skip a detection; we'd have to give it a try. Naively, I'd stop searching for peaks after a certain number of tries (i.e., a certain number of times the thresholds were lowered) and move on to the next segment (see the sketch below). But since segments are defined by peaks, this might require some changes.
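
A minimal sketch of that bounded-retry idea; this is not sleepecg's actual detector, and find_peaks_bounded, MAX_TRIES, and the halving factor are all illustrative:

import numpy as np

MAX_TRIES = 8  # illustrative cap on how often the threshold may be lowered

def find_peaks_bounded(segment, threshold):
    # lower the threshold at most MAX_TRIES times, then give up and
    # treat the segment as peak-free instead of looping forever
    for _ in range(MAX_TRIES):
        peaks = np.flatnonzero(segment > threshold)
        if peaks.size > 0:
            return peaks
        threshold *= 0.5  # halve the threshold and retry
    return np.array([], dtype=int)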

Re 2, does this method assess signal quality continuously, or do you get one score for the entire signal? We could think about adding a mask parameter that tells the algorithm which segments are valid and which should be ignored (sketched below). Again, this would require some adaptations to make the detector work with this kind of data.
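
A minimal sketch of how such a mask could be applied outside the detector; the mask parameter is hypothetical (it does not exist in sleepecg), and the wrapper simply runs detect_heartbeats on each contiguous valid run and offsets the indices back into the full signal:

import numpy as np
from sleepecg import detect_heartbeats

def detect_heartbeats_masked(ecg, sf, mask):
    # mask: boolean array, True where the signal is considered valid
    peaks = []
    # start/stop indices of each contiguous run of True values
    edges = np.flatnonzero(np.diff(np.r_[0, mask.astype(int), 0]))
    for start, stop in zip(edges[::2], edges[1::2]):
        peaks.append(detect_heartbeats(ecg[start:stop], sf) + start)
    return np.concatenate(peaks) if peaks else np.array([], dtype=int)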

raphaelvallat commented 1 year ago

@cbrnr agree re: implementing both methods 👍

does this method assess signal quality continuously?

No, method="zhao2018" produces a single signal quality estimate for the entire recording. By contrast, the default approach in NeuroKit2 (method="averageQRS") returns one value per sample.
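
An illustrative comparison of the two return types, assuming neurokit2 and its simulated-ECG helper behave as documented:

import neurokit2 as nk

ecg = nk.ecg_simulate(duration=30, sampling_rate=250)

# one label for the whole signal ("Excellent", "Barely acceptable", or "Unacceptable")
print(nk.ecg_quality(ecg, sampling_rate=250, method="zhao2018"))

# one continuous quality value per sample
scores = nk.ecg_quality(ecg, sampling_rate=250, method="averageQRS")
print(len(scores))  # same length as ecg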

Can you specify where the detector gets stuck when it tries to find peaks in such bad data?

I'll clone the repo and run some tests when I have some time. I actually don't know how the algorithm works in detail, so it will be good to do a deep dive (into the pure Python implementation at least).