neuropsychology / NeuroKit

NeuroKit2: The Python Toolbox for Neurophysiological Signal Processing
https://neuropsychology.github.io/NeuroKit
MIT License
1.55k stars 413 forks source link

Interesting ideas and algorithms from the Bio-SP tool (matlab toolbox) by Nabian et al. (2018) #312

Closed DominiqueMakowski closed 3 years ago

DominiqueMakowski commented 4 years ago

Very interesting open-access paper and toolbox by Nabian et al. (2018), with a similar scope to NK (but for matlab). The paper is very well written and the algorithms and methods are clearly described. It contains some interesting features ideas and algorithms that we could maybe implement to facilitate cross-language comparisons.

If a given method is specific to this package (no prior publication), we can name it *_function_nabian2018() or _function_biosp(), and of course additionally reference and link this paper (and the original reference if applicable).

ECG

Step I: Since the human heart rate does not exceed 5 beats per second two adjacent QRS complexes cannot appear closer together than 200ms [14]. Therefore, a sliding window of 400 ms size with a step size of 1 sample can iteratively scan the entire signal. At each iteration, we find the global maximum in the sliding window and if this global maximum appears in the middle of the window, that global maximum is marked as a potential R point. By scanning the entire input ECG signal, the peaks which are likely to be R peaks are detected. The ith peak is denoted by Ri.

Step II: Ri peaks that are lower in amplitude than an amplitude threshold (Amp-Thr) are eliminated. Amp-Thr is initially set to 1/3 times the maximum amplitude of the first 2 seconds of the input ECG signal. To adapt to the changes in the ECG over time, upon passing sufficient R peaks from the beginning of the signal, Amp-Thr is updated to 0.75 times the mean of the amplitude of the last eight determined R peaks.

Step III: After scanning through all the detected adjacent R peak to R peak time intervals (R-R intervals), if any R-R interval is greater than a predefined threshold (RR-Thr), it implies that an R peak is missing. Thus, we assign a new R peak as the global maximum amplitude of the signal within the range of Ri+200ms and Ri+1-200ms [14]. According to [14], the rate of change for adjacent R-R intervals will not exceed 166%. Thus, RR-Thr is initially set to the previous R-R interval multiplied by 1.66. Upon detecting enough R-R intervals, the RR-Thr is then updated to reflect the average of the last eight R-R intervals multiplied by 1.66.

Step IV: Detection of P, Q, S, and T points are based on the locations of the detected R peaks. Qi is identified as the first closest local minimum before Ri and within the interval of Ri-70ms and Ri. Likewise, Siis identified as the first closest local minimum after Ri and within the interval of Ri and Ri+70ms. The peak of the P wave, Pi, is identified as the global maximum point found in the interval of Ri−120ms to Ri, and the peak of the T wave, Ti, is identified as the global maximum point found in the interval of Ri to Ri+300ms.

EDR is calculated based on the area of each normal QRS complex measured over a fixed window width w, which is determined by the interval from the PQ junction to the J(junction)-point which is the onset of repolarization phase on the ECG signal. We estimated w as two times the sum of the QR interval and the RS interval (w = 2×(QR+RS)).

QR to QS ratio and RS to QS ratio, features used in cardiology, are defined as the ratio of the QR interval to the QS interval, and the ratio of the RS interval to the QS interval for each R peak, respectively.

A rule-based algorithm proposed by Moody [58] was used in Bio-SP tool for ECG data quality checking. This algorithm achieved one of the highest accuracy scores in the Physionet/CinC Challenge [57]. The algorithm inspects segments of the signals based on the following three criteria: (1) If the signal is flat, it suggests that the electrodes are not well attached to the skin. Mathematically speaking, if the standard deviation of a segment of a signal is very close to zero, that segment is a flat signal. (2) The minimum, maximum, and the maximum-minimum difference values of the signal should be within an acceptable range. Values that are too small or too large also may result from improper electrode attachment or from large noise spikes in the signal [58]. (3) The ECG signal remains close to a zero voltage (i.e., the isoelectric line) during much of the overall signal except during the relatively short period between the P and T waves, when the electrical activity associated with the heartbeat occurs. Therefore, the duration between these larger voltage changes should be within a specific range for high-quality signals. Specifically, given a signal segment (with the size of larger than one heartbeat), the difference between the global maximum and global minimum of its sub-segments as a random variable makes a highly positively skewed distribution. We measured the skewness using the definition provided by Groeneveld and Meeden [60]. This algorithm detects segments of the signal that are not similar to the specific shape features of a prototypical ECG signal. If a segment of the signal fails to meet any of these criteria, that segment will be marked as poor quality. Because ECG, dZ/dt, and continuous BP signals contain similar frequencies and patterns related to cardiac activity, we tuned the threshold parameters used for rules (1)–(3) above for ECG, and applied similar quality checking algorithms to the dZ/dt and BP biosignals.

EDA

SCRs contained within the EDA signal can be detected by performing differentiation and subsequent convolution with a 20-point Bartlett window [20]. A Bartlett window is a triangle function represented as (3) where N is the window size (=20) and n is the nth signal sample in the window:

image

The occurrence of a SCR is detected by finding two consecutive zero-crossings, from negative to positive and positive to negative of the bartletted differentiated EDA signal (see Fig. 4). We considered negative to positive as the beginning and positive to negative as the end of each SCR. The amplitude of the SCR is obtained by finding the maximum value between these two zero-crossings, and calculating the difference between the initial zero crossing and the maximum value. Detected SCRs with amplitudes smaller than 10 percent of the maximum SCR amplitudes that are already detected on the differentiated signal will be eliminated.

Using rules documented in [60] and [61], we determined invalid or poor quality data segments in the EDA signal within Bio-SP tool as follows: (1) The minimum or maximum values of the EDA signal exceed a user-specified acceptable range. (2) Changes in the EDA signal occur at a higher frequency than is physiologically plausible (e.g., faster than ±10 µS/s), where µS is microSiemens. (3) EDA signal epochs that are within 5 s of invalid portions from rules (1) and (2) are also marked as invalid.

PPG/BP

The low-pass filter frequency for continuous BP signal preprocessing should be at least 10 times higher than the heart rate and less than half of the sampling rate [43]. Since a typical resting human heart rate varies between 40 and 100 beats per minute (bpm) [44], a low-pass filter with a cut-off frequency of 40 Hz is often used for removal of high frequency artifacts in BP signals [43].

An algorithm implemented by Laurin [21] for the feature extraction of arterial blood pressure is used in this work [to] detect the characteristic points in the BP signal including systolic BP, diastolic BP (Foot), the dicrotic notch and the dicrotic peak (see Fig. 5). The Foot, or lowest level of BP for a given heartbeat, is defined as the BP at the time of occurrence of the maximum point in the second derivative of the BP time-series for each heartbeat. The systolic point is defined as the maximum value of the BP waveform within the time range of [0s - 0.125s] after the Foot index [21]. The dicrotic notch, which occurs due to a disruption in the flow of blood as the aortic valve closes, is defined as the BP at the minimum of the subtraction of the signal and the straight line going from systole to Foot. Finally, the dicrotic peak is defined as the BP at the minimum of the second derivative of the time-series following the dicrotic notch, within a window of size RRm/5 s above it where RRm is the median heartbeat interval computed from the Foot points.

Mean diastolic pressure (Foot) is the mean of all the signal values (pressures) at the Foot indices for a given epoch. Mean systolic pressure is the mean of all the signal values (pressures) at the systolic indices for a given epoch. Mean dicrotic peak pressure is the mean of all the signal values (pressures) at the dicrotic peak for a given epoch. Mean dicrotic notch pressure is the mean of all the signal values (pressures) at the dicrotic notch for a given epoch. Mean arterial pressure (MAP) represents mean signal value (in pressure) for the entire input BP signal for a given epoch.

stale[bot] commented 3 years ago

This issue has been automatically marked as inactive because it has not had recent activity. It will eventually be closed if no further activity occurs.

stale[bot] commented 3 years ago

This issue has been inactive for a long time. We're closing it (but feel free to reopen it if need be).