paulvangentcom / heartrate_analysis_python

Python Heart Rate Analysis Package, for both PPG and ECG signals
MIT License

Results are depending on DC component of the signal #80

Open meierman1 opened 2 years ago

meierman1 commented 2 years ago

Issue: The results of process(), and more specifically of fit_peaks(), depend on the DC component of the signal, which, IMO, does not make sense. To reproduce, take any data for which the analysis works well and add an offset to the signal. Once the mean of the signal exceeds roughly 15 to 30 times the std of the signal, the analysis fails. More importantly, much smaller changes in offset already affect the quality of the results.
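To make the reproduction concrete, here is a minimal, self-contained sketch. It does not call HeartPy: `count_peaks` is a hypothetical toy detector that assumes the threshold is the moving average raised by `ma_perc` percent of the signal mean, and the sine wave is only a synthetic stand-in for real data.

```python
import math
import statistics


def count_peaks(signal, ma_perc, window=100):
    """Toy detector (hypothetical, not HeartPy's code).

    Counts contiguous regions where the signal exceeds its moving average
    raised by ma_perc percent of the overall signal mean.
    """
    raise_by = statistics.fmean(signal) * ma_perc / 100
    half = window // 2
    regions = 0
    above = False
    for i, x in enumerate(signal):
        # Centered moving average, truncated at the signal edges.
        lo, hi = max(0, i - half), min(len(signal), i + half + 1)
        threshold = statistics.fmean(signal[lo:hi]) + raise_by
        is_above = x > threshold
        if is_above and not above:
            regions += 1
        above = is_above
    return regions


# Synthetic stand-in for a heart rate signal: unit-amplitude sine, 10 cycles.
base = [math.sin(2 * math.pi * t / 100) for t in range(1000)]

for dc in (1.0, 100.0):
    shifted = [x + dc for x in base]
    print(f"offset={dc:6.1f}  regions above threshold at 5%: {count_peaks(shifted, 5)}")
```

With the offset of 100, the smallest 5 percent candidate already raises the threshold by about 5, which is more than the whole peak-to-peak amplitude of 2, so the toy detector finds nothing.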

Cause (as far as I can tell): At least in part, this is caused by detect_peaks(), where different moving averages are tested. The candidates are a percentage of the absolute moving average, which means the tested values are coarser for high-mean signals.
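The coarseness can be put into numbers with a small sketch. It assumes, as described above, that each candidate is offset by a percentage of the signal mean; `first_step_in_std_units` is a hypothetical helper, and the first percentage of 5 is an illustrative value rather than one confirmed from the source code.

```python
import math
import statistics


def first_step_in_std_units(signal, first_perc=5):
    """Size of the smallest mean-based offset, expressed in units of std(signal).

    Hypothetical helper for illustration; first_perc is an assumed value.
    """
    mean = statistics.fmean(signal)
    std = statistics.pstdev(signal)
    return (mean * first_perc / 100) / std


# Synthetic stand-in signal: unit-amplitude sine, 10 full cycles (mean ~ 0).
base = [math.sin(2 * math.pi * t / 100) for t in range(1000)]

for dc in (1.0, 10.0, 100.0):
    shifted = [x + dc for x in base]
    step = first_step_in_std_units(shifted)
    print(f"offset={dc:6.1f}  smallest candidate step = {step:.3f} std")
```

The step grows linearly with the offset while the std stays the same. With a first candidate of 5 percent, a mean of 20 times the std already makes the first step one full std.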

Proposed fix: I propose keeping the signal mean as the base and adding the same offset percentages as before, but computed from 3*std(signal) instead of from the signal mean itself. Alternatively, the signal mean could always be set to a fixed value, as is already done for signals with negative baselines (line 280 in heartpy.py).

I should add that I noticed something odd while testing all of this: when I use a zero-mean signal (which is internally offset to a zero baseline, as mentioned above), I get different results than when I take the very same signal and add the same offset, abs(np.percentile(hrdata, 0.1)), before calling the process function. I have not had time to find the root cause.

Edit: I did some more testing. I would recommend applying the offset percentages to just std(hrdata). 3*std(hrdata) does not make sense, as peaks are usually well within plus/minus 2*std(hrdata) and we test up to 300 percent anyway. In fact, I even improved results for my data by adding the following lower/negative percentages to ma_perc_list: [-5, -3, -2, -1, -0.5, 0, 0.5, 1, 2, ...]. At least in my tests, the best percentage was never over 15 and usually within plus/minus 1 percent.
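A minimal sketch of the std-based idea, under stated assumptions: `std_based_offsets` and `mean_based_offsets` are hypothetical helpers, not functions from the package, and the candidate list contains only the values quoted above (the original list continues beyond them).

```python
import math
import statistics

# Only the candidates quoted in this comment; the full list is longer.
MA_PERC_CANDIDATES = [-5, -3, -2, -1, -0.5, 0, 0.5, 1, 2]


def std_based_offsets(signal, percentages):
    """Offsets as a percentage of std(signal): unaffected by a DC shift."""
    std = statistics.pstdev(signal)
    return [std * p / 100 for p in percentages]


def mean_based_offsets(signal, percentages):
    """Offsets as a percentage of mean(signal): scale with the DC component."""
    mean = statistics.fmean(signal)
    return [mean * p / 100 for p in percentages]


# Synthetic stand-in signal, once with a small and once with a large offset.
base = [math.sin(2 * math.pi * t / 100) for t in range(1000)]
low = [x + 1.0 for x in base]
high = [x + 100.0 for x in base]

print("std-based, last candidate: ",
      std_based_offsets(low, MA_PERC_CANDIDATES)[-1],
      std_based_offsets(high, MA_PERC_CANDIDATES)[-1])
print("mean-based, last candidate:",
      mean_based_offsets(low, MA_PERC_CANDIDATES)[-1],
      mean_based_offsets(high, MA_PERC_CANDIDATES)[-1])
```

The std-based offsets agree for both signals up to floating-point error, whereas the mean-based ones differ by the ratio of the two means.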

paulvangentcom commented 2 years ago

Thanks for the detective work and effort. I'll go through the PR sometime today or tomorrow and merge if it makes sense!