paulvangentcom / heartrate_analysis_python

Python Heart Rate Analysis Package, for both PPG and ECG signals
MIT License
964 stars 323 forks

dfitpack error from process_segmentwise #33

Closed sedurCode closed 5 years ago

sedurCode commented 5 years ago

When using heartpy v1.2.4 with Python v3.7.4 on 64-bit Windows to analyse heart rate from PPG data with the hp.process_segmentwise method, I get the following error from the scipy interpolate library:

dfitpack.error: (m>k) failed for hidden m: fpcurf0:m=3

Following a debug trace, the error occurs in calc_fd_measures in analysis.py, line 426:

interpolated_func = UnivariateSpline(rr_x, rr_list, k=3)

which in turn makes the scipy call where the error originates:

data = dfitpack.fpcurf0(x, y, k, w=w, xb=bbox[0], xe=bbox[1], s=s)

The error might have the same cause as the one discussed in the following thread: https://stackoverflow.com/questions/32230362/python-interpolate-univariatespline-package-error-mk-failed-for-hidden-m

I have checked that the length of x is 3, the length of y is 3, and k = 3; w, bbox[0], bbox[1] and s are all None.

The value of x is: [image]

The value of rr_list is: [image]

Reducing the order of the spline to 2 returns successfully, but order 3 fails in this case. According to the linked Stack Overflow thread, the number of data points needs to be at least one greater than k, so it would appear that the mechanism which creates RR_list_cor in the working_data dict yields too few points and causes the error later in the chain.
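The `(m>k)` condition in the error message says that the number of data points `m` must exceed the spline order `k`. As a minimal sketch of that rule, a caller could pick the highest order the data supports before building the spline. The helper name `feasible_spline_order` is hypothetical and is not part of heartpy or scipy:

```python
def feasible_spline_order(n_points, requested_k=3):
    """Return the largest order k <= requested_k with n_points > k, or None.

    Hypothetical helper: it only mirrors the m > k check reported by
    dfitpack and does not call scipy itself.
    """
    k = min(requested_k, n_points - 1)
    # A spline needs an order of at least 1, so fewer than 2 points cannot be fitted.
    return k if k >= 1 else None


print(feasible_spline_order(3))   # 2: three points cannot support k=3
print(feasible_spline_order(10))  # 3: enough points for the requested cubic
print(feasible_spline_order(1))   # None: nothing to interpolate
```

This matches the observation above that order 2 succeeds on three points while order 3 fails.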

paulvangentcom commented 5 years ago

Strange, can you provide me the data segment that reproduces this error?

sedurCode commented 5 years ago

How would you like me to give you the data?

paulvangentcom commented 5 years ago

You can upload it here, or if it is privacy sensitive you can send it to me at p.vangent@tudelft.nl

thanks!

sedurCode commented 5 years ago

I will prepare some data and get it to you by midday.

sedurCode commented 5 years ago

I also get the same or another error in the same function call, if I reduce the segment time length to something like 5 or 10.

segment_width=20: dfitpack.error: (m>k) failed for hidden m: fpcurf0:m=3

segment_width=10: dfitpack.error: (m>k) failed for hidden m: fpcurf0:m=3

segment_width=5:

  File ".../main.py", line 12, in <module>
    working_data, measures = hp.process_segmentwise(matlabdata, sample_rate=fs, segment_width=5, segment_overlap = 0.25, calc_freq=True, reject_segmentwise=True, report_time=True)
  File "..../heartpy/heartpy.py", line 375, in process_segmentwise
    working_data, measures = process(hrdata[i:ii], sample_rate, **kwargs)
  File "...../heartpy/heartpy.py", line 257, in process
    working_data = working_data)
  File "..../heartpy/analysis.py", line 425, in calc_fd_measures
    rr_x_new = np.linspace(rr_x[0], rr_x[-1], rr_x[-1])
IndexError: list index out of range

In this case it appears that rr_list is empty

Could it be that you just need to add some error handling in the calc_fd_measures function that tests the shape of rr_x and rr_list and then gracefully skips frames that are the wrong size? Maybe also wrap the linspace and UnivariateSpline calls in try/except, so that you can step over data that is too poor for further analysis?

It also seems strange that I get a complete return from calling working_data, measures = hp.process(matlabdata, fs) on this data, but not on working_data, measures = hp.process_segmentwise(matlabdata, sample_rate=fs, segment_width=5, segment_overlap = 0.25, calc_freq=True, reject_segmentwise=True, report_time=True)

paulvangentcom commented 5 years ago

Seems I need to raise an explicit error there. Indeed if the window is too short there's always the risk of no peaks being detected, at which point frequency computations fail.

For computing frequency spectra and HRV it is recommended to have signals of a few minutes each minimum so the thought hadn't crossed my mind. I'll put it on the list.

paulvangentcom commented 5 years ago

I'll probably have time to address this Sunday, very maybe tonight.

Cheers

sedurCode commented 5 years ago

> Seems I need to raise an explicit error there. Indeed if the window is too short there's always the risk of no peaks being detected, at which point frequency computations fail.
>
> For computing frequency spectra and HRV it is recommended to have signals of a few minutes each minimum so the thought hadn't crossed my mind. I'll put it on the list.

The data I am working with at the moment is roughly half an hour long, but ideally I need quite fine temporal resolution to analyse short-term responses. Given the inconsistencies in output from PPG signals, it seems heavy-handed to simply raise an error when the algorithm doesn't have data of good enough quality to continue for any one frame. If the sensor fails for a few seconds and then starts to give good data again, it would be nice to get the good results out and return something like -1 for a frame when its input data is not good enough for analysis. But it is your tool, so it is also your choice.
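The behaviour asked for here can be sketched as a loop that keeps the results of good frames and stores a sentinel for frames that fail. This is only an illustration: `analyse_frames` and `mean_interval` are hypothetical names, and the broad `except Exception` is an assumption made for brevity, not how heartpy handles it.

```python
import math


def analyse_frames(frames, analyse, fill=math.nan):
    """Apply `analyse` to each frame, storing `fill` where analysis fails."""
    results = []
    for frame in frames:
        try:
            results.append(analyse(frame))
        except Exception:
            # Deliberately broad for this sketch; real code should catch
            # only the specific errors it expects from bad frames.
            results.append(fill)
    return results


def mean_interval(rr_list):
    """Stand-in analysis step: fails with ZeroDivisionError on an empty frame."""
    return sum(rr_list) / len(rr_list)


frames = [[800, 810, 790], [], [805, 795]]  # the middle frame has no intervals
results = analyse_frames(frames, mean_interval)
print(results)  # [800.0, nan, 800.0]
```

Using `math.nan` rather than -1 as the sentinel avoids confusing a missing value with a real measurement, at the cost of needing `math.isnan` to test for it.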

paulvangentcom commented 5 years ago

You're right, I'll have it output something (-1, np.nan, let's see) wherever things are not available.

Regarding your data: you might consider a larger window size and larger overlap factor if you need fine temporal resolution. HRV measures become meaningless if you have too little data to work with. With short window sizes they're mostly capturing breathing related variance, rather than other things you might be interested in.
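To make the trade-off concrete, here is a rough sketch of how window width and overlap set the time step between successive results. It assumes the step equals `width * (1 - overlap)`, which is a common convention and an assumption here rather than a statement about heartpy's internals. `segment_starts` is a hypothetical helper.

```python
def segment_starts(total_s, width_s, overlap):
    """Start times (in seconds) of all full windows that fit in the recording.

    Assumes the step between windows is width_s * (1 - overlap).
    """
    if not 0 <= overlap < 1:
        raise ValueError("overlap must be in [0, 1)")
    if width_s > total_s:
        return []
    step = width_s * (1 - overlap)
    n_windows = int((total_s - width_s) // step) + 1
    return [i * step for i in range(n_windows)]


# Half-hour recording: short windows versus longer, heavily overlapped windows.
short = segment_starts(1800, 5, 0.25)    # 5 s windows, a result every 3.75 s
long_ = segment_starts(1800, 60, 0.75)   # 60 s windows, a result every 15 s
print(len(short), short[1] - short[0])   # 479 3.75
print(len(long_), long_[1] - long_[0])   # 117 15.0
```

The longer window still gives a new estimate every 15 seconds, while each estimate is computed from a full minute of data.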

What are you using the data for?

Cheers

sedurCode commented 5 years ago

Hi Paul, I am doing a reaction evaluation study, where I am comparing physiological responses and affective ratings of external stimuli.

paulvangentcom commented 5 years ago

You need to go up in segment_width to at least 25 seconds, ideally more. If you're using frequency measures, the LF band is 0.04-0.15 Hz. You'd need at least one full period if you want to find the 0.04 Hz amplitude. You want more than one full period if you're going to get a reliable output.

If you need shorter intervals, don't use the frequency measures but go with HRV measures like the RMSSD. Some authors have suggested as low as 10 seconds for that measure.
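For reference, RMSSD is the root mean square of successive differences between adjacent peak-peak intervals. The sketch below uses the standard definition in plain Python; it is not heartpy's implementation, and the sample intervals are made up for illustration.

```python
import math


def rmssd(rr_list):
    """Root mean square of successive differences of the RR intervals."""
    if len(rr_list) < 2:
        return math.nan  # no successive differences available
    diffs = [b - a for a, b in zip(rr_list, rr_list[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


rr_ms = [800, 810, 790, 805]  # illustrative RR intervals in milliseconds
value = rmssd(rr_ms)          # differences are 10, -20, 15
print(round(value, 2))        # 15.55
```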

paulvangentcom commented 5 years ago

Fix for the issue is up: ffdcce948826f07052ef212ba86867a2a6308297

Frequency domain measures will be np.nan if there are fewer than two peak-peak intervals.

Related to the above: if there are less than 25 seconds of peak-peak intervals, a UserWarning is issued (once, even if the problem occurs multiple times) detailing this.
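The described behaviour could look roughly like the following. This is a sketch written from the description above, not the code in the commit: `frequency_measure_or_nan` and `MIN_SECONDS` are hypothetical names, and it assumes the intervals are given in milliseconds.

```python
import math
import warnings

MIN_SECONDS = 25  # threshold taken from the description above


def frequency_measure_or_nan(rr_list_ms, compute):
    """Return compute(rr_list_ms), or NaN when there are too few intervals.

    Assumes rr_list_ms holds peak-peak intervals in milliseconds.
    """
    if len(rr_list_ms) < 2:
        return math.nan
    if sum(rr_list_ms) / 1000.0 < MIN_SECONDS:
        warnings.warn(
            "Less than %d s of peak-peak intervals; frequency measures "
            "may be unreliable." % MIN_SECONDS,
            UserWarning,
        )
    return compute(rr_list_ms)


print(frequency_measure_or_nan([800], len))  # nan: only one interval
```

With Python's default warning filter, a warning raised from the same line is shown only the first time, which gives the "once" behaviour without extra bookkeeping.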

sedurCode commented 5 years ago

Great, cheers Paul