MIT-LCP / wfdb-python

Native Python WFDB package
MIT License
730 stars 298 forks source link

Correct sampto behaviour using rdsamp and rdann? #470

Open kevindlau opened 7 months ago

kevindlau commented 7 months ago

When reading the signal with rdsamp and the corresponding annotations with rdann using the same sampto value results in an IndexError when accessing the signal array it includes an annotation outside the array.

In the example below, to read the first 100 entries, sampto must be 100. If using the same value for the annotations up to and including 100 are read, rather than stopping at 99 (i.e., the 100th entry in a 0-indexed array).

Is this the expected behaviour?

Using: python 3.9.18 with wfdb 4.1.2

import numpy as np
import wfdb

# create test data
n = 250
test_sig = np.zeros((n, 1))
test_sig[:, 0] = np.linspace(0, 1, n)
print(f'original shape: {test_sig.shape}')

# write to file
wfdb.wrsamp('test', fs = 250, units=['mV'], sig_name=['test'], p_signal=test_sig, fmt=['16'])

# create annotation and write
wfdb.Annotation(
    record_name='test', 
    extension='atr',
    sample=np.array([0, 99, 100, 249]), 
    symbol=['N', 'N', 'N', 'N']
).wrann()

samp_from = 0
samp_to = 100

# read values
values, _ = wfdb.rdsamp('test', sampfrom=samp_from, sampto=samp_to)
print(f'sampto shape: {values.shape}')

atr_1 = wfdb.rdann('test', extension='atr', sampfrom=samp_from, sampto=samp_to, shift_samps=True)
print(f'samples: {atr_1.sample}')
# IndexError
# print(f'values at samples: {values[atr_1.sample, :]}')

samp_to_minus_1 = samp_to - 1
atr_2 = wfdb.rdann('test', extension='atr', sampfrom=samp_from, sampto=samp_to_minus_1, shift_samps=True)
print(f'samples: {atr_2.sample}')
# No IndexError
# print(f'values at samples: {values[atr_2.sample, :]}')
kevindlau commented 7 months ago

Should the 2nd condition be < rather than <=?

https://github.com/MIT-LCP/wfdb-python/blob/8d813599965a8a8eb4b587e2b92e940b2f487dc0/wfdb/io/annotation.py#L242

bemoody commented 7 months ago

I agree, I think rdann should be consistent with rdsamp and I think sampfrom/sampto should define a half-open interval.

This would be an API-breaking change, though. I've kind of wanted to change that API in any case, perhaps renaming the args to from/to or start/end or something, so maybe we should do that at the same time.