Closed kirtov closed 3 years ago
Hi @kirtov
all wfdb files stored in `data/ptbxl/records100/` and `data/ptbxl/records500/` are stored as raw signals, i.e. without any preprocessing. Please note that `wfdb.rdsamp` already converts the stored A/D values to physical units in mV (the analog-to-digital converter has 16-bit resolution at 1 μV/LSB, i.e. 1000 A/D units per mV). In addition, I think a mean of 0.0 mV and a std of around 0.1-0.2 mV is expected. What are the units and statistics for Apple Watch? Are they also 12-lead? I really don't know. Nevertheless, to circumvent scaling issues it is recommended to always standardize your data sources to mean 0 and std 1.
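As a rough illustration of that standardization step, here is a minimal sketch. It assumes signals shaped `(samples, leads)` and standardizes each lead of a single record on its own; computing the mean and std once over a whole training set would be an equally valid choice. The `standardize` helper and the synthetic arrays are hypothetical stand-ins, not part of the repository; in practice the array would come from `wfdb.rdsamp`.

```python
import numpy as np


def standardize(signal, eps=1e-8):
    """Scale each lead (column) of a (samples, leads) array to mean 0, std 1.

    Hypothetical helper for illustration; eps avoids division by zero
    for a flat (constant) lead.
    """
    signal = np.asarray(signal, dtype=np.float64)
    mean = signal.mean(axis=0, keepdims=True)
    std = signal.std(axis=0, keepdims=True)
    return (signal - mean) / (std + eps)


# Synthetic stand-ins for two sources with very different amplitude scales.
rng = np.random.default_rng(0)
ptbxl_like = rng.normal(0.0, 0.15, size=(1000, 12))  # 12 leads, mV-like scale
other_like = rng.normal(0.0, 150.0, size=(1000, 1))  # 1 lead, assumed larger units

ptbxl_std = standardize(ptbxl_like)
other_std = standardize(other_like)

# After standardization both sources share the same scale.
print(ptbxl_std.std(axis=0).round(3), other_std.std(axis=0).round(3))
```

Once both sources are standardized this way, the difference in recording units no longer affects the amplitude a model sees.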
Please note: Our provided methods for loading raw data use `wfdb.rdsamp` and pickle NumPy arrays as floats. If you want to save space, you could undo the ADC conversion by calling `r = wfdb.rdrecord(path)` and storing `(r.p_signal*r.adc_gain).astype(np.int16)` as 16-bit arrays.
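A minimal sketch of that round trip, using a small synthetic array in place of `r.p_signal` so it runs without the dataset. It assumes a gain of 1000 A/D units per mV (as stated above) and a baseline of 0; the helper names `to_int16` and `to_millivolts` are hypothetical. It rounds before casting because `astype(np.int16)` truncates toward zero, so floating-point error could otherwise shift a sample by one unit.

```python
import numpy as np

ADC_GAIN = 1000.0  # A/D units per mV, as stated above; baseline assumed to be 0


def to_int16(p_signal, adc_gain=ADC_GAIN):
    """Convert physical values (mV) back to 16-bit A/D units."""
    digital = np.round(np.asarray(p_signal, dtype=np.float64) * adc_gain)
    info = np.iinfo(np.int16)
    if digital.min() < info.min or digital.max() > info.max:
        raise ValueError("signal does not fit into int16")
    return digital.astype(np.int16)


def to_millivolts(digital, adc_gain=ADC_GAIN):
    """Convert stored 16-bit A/D units back to physical values (mV)."""
    return digital.astype(np.float64) / adc_gain


# Synthetic stand-in for r.p_signal with shape (samples, leads).
p_signal = np.array([[-0.115, 0.029], [0.5, -1.234], [0.001, 0.0]])

digital = to_int16(p_signal)
restored = to_millivolts(digital)

print(digital.dtype, digital.nbytes, p_signal.nbytes)
```

Compared with float64 arrays, the int16 version needs a quarter of the memory, and dividing by the gain recovers the original values at the 1 μV resolution.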
I hope this answers your question. Since I'm not aware of an existing issue here, I will close this issue for now.
Hello, thank you for the great work, it is very useful!
I am trying to understand the ECG data preprocessing. Looking at the "raw" PTB-XL dataset, I see that the mean value of the ECGs is near 0.0 and the std is 0.1-0.2, which differs from e.g. an Apple Watch ECG (its amplitude is much greater than in PTB-XL), so I think that the PTB-XL ECGs were normalized somehow. Could you please clarify the PTB-XL data preprocessing?