DeepPSP / torch_ecg

Deep learning ECG models implemented using PyTorch
MIT License

MinMaxNormalization Not Resulting in Expected 0-1 Range #12

Open rdyan0053 opened 1 year ago

rdyan0053 commented 1 year ago

I tested MinMaxNormalize with the following code:

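import matplotlib.pyplot as plt
from torch_ecg._preprocessors import MinMaxNormalize

minmax = MinMaxNormalize()
# raw_sig: the recording loaded beforehand (loading code not shown)
sig = raw_sig[:, 0]  # lead1, shape: (5000,)
sig_minmax, fs_minmax = minmax(sig, fs=500)
# plot the raw signal (top) and the normalized signal (bottom)
plt.figure(figsize=(20, 10))
plt.subplot(211)
plt.plot(sig)
plt.subplot(212)
plt.plot(sig_minmax)
plt.show()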

image

To my surprise, the output of MinMaxNormalize was not in the expected 0-1 range. Instead, all values turned out to be 0 (see the second curve in the figure above). I would like to understand why this happened.

For further verification, I tested using the code below:

import matplotlib.pyplot as plt
import numpy as np

sig = raw_sig[:, 0]  # lead1, shape: (5000,)
# use the min-max normalization to normalize the signal to [0, 1]
sig_minmax = (sig - np.min(sig)) / (np.max(sig) - np.min(sig))
# plot the raw signal (top) and the normalized signal (bottom)
plt.figure(figsize=(20, 10))
plt.subplot(211)
plt.plot(sig)
plt.subplot(212)
plt.plot(sig_minmax)
plt.show()

image

My own code worked and correctly normalized the signal values to the [0, 1] range.

I hope to receive your feedback and clarification on this matter. Thank you!

wenh06 commented 1 year ago

OK, I will check this soon.

wenh06 commented 1 year ago

I tested it with the recording HR06004 from sample-data/cinc2021, and it worked normally:

image

Could you please provide me your raw_sig?

rdyan0053 commented 1 year ago

Thanks. This is very strange. I'm not using sample-data but my own data, whose values range from -400 to 200. After adding the line sig = sig/1000 before calling MinMaxNormalize, the issue was resolved.

The modified code:

import matplotlib.pyplot as plt
from torch_ecg._preprocessors import MinMaxNormalize

minmax = MinMaxNormalize()
sig = raw_sig[:, 0]  # lead1, shape: (5000,)
sig = sig/1000  # added line: rescale before normalizing
sig_minmax, fs_minmax = minmax(sig, fs=500)
# plot the rescaled signal (top) and the normalized signal (bottom)
plt.figure(figsize=(20, 10))
plt.subplot(211)
plt.plot(sig)
plt.subplot(212)
plt.plot(sig_minmax)
plt.show()

wenh06 commented 1 year ago

It might be caused by data overflow, in which case np.max(sig) - np.min(sig) produces something with a very large absolute value, which would push every normalized value toward 0. However, your data range (-400 to 200) does not support this: the data cannot be np.int8, and a difference of 600 is far below the limit of np.int16.
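
The overflow reasoning can be checked directly with NumPy. The sketch below is not taken from torch_ecg, and the synthetic array is an assumption standing in for the reporter's data, which was not shared in this thread.

```python
import numpy as np

# Synthetic stand-in for the reporter's signal (assumption: integer samples
# spanning -400 to 200; the real raw_sig was not provided).
sig = np.array([-400, -100, 0, 200], dtype=np.int16)

# np.int8 only covers -128..127, so it cannot hold -400 at all.
int8_info = np.iinfo(np.int8)
can_fit_int8 = int8_info.min <= -400 and 200 <= int8_info.max

# np.int16 covers -32768..32767, so the peak-to-peak value 600 fits easily.
span = np.max(sig) - np.min(sig)
int16_max = np.iinfo(np.int16).max

print(can_fit_int8)          # False
print(int(span), int16_max)  # 600 32767
```

Under these assumptions the subtraction stays well inside the np.int16 range, which is consistent with the doubt expressed above about overflow being the cause for this particular data range.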

Dividing by 1000 converts the data type to np.float64, so one probably would not encounter data overflow afterwards. Nevertheless, the data type conversion should be done before any arithmetic operations, and I will make the corresponding changes to the code.
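
A minimal sketch of the "convert first, then compute" idea is given here for illustration. It is not the actual torch_ecg change: the function name minmax_normalize_float is hypothetical, and the np.int8 example is chosen only to make integer wrap-around visible.

```python
import numpy as np


def minmax_normalize_float(sig: np.ndarray) -> np.ndarray:
    """Min-max normalize to [0, 1], casting to float64 before any arithmetic.

    Hypothetical helper for illustration; not part of torch_ecg.
    """
    sig = np.asarray(sig, dtype=np.float64)  # convert the dtype first
    span = sig.max() - sig.min()
    if span == 0:
        # Constant signal: avoid division by zero and return all zeros.
        return np.zeros_like(sig)
    return (sig - sig.min()) / span


# Dividing an integer array by a Python int promotes it to float64.
int_sig = np.array([-400, -100, 0, 200], dtype=np.int16)
print((int_sig / 1000).dtype)  # float64

# With a narrow integer dtype, max - min can wrap around:
small = np.array([-100, 0, 100], dtype=np.int8)
wrapped = small[2:] - small[:1]  # array arithmetic stays in int8
print(wrapped)  # [-56], because 200 does not fit in int8

# Converting first gives the expected result.
normalized = minmax_normalize_float(small)
print(normalized.tolist())  # [0.0, 0.5, 1.0]
```

Casting at the top of the function keeps the rest of the arithmetic independent of the input dtype, so callers do not need workarounds such as dividing by 1000 beforehand.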