radioML / dataset

Open RadioML Synthetic Benchmark Dataset

Issue in the data normalization to unit energy #24


rutrilla commented 5 years ago

Hi there!

I've realized that in the dataset generation, the energy of the 128-sample data vectors is being normalized to unity as follows (lines 65 and 66):

[screenshot: energy normalization code at lines 65-66 of the generator, dividing sampled_vector by its energy]

However, to the best of my knowledge, the energy Es of a discrete-time signal x(n) is defined mathematically as:

Es = Σ_n |x(n)|^2

Once you have calculated Es, sampled_vector must be divided by the square root of the energy, not by the energy itself. In code, it should be something like this:

energy = np.sum(np.abs(sampled_vector) ** 2)
sampled_vector = sampled_vector / math.sqrt(energy)
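
For what it's worth, here is a minimal self-contained sketch of the point (the toy random vector and variable names are mine, just for illustration): dividing by the square root of the energy yields a vector with unit energy, whereas dividing by the energy itself does not.

import numpy as np
import math

# Toy 128-sample complex vector standing in for one dataset example (illustrative only)
rng = np.random.default_rng(0)
sampled_vector = (rng.standard_normal(128) + 1j * rng.standard_normal(128)) * 3.0

energy = np.sum(np.abs(sampled_vector) ** 2)

# Dividing by sqrt(energy) leaves a vector whose energy is exactly 1
unit_energy = sampled_vector / math.sqrt(energy)
print(np.sum(np.abs(unit_energy) ** 2))   # ~1.0

# Dividing by the energy itself leaves energy 1/energy, i.e. a much smaller (compressed) signal
compressed = sampled_vector / energy
print(np.sum(np.abs(compressed) ** 2))    # ~1/energy, far below 1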

I've plotted both versions and these are the results.

Before:

[plot: signals before the correction]

After:

[plot: signals after the correction]

The signals are therefore being unnecessarily compressed, which can make it harder for some models to extract meaningful information, or even prevent them from doing so altogether.

Do my findings make sense to you, or is there something I may not have understood properly? Please check it and let us know your conclusions when you get a chance.

I look forward to hearing from you.

Regards,

Ramiro Utrilla

rutrilla commented 5 years ago

Actually, in addition to the previous energy normalization, what really works for me is also scaling the IQ samples to between -1 and 1. This is what that part of my code looks like now:

# Normalize to unit energy
energy = np.sum(np.abs(sampled_vector) ** 2)
sampled_vector = sampled_vector / math.sqrt(energy)
# Scale the I and Q components into the [-1, 1] range
max_val = max(max(np.abs(sampled_vector.real)), max(np.abs(sampled_vector.imag)))
sampled_vector = sampled_vector / max_val

And this is what the signals look like after both normalization steps:

[plot: signals after both normalization steps]
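
If it helps anyone, here is a small self-contained sketch (the helper name normalize_iq and the toy input are just mine for illustration, not part of the generator) that applies both steps and checks the result:

import numpy as np
import math

def normalize_iq(sampled_vector):
    # Step 1: normalize to unit energy
    energy = np.sum(np.abs(sampled_vector) ** 2)
    sampled_vector = sampled_vector / math.sqrt(energy)
    # Step 2: scale I and Q into the [-1, 1] range
    max_val = max(np.max(np.abs(sampled_vector.real)), np.max(np.abs(sampled_vector.imag)))
    return sampled_vector / max_val

# Toy example: an arbitrarily scaled random complex vector standing in for one 128-sample example
rng = np.random.default_rng(1)
x = (rng.standard_normal(128) + 1j * rng.standard_normal(128)) * 5.0
y = normalize_iq(x)
print(np.max(np.abs(y.real)), np.max(np.abs(y.imag)))  # both <= 1.0

One thing to keep in mind is that the second step rescales the vector again, so after both steps the energy is no longer exactly 1; what you keep is the waveform shape plus a bounded [-1, 1] amplitude range.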

As far as I know, this kind of normalization is pretty common, since some models are more sensitive to the scale of the input data than others. Was there any reason not to do this in the dataset originally? Am I missing something?

It'd be great if someone could give further details on the best practices for normalizing this kind of data.

Regards,