Data should be normalized per file, not per crypto name (as these occur multiple times, multi-exchange data)

dorienh / jesse


Data should be normalized per file, not per crypto name (as these occur multiple times, multi-exchange data) #14

Open dorienh opened 3 years ago

dorienh commented 3 years ago

# TODO: this can create issues, because pairs occur multiple times (once per exchange) but should not be normalized together! Normalize per file, not per crypto pair.
        self.cryptos = [os.path.split(file)[1].split("_")[1] for file in files]

dorienh commented 3 years ago

Possibly this can be fixed by just using the filename here, if this whole approach of normalizing is ok!
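A rough sketch of what using the filename as the key could look like. The file names below are made up for illustration (the real naming scheme in the repo may differ), and `crypto_key` / `file_key` are hypothetical helpers, not functions from the project:

```python
import os


def crypto_key(path):
    """Pair name only, as in the quoted line; collides across exchanges."""
    return os.path.split(path)[1].split("_")[1]


def file_key(path):
    """File name without its extension; unique per file."""
    return os.path.splitext(os.path.basename(path))[0]


# Hypothetical file names: the same pair appears once per exchange.
files = [
    "data/binance_BTCUSDT_1h.csv",
    "data/kraken_BTCUSDT_1h.csv",
    "data/binance_ETHUSDT_1h.csv",
]

pair_keys = [crypto_key(f) for f in files]  # two files share "BTCUSDT"
file_keys = [file_key(f) for f in files]    # every key is distinct
```

With the pair as key, the two BTCUSDT files would end up sharing one normalization; with the filename as key, each file gets its own.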

dorienh commented 3 years ago

In addition, I could be wrong, but since add_crypto_id is set to False (it indicates whether the ID is fed to the network), I'm not sure normalization is done correctly.

It should take the scaler fitted on the training part of the file and apply that same scaler to the test set of that same file.
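To make the intended behaviour concrete, here is a minimal sketch, assuming a simple min-max scaler and a chronological train/test split. `MinMaxScaler1D`, `normalize_per_file`, the 0.8 split, and the sample data are all assumptions for illustration; the project may use a different scaler and split:

```python
class MinMaxScaler1D:
    """Tiny min-max scaler for a list of floats (stand-in for the project's scaler)."""

    def fit(self, values):
        self.lo = min(values)
        self.hi = max(values)
        return self

    def transform(self, values):
        span = (self.hi - self.lo) or 1.0  # avoid dividing by zero on a constant series
        return [(v - self.lo) / span for v in values]


def normalize_per_file(series_by_file, train_fraction=0.8):
    """Fit one scaler per file on its training part, then apply it to that file's train and test parts."""
    result = {}
    for name, values in series_by_file.items():
        cut = int(len(values) * train_fraction)
        train, test = values[:cut], values[cut:]
        scaler = MinMaxScaler1D().fit(train)  # fitted on training data only
        result[name] = (scaler.transform(train), scaler.transform(test))
    return result


# Hypothetical data: the same pair on two exchanges, at different price levels.
series_by_file = {
    "binance_BTCUSDT_1h": [100.0, 110.0, 120.0, 130.0, 140.0],
    "kraken_BTCUSDT_1h": [10.0, 20.0, 30.0, 40.0, 50.0],
}
normalized = normalize_per_file(series_by_file)
```

Because the scaler only sees the training part, test values can fall outside [0, 1]. That is expected and avoids leaking test statistics into training.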

Can you double check? Thanks

Luckygyana commented 3 years ago

I think normalisation was done per file; I will check.

dorienh commented 3 years ago

Because there is data for multiple exchanges, the ticker pairs occur multiple times. The exchange name should be part of the filename.