Data normalization

ziplab / LITv2

[NeurIPS 2022 Spotlight] This is the official PyTorch implementation of "Fast Vision Transformers with HiLo Attention"

Apache License 2.0

229 stars, 11 forks

Closed: ytring closed this issue 1 year ago

ytring commented 1 year ago

Hi,

I saw that data is normalized for the val split, but not for the train split:

Shouldn't normalization be applied to both val and train splits?

Thank you for your help again.

HubHop commented 1 year ago

That's a good question. In general, we have different aims for training and testing.

During training, we aim for the model to learn the underlying patterns and representations of the input data distribution. If we normalize the train set, we are artificially altering that distribution and may introduce biases into the model's training process. This is unwanted.
During testing, we aim to evaluate the model's performance. It is important to normalize the validation set to ensure consistent and fair evaluation.
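
For readers unfamiliar with the operation under discussion, here is a minimal sketch of what per-channel normalization computes. The function names are hypothetical, and the mean/std values are the widely used ImageNet statistics, assumed purely for illustration; this thread does not show which values the repository uses, nor where in the training pipeline (if anywhere) the step is applied.

```python
# Widely used ImageNet channel statistics (an assumption for illustration;
# the thread does not state which values the repository uses).
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def normalize_pixel(rgb, mean=IMAGENET_MEAN, std=IMAGENET_STD):
    """Map an RGB pixel with values in [0, 1] to (x - mean) / std per channel."""
    return tuple((x - m) / s for x, m, s in zip(rgb, mean, std))


def denormalize_pixel(rgb, mean=IMAGENET_MEAN, std=IMAGENET_STD):
    """Invert normalize_pixel: x * std + mean per channel."""
    return tuple(x * s + m for x, m, s in zip(rgb, mean, std))


# Hypothetical pixel value, chosen only to exercise the functions.
pixel = (0.5, 0.25, 0.75)
normalized = normalize_pixel(pixel)
restored = denormalize_pixel(normalized)
```

Because the mapping is a fixed per-channel shift and scale, it can be inverted exactly (up to floating-point error), so `restored` matches `pixel`. A model evaluated on inputs transformed this way generally expects inputs transformed with the same statistics wherever else it sees data.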

The following discussions may help,

ytring commented 1 year ago

Thank you!