Open ChihchengHsieh opened 3 years ago
Hello,
I just have a quick question about the MinMaxScaler and dataset splitting process.
Is the scaler allowed to see the test set?
Since you fit the scaler before splitting, the fitting process includes the test set.
However, I found this in Question 1 (b):
https://cs230.stanford.edu/files/cs230exam_fall18_soln.pdf
And this:
https://jamesmccaffrey.wordpress.com/2019/01/04/how-to-normalize-training-and-test-data-for-machine-learning/?fbclid=IwAR01dAH5OIWkiS8SeJ-XDQ5vyBKoGRe1CwJ9J0HiWexb9zBV1xuOWmw_YjU
Both indicate that the scaler should be fit on the training set only.
Or are both approaches fine?
Thanks in advance
It can be a data leakage problem:
https://machinelearningmastery.com/data-leakage-machine-learning/
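To make the difference concrete, here is a minimal sketch in plain Python. `MinMaxSketch` is a hypothetical stand-in written for this comment, not the class used in this repo, and the numbers are made up. It only mirrors the fit/transform split that scikit-learn's `MinMaxScaler` also exposes: fit on the training split, then apply the same statistics to both splits.

```python
class MinMaxSketch:
    """Hypothetical stand-in for a min-max scaler with separate fit/transform."""

    def fit(self, values):
        # Learn the scaling statistics from whatever data is passed in.
        self.lo = min(values)
        self.hi = max(values)
        return self

    def transform(self, values):
        # Guard against a zero range so constant inputs do not divide by zero.
        span = (self.hi - self.lo) or 1.0
        return [(v - self.lo) / span for v in values]


# Made-up series; the last two points play the role of the test set.
series = [10.0, 12.0, 11.0, 15.0, 14.0, 30.0, 28.0]
train, test = series[:5], series[5:]

# Leaky: statistics come from the full series, so the test maximum (30.0)
# influences how the training data is scaled.
leaky = MinMaxSketch().fit(series)
leaky_train = leaky.transform(train)

# Leak-free: statistics come from the training split only,
# and the same fitted scaler is reused on the test split.
clean = MinMaxSketch().fit(train)
clean_train = clean.transform(train)
clean_test = clean.transform(test)
```

With the leak-free version, scaled test values can fall outside [0, 1] when the test set contains values beyond the training range. That is expected, since the scaler has never seen those points.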