cavalab / srbench

A living benchmark framework for symbolic regression
https://cavalab.org/srbench/
GNU General Public License v3.0

missing standardization of X_test #19

Closed: folivetti closed this issue 3 years ago

folivetti commented 3 years ago

Only X_train was being standardized in evaluate_model.py.

I've just added a standardized test set (named X_test_std in my change), computed with the scaler fitted on the training data:

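from sklearn.preprocessing import StandardScaler
sc_x = StandardScaler()
# Fit the scaler on the training data only, then scale the training data
X_train = sc_x.fit_transform(X_train)
# Reuse the training mean and standard deviation for the test data
X_test = sc_x.transform(X_test)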

X_test must be transformed with the same scaler that was fitted on the training data; otherwise the model receives test inputs on a different scale than the one it was trained on, which leads to incorrect predictions. The scaling parameters should be computed from the training set only, so that no information from the test set leaks in and inflates the scores.
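
To make the point concrete, here is a minimal sketch in plain Python (no scikit-learn) of what "fit on train, transform test" amounts to. The helper names fit_scaler and transform and the toy data are made up for illustration and are not part of evaluate_model.py. The sketch uses the population standard deviation and falls back to a scale of 1.0 for constant columns, which is my understanding of how StandardScaler behaves.

```python
from statistics import fmean, pstdev


def fit_scaler(rows):
    """Return per-column (means, stds) computed from the given rows only."""
    columns = list(zip(*rows))
    means = [fmean(col) for col in columns]
    # Population standard deviation; a constant column gets scale 1.0
    # so that the division below never hits zero.
    stds = [pstdev(col) or 1.0 for col in columns]
    return means, stds


def transform(rows, means, stds):
    """Standardize rows with statistics that were computed elsewhere."""
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]


# Toy data (hypothetical): two features, three training rows, one test row.
X_train = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
X_test = [[2.0, 40.0]]

# Statistics come from the training set only ...
means, stds = fit_scaler(X_train)
X_train_std = transform(X_train, means, stds)
# ... and are reused unchanged for the test set.
X_test_std = transform(X_test, means, stds)

print(X_test_std)
```

Fitting a second scaler on X_test instead would map the single test row to all zeros here, regardless of its actual values, which illustrates why the training statistics have to be reused.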

lacava commented 3 years ago

thanks @folivetti, I actually fixed this last week and just pushed the commit (dfec2511daff6091dd011d5c0c43141a0dfa9895)