tensorflow / skflow

Simplified interface for TensorFlow (mimicking Scikit Learn) for Deep Learning

fit_transform on test set in DNN Regression example boston.py #157

Closed ReaBx closed 8 years ago

ReaBx commented 8 years ago

In the DNN Regression example, when scoring the prediction on the test set, shouldn't the scaling be done with scaler.transform(X_test) instead of scaler.fit_transform(X_test)? I'm completely new to both sklearn and skflow and am just trying to understand this example. I understand why the training set is scaled to zero mean and unit std dev, but wouldn't we want to scale the test set using that same mean? scaler.fit_transform(X_test) scales X_test to zero mean and unit std dev all over again, right? But I want to keep it comparable to X_train, don't I?

Also, when I change score = metrics.mean_squared_error(regressor.predict(scaler.fit_transform(X_test)), y_test) to score = metrics.mean_squared_error(regressor.predict(scaler.transform(X_test)), y_test), the MSE is roughly halved.
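For readers who are, like me, new to sklearn, here is a minimal sketch of the two variants being compared (synthetic stand-in data, not the actual boston.py code):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Stand-in data with the same shape as the Boston features
# (hypothetical; not the actual variables from boston.py).
rng = np.random.RandomState(0)
X_train = rng.normal(loc=10.0, scale=3.0, size=(400, 13))
X_test = rng.normal(loc=10.0, scale=3.0, size=(100, 13))

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)  # mean/std estimated from X_train

# As in the current example: fit_transform re-estimates mean/std on X_test itself.
X_test_own_stats = StandardScaler().fit_transform(X_test)

# As proposed in this issue: transform only, reusing the training statistics.
X_test_train_stats = scaler.transform(X_test)
```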

Mawox commented 8 years ago

I would say there is no right answer here. Scaling is just a linear transformation. You can scale your X in any way you want. If transform() works better here, then go for it. In my experience fit_transform() usually works better, but it depends on the dataset.

ReaBx commented 8 years ago

Hm, but don't I have to scale X_test the same way I scaled X_train? Taking this example of house prices in Boston: when I look at the test set, don't I want to put it in relation to the training set? Why would I transform the test set to zero mean on its own? What if, by chance, I have mostly larger houses in the test set? Thanks!
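A small illustration of that concern, with made-up numbers (a test set shifted upward relative to training, standing in for "mostly larger houses"):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

rng = np.random.RandomState(0)
X_train = rng.normal(loc=10.0, scale=3.0, size=(400, 1))
X_test = rng.normal(loc=13.0, scale=3.0, size=(100, 1))  # "larger houses"

scaler = StandardScaler().fit(X_train)

# Re-fitting on the test set centers it on its own mean, hiding the shift.
print(StandardScaler().fit_transform(X_test).mean())  # ~0.0

# Reusing the training statistics preserves the shift the model would see.
print(scaler.transform(X_test).mean())                # ~+1.0
```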

terrytangyuan commented 8 years ago

It depends highly on your application. There's also batch normalization that you can look into and may suit your particular needs.

ilblackdragon commented 8 years ago

Fitting the scaler on evaluation data is a very subtle issue. If you do it, you will indeed get better results. But it will also hide whether your validation sample is different from your training data. At inference time you don't have a sample-wide mean/std deviation, which means you will use the one from the training data. And if you picked your model based on evaluation results that are skewed by scaling on its own mean/stddev, you may end up using a non-optimal model.

I would highly recommend using the same setup for evaluation and inference, and thinking of inference time as the moment the model is launched in some service. Then you can rely on your validation/evaluation results being representative.
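To make that concrete, a sketch of what inference looks like in a serving setting (hypothetical helper, not part of skflow): a single example arrives, so there is nothing to fit a scaler on, and the stored training statistics are all you have.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

rng = np.random.RandomState(0)
X_train = rng.normal(size=(400, 13))

scaler = StandardScaler().fit(X_train)  # statistics come from training data only

def predict_one(regressor, x_raw):
    """Scale one incoming example with the training statistics, then predict."""
    x_scaled = scaler.transform(np.asarray(x_raw).reshape(1, -1))
    return regressor.predict(x_scaled)

# Evaluating the test set with scaler.transform(X_test) mirrors exactly this
# serving-time behavior, so the evaluation stays representative.
```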
