Chapter 2: Value differences in prediction

Hi all, I started working through the ML project in Chapter 2 over the past month or two.

I have checked my code several times, and it is more or less the same as the code in this repo.

I get the following results

Predictions: [ 85657.90192014 305492.60737488 152056.46122456 186095.70946094  244550.67966089]
Labels: [72100.0, 279600.0, 82700.0, 112500.0, 238300.0]

when I run the code below:

from sklearn.linear_model import LinearRegression

# housing, housing_labels, housing_prepared and full_pipeline come from the earlier steps of the chapter
lin_reg = LinearRegression()
lin_reg.fit(housing_prepared, housing_labels)  # fit the prepared data and the corresponding labels

# try the fitted model on the first five training instances
some_data = housing.iloc[:5]
some_labels = housing_labels.iloc[:5]
some_data_prepared = full_pipeline.transform(some_data)
print("Predictions:", lin_reg.predict(some_data_prepared))  # Predictions: [ 210644.6045  317768.8069  210956.4333  59218.9888  189747.5584]
print("Labels:", list(some_labels))  # Labels: [286600.0, 340600.0, 196900.0, 46300.0, 254500.0]

Is it normal to get such a big difference in the prediction results from the same data set? Or is there a mistake I might have made that would cause this? Thanks for the help.
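
For reference, here is a small sanity check on the numbers above. It is only a sketch, and it assumes that the values in the code comments are the reference output I am comparing against. The variable names and the mean_abs_error helper are my own and are not part of the repo. It checks whether the first five labels are the same in both runs and compares the average size of the errors.

```python
# Values copied from my output above.
my_predictions = [85657.90192014, 305492.60737488, 152056.46122456,
                  186095.70946094, 244550.67966089]
my_labels = [72100.0, 279600.0, 82700.0, 112500.0, 238300.0]

# Assumption: the numbers in the code comments are the reference output.
ref_predictions = [210644.6045, 317768.8069, 210956.4333, 59218.9888, 189747.5584]
ref_labels = [286600.0, 340600.0, 196900.0, 46300.0, 254500.0]


def mean_abs_error(predictions, labels):
    """Average absolute difference between predictions and labels."""
    return sum(abs(p - y) for p, y in zip(predictions, labels)) / len(labels)


# If the labels differ, the two runs are looking at different rows.
same_rows = my_labels == ref_labels
my_mae = mean_abs_error(my_predictions, my_labels)
ref_mae = mean_abs_error(ref_predictions, ref_labels)

print("Same first five labels:", same_rows)
print("My mean absolute error:", round(my_mae, 2))
print("Reference mean absolute error:", round(ref_mae, 2))
```

If the labels do not match, the two runs are scoring different rows, so the raw predictions are not directly comparable. In that case the size of the errors seems a more meaningful thing to compare than the predicted values themselves.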

ageron / handson-ml

Chapter 2: Value differences in prediction #679