jakevdp / PythonDataScienceHandbook

Python Data Science Handbook: full text in Jupyter Notebooks
http://jakevdp.github.io/PythonDataScienceHandbook
MIT License

5.3.4 grid.fit(X,y): a ValueError occurred #367

Open YoungBooker opened 1 year ago

YoungBooker commented 1 year ago

`from sklearn.model_selection import GridSearchCV
param_grid = {'polynomialfeatures__degree': np.arange(21),
              'linearregression__fit_intercept': [True, False],
              'linearregression__normalize': [True, False]}

grid = GridSearchCV(PolynomialRegression(), param_grid, cv=7)
grid.fit(X, y)`

Sorry to bother you. I ran into a small issue in section 5.3.4 of this book: Jupyter raises the error below. I also ran the code you provided, and the same error occurs. (I ran the code for section 5.3 all the way through to grid.fit(X,y).) My English is not very good, sorry again.

ValueError: Invalid parameter 'normalize' for estimator LinearRegression(). Valid parameters are: ['copy_X', 'fit_intercept', 'n_jobs', 'positive'].
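The message above is consistent with newer scikit-learn releases, where the `normalize` argument of `LinearRegression` was deprecated in 1.0 and removed in 1.2. A minimal sketch of one way to guard against this, assuming the pipeline is built with `make_pipeline` as in section 5.3: list the parameter names the installed version accepts with `get_params()` and keep only those grid keys. The names `candidate_grid` and `dropped` are illustrative and not from the book, and the data lines only mimic the book's `make_data(40)`.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

pipeline = make_pipeline(PolynomialFeatures(2), LinearRegression())

# Every parameter name GridSearchCV will accept for this pipeline
valid_params = set(pipeline.get_params().keys())

# Grid as written in the book (illustrative name, not from the book)
candidate_grid = {
    "polynomialfeatures__degree": np.arange(21),
    "linearregression__fit_intercept": [True, False],
    "linearregression__normalize": [True, False],
}

# Keep only the keys the installed scikit-learn version recognises
param_grid = {k: v for k, v in candidate_grid.items() if k in valid_params}
dropped = sorted(set(candidate_grid) - set(param_grid))
print("dropped keys:", dropped)

# Small synthetic data set, similar in shape to the book's make_data(40)
rng = np.random.RandomState(1)
X = rng.rand(40, 1) ** 2
y = 10 - 1.0 / (X.ravel() + 0.1) + rng.randn(40)

grid = GridSearchCV(pipeline, param_grid, cv=7)
grid.fit(X, y)
print(grid.best_params_)
```

If feature scaling is still wanted, the scikit-learn deprecation notice suggested adding a `StandardScaler` step to the pipeline instead of passing `normalize`.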

YoungBooker commented 1 year ago

`from sklearn.model_selection import GridSearchCV
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
import matplotlib.pyplot as plt
import numpy as np
import seaborn

def PolynomialRegression(degree=2, **kwargs):
    # Build a polynomial regression model
    return make_pipeline(PolynomialFeatures(degree),
                         LinearRegression(**kwargs))

def make_data(N, err=1.0, rseed=1):
    # Randomly sample data
    rng = np.random.RandomState(rseed)
    X = rng.rand(N, 1) ** 2
    y = 10 - 1. / (X.ravel() + 0.1)
    if err > 0:
        y += err * rng.randn(N)
    return X, y

X, y = make_data(40)
param_grid = {'polynomialfeatures__degree': np.arange(21),
              'linearregression__fit_intercept': [True, False]}

# Grid search
grid = GridSearchCV(PolynomialRegression(), param_grid, cv=7)

# Call fit(), which also records the score at every grid point
grid.fit(X, y)

# Print the best parameters
print(grid.best_params_)

# Set the plot style
seaborn.set()

# Fit the data with the model that uses the best parameters
model = grid.best_estimator_

# Draw a scatter plot of the random data
plt.scatter(X.ravel(), y)

lim = plt.axis()
X_test = np.linspace(-0.1, 1.1, 500)[:, None]

# predict returns the predicted values (labels) from the trained model
y_test = model.fit(X, y).predict(X_test)
plt.plot(X_test.ravel(), y_test)
plt.axis(lim)`