I typed the code exactly as it appears on p. 265 of 이것이 데이터 분석이다, but I keep getting an error :(

Error -> KeyError: "['month', 'phone_model', 'maker'] not in index"

The code I typed is below. How can I fix this? (Everything before this code runs without problems.)

from sklearn.model_selection import train_test_split
from sklearn.feature_extraction import DictVectorizer
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.metrics import mean_squared_error

# Split the data into training and test sets

df = df[['price', 'phone_model', 'factory_price', 'maker', 'price_index', 'month']]
df = pd.get_dummies(df, columns=['phone_model', 'maker', 'month'])
X = df.loc[:, df.columns != 'price']
y = df['price']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

# Train the random forest model

forest = RandomForestRegressor(n_estimators=1000, criterion='mse')
forest.fit(X_train, y_train)
y_train_pred = forest.predict(X_train)
y_test_pred = forest.predict(X_test)

# Evaluate the trained model

print('MSE train: %.3f, test: %.3f' % (
    mean_squared_error(y_train, y_train_pred),
    mean_squared_error(y_test, y_test_pred)))
print('R^2 train: %.3f, test: %.3f' % (
    r2_score(y_train, y_train_pred),
    r2_score(y_test, y_test_pred)))
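One possible cause, offered as a guess and not a confirmed diagnosis: the three columns named in the KeyError are exactly the ones passed to pd.get_dummies, which replaces them with one-hot columns. If the cell is run a second time on the same df, the first line can no longer find them. Another possibility is that an earlier step that creates those columns was skipped. The sketch below reproduces the first case on made-up data (the `raw` frame and its values are hypothetical, not the book's dataset) and shows one way to check which columns are missing before selecting.

```python
import pandas as pd

# Hypothetical toy data standing in for the book's DataFrame.
raw = pd.DataFrame({
    "price": [100, 200, 150],
    "phone_model": ["a", "b", "a"],
    "factory_price": [300, 400, 300],
    "maker": ["x", "y", "x"],
    "price_index": [1.0, 1.1, 1.0],
    "month": ["2017-01", "2017-02", "2017-01"],
})

wanted = ["price", "phone_model", "factory_price", "maker", "price_index", "month"]
categorical = ["phone_model", "maker", "month"]

# First run: selection and one-hot encoding both succeed.
df = raw[wanted]
df = pd.get_dummies(df, columns=categorical)

# Second run on the already-encoded df: the categorical columns are gone,
# so selecting them by name raises KeyError.
try:
    df[wanted]
    error_message = None
except KeyError as exc:
    error_message = str(exc)

# Check which requested columns are absent before selecting.
missing = [c for c in wanted if c not in df.columns]
print("missing columns:", missing)

# Assumption: keeping the original frame untouched and encoding into a
# new variable makes the cell safe to re-run.
encoded = pd.get_dummies(raw[wanted].copy(), columns=categorical)
```

If `missing` comes back empty on the real df right before the failing line, this explanation does not apply and the earlier preprocessing steps are the next place to look.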

I have a question, 기태님~~ #2