Open webzest opened 4 years ago
You should remove the column names from the test matrix (`dtest`) by running `colnames(dtest) <- NULL`.
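Dropping the names removes the name check, but it does not confirm that the test columns are in the same order as the training columns. A minimal base-R sketch of an alternative, assuming the feature matrices still carry column names before they are converted to `xgb.DMatrix` (`train_mat` and `test_mat` are made-up toy matrices, not the Kaggle data):

```r
# Toy matrices standing in for the real feature matrices (hypothetical data).
train_mat <- matrix(1:6, nrow = 2,
                    dimnames = list(NULL, c("LotArea", "YearBuilt", "GrLivArea")))
test_mat  <- matrix(7:12, nrow = 2,
                    dimnames = list(NULL, c("GrLivArea", "LotArea", "YearBuilt")))

# Columns present in one matrix but not the other; both should be empty.
missing_in_test  <- setdiff(colnames(train_mat), colnames(test_mat))
missing_in_train <- setdiff(colnames(test_mat), colnames(train_mat))

# Reorder the test columns to the training order instead of dropping names.
test_aligned <- test_mat[, colnames(train_mat), drop = FALSE]

print(identical(colnames(test_aligned), colnames(train_mat)))
```

If either `setdiff()` result is non-empty, the two matrices do not describe the same features, and removing the names would only hide that.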
Hi, thank you for the response. I followed it, but something strange happened: all the predictions are the same.
## [1] 10.41415 10.41415 10.41415 10.41415 10.41415 10.41415 10.41415 10.41415
## [9] 10.41415 10.41415 10.41415 10.41415 10.41415 10.41415 10.41415 10.41415
## [17] 10.41415 10.41415 10.41415 10.41415 10.41415 10.41415 10.41415 10.41415
## [25] 10.41415 10.41415 10.41415 10.41415 10.41415 10.41415 10.41415 10.41415
Hi, I have an update: I fixed the colnames issue by updating this block of code. I also corrected it above...
x_train <<- processed[1:1460,]
x_train <<- cbind(x_train, SalePrice = y_train)
#print(names(x_train))
#print("Updated x_test column names to match x_train")
x_test <<- processed[1461:2919,]
x_test <<- cbind(x_test, SalePrice = 0)
#print(names(x_test))
Nevertheless, predicting on the unknown data still generates the same value for every row.
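One thing worth checking when every prediction is identical is whether the test features actually vary from row to row. The following is a rough diagnostic sketch in base R; `constant_columns()` is a hypothetical helper and `toy_test` is made-up data, not the real `x_test`:

```r
# Hypothetical helper: names of columns whose non-missing values never vary.
constant_columns <- function(m) {
  is_const <- apply(m, 2, function(col) {
    vals <- unique(col[!is.na(col)])
    length(vals) <= 1
  })
  colnames(m)[is_const]
}

# Toy test matrix: one varying column, one placeholder target, one all-NA column.
toy_test <- cbind(LotArea   = c(8450, 9600, 11250),
                  SalePrice = c(0, 0, 0),      # placeholder target, as in x_test
                  Garage    = c(NA, NA, NA))

print(constant_columns(toy_test))
```

The thread does not show how `dtrain` and `dtest` are built from `x_train` and `x_test`. If the `SalePrice` column is left among the features, the model could lean on it during training and then see a constant 0 for every test row, which might explain a single repeated prediction. This is an assumption to verify, not a confirmed diagnosis.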
### Run Predictions
```{r}
XGBpred <- expm1(predict(xgb_mod, dtrain))
print(head(XGBpred, 5))
# Results: [1] 184258.9 160477.3 196703.9 124443.1 219499.4

XGBactual <- expm1(predict(xgb_mod, dtest))
print(head(XGBactual, 5))
# Results: [1] 33327.03 33327.03 33327.03 33327.03 33327.03
```
Adding `-1` to the model formula solved this problem for me. That suggests the intercept, or an expected intercept column in the data, may be contributing to the problem.
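To illustrate what `-1` changes, here is a small base-R sketch using `model.matrix()`. It assumes the design matrix is built from a formula, which the thread does not confirm, and the `df` data frame is made-up toy data:

```r
# Toy data frame (hypothetical values) with one numeric and one factor column.
df <- data.frame(SalePrice = c(200000, 150000, 180000),
                 LotArea   = c(8450, 9600, 11250),
                 Street    = factor(c("Pave", "Grvl", "Pave")))

# Default formula: the design matrix includes an "(Intercept)" column.
with_intercept <- model.matrix(SalePrice ~ ., data = df)

# Adding -1 to the formula drops the intercept column.
no_intercept <- model.matrix(SalePrice ~ . - 1, data = df)

print(colnames(with_intercept))
print(colnames(no_intercept))
```

If the training and test matrices are built with different formulas, one can end up with an `(Intercept)` column that the other lacks, so the same formula should be used for both.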
I am trying to build a model to predict housing prices in R (version 4.0.2, 2020-06-22), with the latest updates. The code runs without errors until I call the `predict` function on the unknown data to generate predictions.
My data came from the Kaggle Housing Competition website: [Kaggle Data Source](https://www.kaggle.com/c/house-prices-advanced-regression-techniques/data)
My XGBoost section:
The `xgb.train` call ran fine. Predicting on the same `x_train` data also works. However, it fails when I try to use the `dtest` data set:
Would you please advise on what I would need to do to proceed without errors?