Open dandriopoulos opened 6 years ago
I am trying to use the XGBoost algorithm to model IMDb ratings. This is the code I used:
imdb_data_xg <- read.csv(file.choose())
imdb_data_xg_label <- read.csv(file.choose())
imdb_data_xg=imdb_data_xg[1:6]
imdbXG=as.matrix(imdb_data_xg)
imdbLABEL=as.matrix(imdb_data_xg_label)
set.seed(77850) # set a random number generation seed so that the split is the same every time
inTrain3 <- createDataPartition(y = imdb_data_xg$Revenue_MM, p = 0.9, list = FALSE)
training3 <- imdbXG[inTrain3, ]
testing3 <- imdbXG[-inTrain3, ]
str(training3)
dtrain <- xgb.DMatrix(data = training3, label = imdbLABEL)
bst <- xgboost(data = training3, label = imdbLABEL, max.depth = 10, eta = 1, nthread = 2, nround = 10, objective = "binary:logistic")
The error that I get is:
dtrain <- xgb.DMatrix(data = training3, label = imdbLABEL)
Error in setinfo.xgb.DMatrix(dmat, names(p), p[[1]]) : The length of labels must equal to the number of rows in the input data
Any input?
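A likely cause, judging only from the code above: training3 holds about 90% of the rows, while imdbLABEL still holds a label for every row, so the two lengths differ. The sketch below shows the usual fix, which is to subset the labels with the same row indices as the features. It uses synthetic data and base R's sample() in place of the CSV files and createDataPartition(); the names features, labels, in_train, train_x and train_y are hypothetical and not from the post.

```r
library(xgboost)

set.seed(77850)

# Synthetic stand-in for the two CSV files (hypothetical data, 100 rows).
features <- matrix(rnorm(100 * 6), nrow = 100, ncol = 6)
labels   <- runif(100, min = 1, max = 10)  # one rating per feature row

# 90/10 split by row index (sample() is used here instead of createDataPartition()).
in_train <- sample(nrow(features), size = round(0.9 * nrow(features)))

train_x <- features[in_train, ]
train_y <- labels[in_train]    # subset the labels with the SAME indices
test_x  <- features[-in_train, ]
test_y  <- labels[-in_train]

# The check that xgb.DMatrix enforces: one label per training row.
stopifnot(nrow(train_x) == length(train_y))

dtrain <- xgb.DMatrix(data = train_x, label = train_y)
```

Applied to the code in the post, this would mean passing something like imdbLABEL[inTrain3, ] as the label, assuming both CSV files have the same number of rows in the same order. Separately, objective = "binary:logistic" expects labels between 0 and 1, so a regression objective is probably a better fit for ratings.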