dmlc / xgboost

Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on single machine, Hadoop, Spark, Dask, Flink and DataFlow
https://xgboost.readthedocs.io/en/stable/
Apache License 2.0

Predict xgboost model onto raster stack yields error #8426

Open bappa10085 opened 1 year ago

bappa10085 commented 1 year ago

I am trying to use an xgboost model to predict on raster data. Here is a minimal, reproducible, self-contained example:

library(xgboost)
library(raster)

# create a RasterStack or RasterBrick with a set of predictor layers
logo <- brick(system.file("external/rlogo.grd", package="raster"))
names(logo)

# known presence and absence points
p <- matrix(c(48, 48, 48, 53, 50, 46, 54, 70, 84, 85, 74, 84, 95, 85,
              66, 42, 26, 4, 19, 17, 7, 14, 26, 29, 39, 45, 51, 56, 46, 38, 31,
              22, 34, 60, 70, 73, 63, 46, 43, 28), ncol=2)
a <- matrix(c(22, 33, 64, 85, 92, 94, 59, 27, 30, 64, 60, 33, 31, 9,
              99, 67, 15, 5, 4, 30, 8, 37, 42, 27, 19, 69, 60, 73, 3, 5, 21,
              37, 52, 70, 74, 9, 13, 4, 17, 47), ncol=2)

# extract values for points
xy <- rbind(cbind(1, p), cbind(0, a))
v <- data.frame(cbind(pa=xy[,1], extract(logo, xy[,2:3])))

xgb <- xgboost(data = data.matrix(subset(v, select = -c(pa))), label = v$pa, 
               nrounds = 5)

raster::predict(model = xgb, logo)

It returns the following error:

Error in xgb.DMatrix(newdata, missing = missing) : xgb.DMatrix does not support construction from list

How can I get rid of this error?

trivialfis commented 1 year ago

I'm not familiar with raster, but is there any chance the input data can be converted to another structure, such as a matrix or data.table?
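A minimal sketch of that suggestion, staying within the raster package: read all cell values into a plain numeric matrix, predict on that matrix, and write the predictions back into an empty layer with the same geometry. The error above most likely arises because the raster values reach xgboost as a data.frame (which R treats as a list) rather than a matrix. The names `train`, `vals`, `pred` and `out` are illustrative, and `xgb.train()` with an explicit `xgb.DMatrix` is swapped in for the `xgboost()` call from the question as an assumption, to avoid depending on one version of that interface. This loads the whole raster into memory, so it is only reasonable for small rasters.

```r
library(raster)
library(xgboost)

# predictor layers (red, green, blue) shipped with the raster package
logo <- brick(system.file("external/rlogo.grd", package = "raster"))

# presence (p) and absence (a) points from the example in the question
p <- matrix(c(48, 48, 48, 53, 50, 46, 54, 70, 84, 85, 74, 84, 95, 85,
              66, 42, 26, 4, 19, 17, 7, 14, 26, 29, 39, 45, 51, 56, 46, 38, 31,
              22, 34, 60, 70, 73, 63, 46, 43, 28), ncol = 2)
a <- matrix(c(22, 33, 64, 85, 92, 94, 59, 27, 30, 64, 60, 33, 31, 9,
              99, 67, 15, 5, 4, 30, 8, 37, 42, 27, 19, 69, 60, 73, 3, 5, 21,
              37, 52, 70, 74, 9, 13, 4, 17, 47), ncol = 2)
xy <- rbind(cbind(1, p), cbind(0, a))

# layer values at the points: one row per point, one column per layer
train <- as.matrix(extract(logo, xy[, 2:3]))

# assumption: xgb.train() replaces the xgboost() call used in the question
dtrain <- xgb.DMatrix(train, label = xy[, 1])
xgb <- xgb.train(params = list(), data = dtrain, nrounds = 5)

# all cell values as a numeric matrix (cells in rows, layers in columns)
vals <- getValues(logo)

# predict on the matrix instead of handing the raster object to xgboost
pred <- predict(xgb, newdata = vals)

# raster(logo) gives an empty layer with the same extent and resolution
out <- setValues(raster(logo), pred)
```

`plot(out)` should then display the predicted surface. For rasters too large to hold in memory, a block-wise approach would be needed instead.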

bappa10085 commented 1 year ago

The developer of the raster and terra packages has provided the following solution:

library(terra)
library(xgboost)
logo <- rast(system.file("ex/logo.tif", package="terra"))
p <- matrix(c(48, 48, 48, 53, 50, 46, 54, 70, 84, 85, 74, 84, 95, 85,
              66, 42, 26, 4, 19, 17, 7, 14, 26, 29, 39, 45, 51, 56, 46, 38, 31,
              22, 34, 60, 70, 73, 63, 46, 43, 28), ncol=2)
a <- matrix(c(22, 33, 64, 85, 92, 94, 59, 27, 30, 64, 60, 33, 31, 9,
              99, 67, 15, 5, 4, 30, 8, 37, 42, 27, 19, 69, 60, 73, 3, 5, 21,
              37, 52, 70, 74, 9, 13, 4, 17, 47), ncol=2)
xy <- rbind(cbind(1, p), cbind(0, a))
v <- extract(logo, xy[,2:3])
xgb <- xgboost(data = data.matrix(v), label=xy[,1], nrounds = 5)

Now we can write a prediction function that first coerces the data.frame of "new data" to a matrix. We can then use that function with predict<SpatRaster>:

xgbpred <- function(model, data, ...) {
    predict(model, newdata=as.matrix(data), ...)
}

p <- predict(logo, model=xgb, fun=xgbpred)
plot(p)