Open wangbaili opened 2 years ago
Hi, could you provide us with the column names of your es5 dataset?
What would help even more would be a minimal reproducible code example that we can actually run, i.e. including all the data that is being used.
My assumption is that the "encode" PipeOp creates a column that is named ..row_id, which confuses mlr3 since it is in some way a reserved column name.
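A quick way to answer the column-name question and to test the reserved-name assumption is sketched below in base R. The data frame here is a hypothetical stand-in for the confidential dataset, and the name "..row_id" is taken from the comment above rather than verified against mlr3's source.

```r
# Hypothetical toy data standing in for the real (confidential) dataset.
aa <- data.frame(
  time   = c(5, 8, 3),
  status = c(1, 0, 1),
  stage  = factor(c("I", "II", "I"))
)

# Column names, as requested in the comment above.
print(names(aa))

# Check whether any column collides with the name assumed to be reserved.
reserved <- "..row_id"  # assumption: taken from the comment, not from mlr3 docs
has_reserved <- reserved %in% names(aa)
print(has_reserved)
```

If the check reports TRUE on the real data, renaming that column before building the task would be a cheap way to rule the assumption in or out.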
Sorry for the late reply; I had to deal with some health problems. The data is confidential, but I get the same encode problem with this data: aa1.xlsx. All columns in this data are factors except the survival columns (event = "status", time = "time"). Thanks again; I look forward to your suggestions.
This is the code:
aa <- read_excel("C:/Users/LENOVO/Desktop/aa/aa1.xlsx")
names(aa)
aa[, 3:13] <- lapply(aa[, 3:13], function(x) as.factor(as.character(x)))
taskwork <- TaskSurv$new("taskwork", aa, time = "time", event = "status")
learners <- lrns(
  paste0("surv.", c("coxtime", "deephit", "deepsurv", "loghaz", "pchazard")),
  frac = 0.3, early_stopping = TRUE, epochs = 10, optimizer = "adam"
)
create_pipeops <- function(learner) {
  po("encode", method = "treatment") %>>% po("learner", learner)
}
learners <- lapply(learners, create_pipeops)
resampling <- rsmp("bootstrap", ratio = 0.6, repeats = 10)
design <- benchmark_grid(taskwork, learners, resampling)
bm <- benchmark(design)
This is the error:
Error in as_data_backend.data.frame(data, primary_key = row_ids) : Assertion on 'primary_key' failed: Contains duplicated values, position 2. This happened PipeOp encode's $train()
Thanks! Apparently the problem is that bootstrapping draws some rows more than once, which conflicts with mlr3's assumption that row_ids are unique.
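The repeated rows can be illustrated without mlr3 at all. The following is a minimal base-R sketch of a single bootstrap draw, not the code mlr3 actually runs; the sample size of 150 is just an assumption matching the size of iris.

```r
# Base-R sketch of one bootstrap draw; no mlr3 involved.
set.seed(1)
n <- 150                 # assumed number of rows, e.g. iris
row_ids <- seq_len(n)

# Bootstrapping samples row ids WITH replacement.
train_ids <- sample(row_ids, size = n, replace = TRUE)

# anyDuplicated() returns the position of the first repeated id, or 0 if none.
first_dup <- anyDuplicated(train_ids)
n_unique  <- length(unique(train_ids))

cat("position of first duplicated id:", first_dup, "\n")
cat("distinct rows drawn:", n_unique, "of", n, "\n")
```

Any step that later treats train_ids as a unique key, as the primary_key assertion in the error message does, will reject such a vector.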
Minimal example:
library("mlr3")
library("mlr3pipelines")
options(mlr3.debug=TRUE)
resample(tsk("iris"), po("pca") %>>% lrn("classif.featureless"), rsmp("bootstrap"))
I will try to take care of this soon. Until then, a workaround would be to use a different resampling method (e.g. rsmp("cv") instead of rsmp("bootstrap")).
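A sketch of that workaround, reusing the graph from the minimal example: the choice of 3 folds is arbitrary, and rsmp("subsampling") is shown as a further option that is not mentioned in the thread. It is assumed here to be a closer substitute for a bootstrap with ratio = 0.6, because it draws a fraction of the rows without replacement.

```r
library("mlr3")
library("mlr3pipelines")

# Wrap the graph from the minimal example as a learner.
graph_learner <- as_learner(po("pca") %>>% lrn("classif.featureless"))

# Cross-validation: each row appears at most once in a training set.
rr_cv <- resample(tsk("iris"), graph_learner, rsmp("cv", folds = 3))

# Subsampling: a fraction of the rows, drawn without replacement.
rr_sub <- resample(
  tsk("iris"), graph_learner,
  rsmp("subsampling", ratio = 0.6, repeats = 10)
)

# Aggregated classification error for both resampling schemes.
print(rr_cv$aggregate(msr("classif.ce")))
print(rr_sub$aggregate(msr("classif.ce")))
```

In the benchmark from the report, the same swap would amount to replacing the rsmp("bootstrap", ...) line with one of these resamplings; whether that is statistically acceptable depends on the analysis.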
Hello!
Thank you again for the R implementation of mlr3.
I want to apply the "encode" and "scale" PipeOps to survival models (deepsurv), but I ran into some trouble. This is my code:
When I ran bm, I got this error:
I don't understand this.
Thanks again; I look forward to your suggestions.