Open DarioS opened 2 years ago
For comparison, mlr3 prevents the user from performing classification unless the column names are syntactically valid.
> library(mlr3)
> colnames(iris) <- gsub("\\.", ' ', colnames(iris))
> head(iris)
  Sepal Length Sepal Width Petal Length Petal Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
6          5.4         3.9          1.7         0.4  setosa
> task <- as_task_classif(iris, "Species", id = "irises")
Error in .__Task__initialize(self = self, private = private, super = super, :
Assertion on 'column names' failed: Must have names according to R's variable naming conventions, but element 1 does not comply.
It is a take-no-prisoners approach.
I would like to see a paragraph in the book about how weird variable names should be handled, perhaps in Chapter 18. For example, in a real bioinformatics data set, I have names like Cer(d16:1/20:0) and Sph(d18:2).
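To make the request concrete, here is a minimal sketch of one possible workaround, not necessarily what the book should recommend. It uses base R's `make.names()` to derive syntactically valid names and keeps a lookup vector so the original lipid names can be restored for reporting. The data frame `lipids`, its values, and the `class` column are made up for illustration.

```r
# Original feature names, taken from the example above.
original <- c("Cer(d16:1/20:0)", "Sph(d18:2)")

# Hypothetical toy data; check.names = FALSE keeps the names untouched.
lipids <- data.frame(
  c(1.2, 3.4, 2.2, 2.9),
  c(0.5, 0.7, 0.1, 0.3),
  class = factor(c("case", "control", "case", "control")),
  check.names = FALSE
)
colnames(lipids)[1:2] <- original

# make.names() replaces invalid characters with "."; unique = TRUE
# guards against two different originals collapsing to the same name.
safe <- make.names(original, unique = TRUE)

# Named vector mapping each safe name back to its original name.
name_map <- setNames(original, safe)

# Use the safe names for modelling ...
colnames(lipids)[1:2] <- safe

# ... and translate back when presenting results.
restored <- unname(name_map[colnames(lipids)[1:2]])
```

With the renamed columns, a call such as `as_task_classif(lipids, "class")` should no longer trip the column-name assertion, though I have not checked this against every mlr3 version.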