Open KyrMitsos opened 3 years ago
Was mydata a tibble?
Hi Max,
No, it is a data.frame.
Ok. I'm not sure what I can do without a reproducible example. Can you provide one (hopefully via reprex())?
OK. Here you are:
library(C50)
library(partykit)
#> Loading required package: grid
#> Loading required package: libcoin
#> Loading required package: mvtnorm
# BASIC DATA
#prep data
mydata <- df.kosher[,c(indices.outcome, 39, 40)]
#> Error in eval(expr, envir, enclos): object 'df.kosher' not found
sets <- getTrainAndTestSamples(mydata)
#> Error in getTrainAndTestSamples(mydata): could not find function "getTrainAndTestSamples"
trainset <- sets$train
#> Error in eval(expr, envir, enclos): object 'sets' not found
testset <- sets$test
#> Error in eval(expr, envir, enclos): object 'sets' not found
mydata$RES_HALF <- as.factor(mydata$RES_HALF)
#> Error in is.factor(x): object 'mydata' not found
mydata$RES_FINAL <- as.factor(mydata$RES_FINAL)
#> Error in is.factor(x): object 'mydata' not found
# fit model
# fit <- C5.0(RES_FINAL~., data=mydata, trials=1)
fit <- C5.0(y=mydata$RES_FINAL, x=mydata[,-5], trials=1)
#> Error in C5.0(y = mydata$RES_FINAL, x = mydata[, -5], trials = 1): object 'mydata' not found
# summarize the fit
print(fit)
#> Error in print(fit): object 'fit' not found
# make predictions
predictions <- predict(fit, mydata[,-5])
#> Error in predict(fit, mydata[, -5]): object 'fit' not found
# summarize accuracy
confusionMatrix(as.factor(predictions), as.factor(mydata$RES_FINAL))
#> Error in confusionMatrix(as.factor(predictions), as.factor(mydata$RES_FINAL)): could not find function "confusionMatrix"
plot(fit)
#> Error in plot(fit): object 'fit' not found
Created on 2021-05-08 by the reprex package (v2.0.0)
As for 'mydata':
'data.frame': 86935 obs. of 5 variables:
 $ A        : num 1.5 2.4 3 3.45 1.57 5.5 2.05 2.05 2 3.65 ...
 $ B        : num 3.75 3.3 3 3.45 3.65 4.35 3.35 3.4 3.5 3.55 ...
 $ C        : num 4.55 2.3 2.05 1.75 4.25 1.35 2.75 2.75 2.75 1.7 ...
 $ RES_HALF : Factor w/ 3 levels "A","B","C": 2 2 2 3 1 3 3 1 2 3 ...
 $ RES_FINAL: Factor w/ 3 levels "A","B","C": 3 2 3 3 1 2 3 1 1 2 ...
Make sure you reproduce this kind of data.frame and use that: 5 variables, of which 3 are numeric and 2 are factors with 3 levels each. Make it random; it doesn't need any more structure or meaning than this. Thank you for attempting to fix this.
That's not really reproducible. I don't have the data associated with your issue.
Can you not simulate the same dataframe? Just create a random one. You need 3 numeric columns and 2 factors that share the same 3 levels.
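A data frame with that shape could be simulated along the following lines. This is only a sketch: the seed, the row count `n`, and the value ranges are assumptions chosen to resemble the `str()` output above, not values taken from the original data, and it has not been confirmed that random data triggers the same error.

```r
# Simulate a data frame with 3 numeric columns and 2 factors sharing 3 levels.
set.seed(1)              # arbitrary seed, for repeatability only
n  <- 1000               # assumption: far fewer rows than the original 86935
lv <- c("A", "B", "C")   # factor levels, as in the str() output

mydata <- data.frame(
  A = round(runif(n, 1.3, 6.0), 2),   # value ranges are guesses
  B = round(runif(n, 3.0, 4.5), 2),
  C = round(runif(n, 1.3, 5.0), 2),
  RES_HALF  = factor(sample(lv, n, replace = TRUE), levels = lv),
  RES_FINAL = factor(sample(lv, n, replace = TRUE), levels = lv)
)
str(mydata)

# Fit and plot only if C50 is installed; try() keeps the script running
# whether or not plot() fails on this particular random sample.
if (requireNamespace("C50", quietly = TRUE)) {
  fit <- C50::C5.0(x = mydata[, -5], y = mydata$RES_FINAL, trials = 1)
  print(fit)
  try(plot(fit))
}
```

Because the outcome is random noise, the fitted tree may be trivial; the point is only to have a self-contained script that can be pasted into reprex().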
I don't think the particular dataframe is at fault. I don't think it has any special properties. Otherwise, tell me and I can send you an .rd file with the workspace variables from my RStudio session. Thanks.
I can't hunt for data sets that I know will create the same error that you encountered.
I personally believe this is not down to a specific dataset. It can happen with any dataset that has, I guess, the same characteristics.
If what you say is true, then aren't you motivated to get to the bottom of this bug? Isn't that your goal? To provide a bug-free library to the people?
What can I do to help you resolve this bug?
I have the same problem. The model is fitted properly but I'm not able to plot it. If I try to plot it with the option rules="T" it produces this error:
Error: tree models only
And if I use rules="F" it says:
Error in FUN(X[[i]], ...) :
'list' object cannot be coerced to type 'double'
I'm using C5.0 v0.13.1 in R v4.02 on Windows 10.
Hi guys,
I have a dataframe of 4 predictors and one response variable. I am using your examples for the C50 library and its very useful plot function; however, I get this message when I run plot on the learned model
I traced through the code and the culprit is the function as.party() (line 121 in as.party.C5.0.R)
My data looks like this:
First four vars are the predictors and RES_B is the response.
When line 121 is called it calls lapply passing X and FUN with X = [1:13] and FUN =
function (i) {
    valpred <- integer(0)
    vec <- strsplit(out[i], ":")[[1]]
    vec <- vec[vec != ""]
    varp <- as.vector(sapply(adj.pred, function(j) {
        ind <- grep(paste0(j, " "), vec)
        if (length(ind) == 0) return(-1)
        return(ind)
    }))
    if (!any(varp > 0)) {
        stop("Variable match was not found.")
    }
    valpred <- as.vector(which(varp > 0))
    valpred <- valpred[which.max(nchar(adj.pred[valpred]))]
    a1 <- gsub(obj$pred[valpred], "", out[i])
    if (n.cat[valpred]) {
        if (length(grep(" in \\{", a1)) > 0) {
            vec <- a1
            while (length(grep("^in", vec)) == 0) {
                vec <- sub("^.", "", vec)
            }
            a2 <- sub("in \\{", "", vec)
            if (length(grep(":", a2)) > 0) {
                a2 <- strsplit(a2, "\\}:")
                if (length(a2) > 2) {
                    stop("The code currently does not work with factor levels or responses that have the symbol '}:' in them.")
                }
            } else {
                a2 <- sub("\\}$", "", a2)
            }
            a2 <- a2[[1]][1]
            a1 <- sub(a2, "X", vec)
            a2 <- paste0("{", a2, "}", collapse = "")
        } else {
            vec <- a1
            while (length(grep("^=", vec)) == 0) {
                vec <- sub("^.", "", vec)
            }
            a2 <- sub("^= ", "", vec)
            a2 <- strsplit(a2, ":")
            if (length(a2) > 2) {
                stop("The code currently does not work with factor levels or responses that have the symbol ':' in them.")
            }
            a2 <- a2[[1]][1]
            a1 <- sub(a2, "X", vec)
        }
    }
    a1 <- strsplit(a1, " ")[[1]]
    a1 <- gsub(":", "", a1)
    a1 <- gsub("\\.\\.\\.", "", a1)
    a1 <- a1[a1 != ""]
    if (n.cat[valpred]) {
        a1[2] <- a2
    }
    as.vector(c(adj.pred[valpred], a1))
}
This is as much digging as I am willing to perform at this point. Unfortunately I have been unable to find a parallel to this problem before. I hope you can reproduce the issue.
Thank you in advance for your efforts!
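One possible reading of the function quoted above, offered as a sketch and not a confirmed diagnosis: the inner sapply() returns a list instead of a plain vector whenever grep() matches more than one piece of a line for some predictor, and the comparison `varp > 0` then fails with a coercion error like the one reported in this thread. The values of `vec` and `adj.pred` below are made up for illustration; they are not taken from a real C5.0 model.

```r
# Hypothetical pieces of one tree line; both contain "A ",
# so grep() finds two matches for the predictor named "A".
vec <- c("A > 2 ", "RES_HALF = A ")
adj.pred <- c("A", "B")   # hypothetical predictor names

# Same pattern as in the quoted function.
varp <- as.vector(sapply(adj.pred, function(j) {
  ind <- grep(paste0(j, " "), vec)
  if (length(ind) == 0) return(-1)
  return(ind)
}))

# The results have lengths 2 and 1, so sapply() cannot simplify to a vector.
print(is.list(varp))

# Comparing a list that holds a length-2 element against a number fails.
res <- tryCatch(any(varp > 0), error = function(e) e)
print(conditionMessage(res))
```

If this is what happens in practice, predictor names that also appear as factor levels (such as "A", "B", "C" here) would be one way to end up with multiple matches, but that remains an assumption until checked against a real model.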