caret (Classification And Regression Training) R package that contains misc functions for training and plotting classification and regression models
Compatibility of write_csv/read_csv vs write.csv/read.csv and perhaps broader tidyverse compatibility with caret learning_curve_dat #1333
Open
BenJCQuah opened 1 year ago
Hi, Thanks for your great caret package.
I am new to ML code and R in general. I would like to use a learning curve in part of my pipeline using learning_curve_dat().
I am finding that if I use tidyverse code to pipe or manipulate data, learning_curve_dat() seems to fail.
As an example, please see the write_csv/read_csv vs write.csv/read.csv versions below.
The first code section, using write_csv/read_csv, doesn't work (at least for me),
but the second, identical apart from the read and write calls, does work with write.csv/read.csv.
I am also finding that other tidyverse code seems to fail too (please let me know if additional examples are required).
Is this a known issue, or am I doing something else wrong?
Thanks!
Ben
library(caret)
library(pander)
library(pastecs)
library(catboost)
library(randomForest)
library(dplyr)
library(tidyverse)
USING write_csv/read_csv
set.seed(1412)
class_dat <- twoClassSim(1000)

write_csv(class_dat, "class_dat.csv")
class_data <- read_csv("class_dat.csv")
class_data$Class <- factor(as.character(class_data$Class))
levels(class_data$Class)

sapply(class_data, class)

set.seed(29510)
rf_data <- learning_curve_dat(dat = class_data, outcome = "Class", test_prop = 1/4,
                              ## train arguments
                              )
This is the error I get
Error in createDataPartition(dat[, outcome], p = 1 - test_prop, list = FALSE) : y must have at least 2 data points
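A likely explanation, offered as a sketch rather than a confirmed diagnosis: read_csv() returns a tibble, and the error message shows learning_curve_dat() calling createDataPartition(dat[, outcome], ...). Selecting one column with `[` drops to a vector for a base data frame but stays a one-column tibble for a tibble, so its length is 1 and a "at least 2 data points" check would fail. The toy data below is hypothetical and only illustrates the subsetting difference:

```r
library(tibble)

# Hypothetical toy data, not the poster's twoClassSim() output
df  <- data.frame(Class = factor(c("a", "b", "a")), x = 1:3)
tbl <- as_tibble(df)

# Base data frame: single-column selection drops to a vector of 3 values
n_df <- length(df[, "Class"])

# Tibble: single-column selection stays a tibble, so length() counts columns
n_tbl <- length(tbl[, "Class"])

print(c(data.frame = n_df, tibble = n_tbl))
```

If the two lengths differ on your machine as well, the tibble class of the input is the probable cause rather than the CSV contents.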
USING write.csv/read.csv
set.seed(1412)
class_dat <- twoClassSim(1000)

write.csv(class_dat, "class_dat.csv")
class_data <- read.csv("class_dat.csv")
class_data$Class <- factor(as.character(class_data$Class))
levels(class_data$Class)

sapply(class_data, class)

set.seed(510)
rf_data <- learning_curve_dat(dat = class_data, outcome = "Class", test_prop = 1/4,
                              ## train arguments
                              )
RUNS FINE
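One possible workaround, assuming the difference between the two runs is that read_csv() returns a tibble while read.csv() returns a plain data frame: convert the result with as.data.frame() before handing it to learning_curve_dat(). The file and column values below are hypothetical stand-ins so the snippet runs without caret; the final learning_curve_dat() call is left out because its train arguments are not shown in the post.

```r
library(readr)

# Hypothetical stand-in data written to a temporary file
path <- tempfile(fileext = ".csv")
write_csv(data.frame(Class = c("Class1", "Class2", "Class1"),
                     x = c(0.1, 0.2, 0.3)), path)

# read_csv() returns a tibble; convert it to a base data frame
class_data <- read_csv(path, show_col_types = FALSE)
class_data <- as.data.frame(class_data)
class_data$Class <- factor(class_data$Class)

# Single-column selection now drops to a factor vector with one entry per row
outcome_len <- length(class_data[, "Class"])
print(outcome_len)
```

With the data in this form, class_data[, "Class"] has one element per row, which is what the createDataPartition() call in the error message appears to expect.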