The preprocess method has an "omit.na" parameter that should take care of NULLs, but we have to run dropna on the output to get the code to work properly.
# omit.na should take care of NA processing
df_trans <- r4ml.ml.preprocess(
df, transformPath="/tmp",
recodeAttrs=c("UniqueCarrier", "TailNum", "Origin", "Dest"),
omit.na=c("UniqueCarrier", "TailNum", "Origin", "Dest"))
# split the dataset into train and test sets
samples <- r4ml.sample(df_trans$data, perc=c(0.7, 0.3), seed=0)
train <- samples[[1]]
ignore <- cache(train)
test <- samples[[2]]
ignore <- cache(test)
# train the GLM model (by default it is binomial)
train_m <- as.r4ml.matrix(train)
## dropna should not be necessary here!
glm <- r4ml.glm(DepTime ~ ., as.r4ml.matrix(dropna(train_m)))