dataprofessor / rshiny_freecodecamp

Play-Golf Error #2

Open tyler-mcb opened 2 years ago

tyler-mcb commented 2 years ago

It seems that the categorical columns, outlook and play, are being read in as chr variables instead of the factors seen in your data. This causes an error that I cannot solve:

model <- randomForest(play ~ ., data = weather2, ntree = 500, mtry = 4, importance = TRUE)
Error in y - ymean : non-numeric argument to binary operator
In addition: Warning messages:
1: In randomForest.default(m, y, ...) :
  The response has five or fewer unique values.  Are you sure you want to do regression?
2: In mean.default(y) : argument is not numeric or logical: returning NA

immenseforest commented 1 year ago

@tyler-mcb @dataprofessor

Please check pull request #3:

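Since R 4.0.0, read.csv no longer automatically converts character columns to factors. The play column therefore stays a character vector, randomForest attempts regression on it, and the call fails with the following warnings and error: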

"Warning in randomForest.default(m, y, ...) :
  The response has five or fewer unique values.  Are you sure you want to do regression?
Warning in mean.default(y) : argument is not numeric or logical: returning NA
Error in y - ymean : non-numeric argument to binary operator"

Adding stringsAsFactors = TRUE to the read.csv call restores the legacy behaviour of auto-converting character columns to factors, and the app then loads properly.
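As a minimal sketch of the difference, assuming R 4.0.0 or later: the inline CSV below is a hypothetical four-row sample standing in for the weather file the app reads, and the names csv_text, weather_chr and weather_fct are made up for illustration.

```r
# Hypothetical inline sample standing in for the weather CSV used by the app
csv_text <- "outlook,temperature,humidity,windy,play
sunny,85,85,FALSE,no
overcast,83,86,FALSE,yes
rainy,70,96,FALSE,yes
sunny,72,95,FALSE,no"

# Default in R >= 4.0.0: text columns are kept as character vectors
weather_chr <- read.csv(text = csv_text)

# Legacy behaviour: text columns are converted to factors on read
weather_fct <- read.csv(text = csv_text, stringsAsFactors = TRUE)

# Compare the class of the response column in both data frames
print(class(weather_chr$play))
print(class(weather_fct$play))
```

With play stored as a factor, randomForest(play ~ ., ...) should fit a classification model rather than attempting regression.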

Here is an article (https://blog.r-project.org/2020/02/16/stringsasfactors/) by Kurt Hornik explaining the reasoning behind R's decision to change this default behaviour.

Sourabhmehta-AIM commented 1 year ago

@tyler-mcb You could also use a label encoder, such as an ordinal encoder; see the code below. encode_ordinal is a custom function that maps each distinct value of x to an integer code by way of factor(). The final as.factor call then turns the encoded play column into a factor.

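# Map each distinct value to an integer code, in order of first appearance
encode_ordinal <- function(x, order = unique(x)) {
  # exclude = NULL keeps NA as its own level instead of dropping it
  x <- as.numeric(factor(x, levels = order, exclude = NULL))
  x
}
# Encode play as integer codes, then store it as a factor so that
# randomForest treats the response as a class label
weather[["play"]] <- encode_ordinal(weather[["play"]])
weather[["play"]] <- as.factor(weather[["play"]])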
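If the original labels should be kept rather than replaced by integer codes, one possible alternative is to convert every character column to a factor after the data has been read. This is only a sketch: the small data frame below is a hypothetical stand-in for the weather data, and chr_cols is a made-up helper name.

```r
# Hypothetical stand-in for the weather data, with text columns as character
weather <- data.frame(
  outlook  = c("sunny", "overcast", "rainy"),
  humidity = c(85, 86, 96),
  play     = c("no", "yes", "yes"),
  stringsAsFactors = FALSE
)

# Flag the character columns, then convert only those to factors
chr_cols <- vapply(weather, is.character, logical(1))
weather[chr_cols] <- lapply(weather[chr_cols], factor)

# Inspect the resulting column types
str(weather)
```

This keeps "yes" and "no" as the level names of play, so the classes in the fitted model stay readable.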