jurrr / modelplotr

Evaluation error: object 'ntl_0' not found #20

Open rknimmakayala opened 5 years ago

rknimmakayala commented 5 years ago

Hello, I am running this code and I see the error shown above:

library(modelplotr)

scores_and_ntiles <- prepare_scores_and_ntiles(datasets=list("seen","unseen"), dataset_labels = list("train data","test data"), models = "xgb", model_labels = "xgboost", target_column="y" )

plot_input <- plotting_scope(prepared_input = scores_and_ntiles, select_model_label = "xgboost", select_dataset_label = "test data")

jurrr commented 5 years ago

Hi rknimmakayala, could you try rerunning with both models and model labels in a list, hence:

models = list("xgb"), model_labels = list("xgboost"),

If this does not help, please include some more info like head(scores_and_ntiles) or provide error message from aggregate_over_ntiles(scores_and_ntiles)

rknimmakayala commented 5 years ago

I ran with only 1 model now.

scores_and_ntiles <- prepare_scores_and_ntiles(datasets=list("seen","unseen"), dataset_labels = list("train data","test data"), models = "xgb", model_labels = "xgboost", target_column="y" )

plot_input <- plotting_scope(prepared_input = scores_and_ntiles, select_model_label = "xgboost", select_dataset_label = "test data")

head(scores_and_ntiles)

  model_label dataset_label y_true   prob_p0   prob_p1 ntl_p0 ntl_p1
1     xgboost    train data      0 0.3865758 0.6134242      9      2
2     xgboost    train data      0 0.6179351 0.3820650      6      5
3     xgboost    train data      1 0.5497741 0.4502260      7      4
4     xgboost    train data      0 0.2749349 0.7250651     10      1
5     xgboost    train data      0 0.1463084 0.8536916     10      1
6     xgboost    train data      0 0.5573502 0.4426498      7      4

jurrr commented 5 years ago

The p in ntl_p0 and ntl_p1 is unexpected. Some extra checks to pinpoint the issue:

head(seen)
head(unseen)
unique(seen$y)
levels(seen$y)
unique(unseen$y)
levels(unseen$y)

jurrr commented 5 years ago

Hi rknimmakayala, could you provide the details from extra checks above? What version of modelplotr are you using and what model wrapper - caret?

rknimmakayala commented 5 years ago

Apologies for the delayed response. Here it is:

The dataset has 128 variables, so I am leaving out head(seen) and head(unseen) for the moment.
The modelplotr version is 1.0.0. The model wrapper is h2o, version 3.26.0.6.
levels(seen$y) and levels(unseen$y) both return "0" and "1".

jurrr commented 5 years ago

OK, thanks for the extra info. The issue is that when h2o is used with a target variable that has numeric values (here the levels "0" and "1"), h2o puts a 'p' in front of the value in the column names of the prediction result (p0, p1), whereas it does not do this when the target values are character labels.

(For details, run this: colnames(h2o.predict(xgb, train)[, -1]))
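
The naming mismatch can be reproduced without h2o. Below is a minimal base-R sketch, assuming (based only on the error message and the head() output earlier in this thread, not on modelplotr's source) that the ntile columns are looked up as "ntl_" followed by the target level. The toy data frame and the variable names `scores`, `expected`, `actual` and `missing_cols` are hypothetical.

```r
# Toy stand-in for the first rows of scores_and_ntiles posted in this thread.
scores <- data.frame(
  y_true  = factor(c(0, 0, 1)),
  prob_p0 = c(0.3865758, 0.6179351, 0.5497741),
  prob_p1 = c(0.6134242, 0.3820650, 0.4502260),
  ntl_p0  = c(9, 6, 7),
  ntl_p1  = c(2, 5, 4)
)

# Assumption: downstream code expects one ntile column per target level,
# named "ntl_<level>" (this matches the 'ntl_0' in the error message).
expected <- paste0("ntl_", levels(scores$y_true))

# Ntile columns that are actually present in the data frame.
actual <- grep("^ntl_", names(scores), value = TRUE)

# Expected columns that are absent; a non-empty result signals the mismatch.
missing_cols <- setdiff(expected, actual)
print(missing_cols)
```

Running the same two lookups on the real scores_and_ntiles object should show quickly whether the 'p' prefix is the cause of the error.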

We'll cover this in modelplotr as soon as possible. For now, you can work around the issue by adding an X in front of the numeric value, e.g.:

library(dplyr)  # provides %>% and mutate()
train <- train %>% mutate(y = as.factor(paste0('X', y)))

Hope this works for you, thanks for sharing your issue.
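
In this thread the datasets are named seen and unseen rather than train, so the relabelling would have to be applied to both of them. The following is a base-R sketch of that step on toy data; the helper `prefix_target` and the toy data frames are hypothetical and not part of modelplotr or h2o. It is presumably also necessary to refit the model on the relabelled data before calling prepare_scores_and_ntiles again, since the prediction column names derive from the labels the model was trained on.

```r
# Toy stand-ins for the 'seen' and 'unseen' datasets from the thread.
seen   <- data.frame(y = factor(c(0, 1, 0)), x = c(1.2, 3.4, 5.6))
unseen <- data.frame(y = factor(c(1, 0)),    x = c(2.1, 4.3))

# Hypothetical helper: prefix every value of the target column and
# convert the result back to a factor.
prefix_target <- function(df, target = "y", prefix = "X") {
  df[[target]] <- as.factor(paste0(prefix, df[[target]]))
  df
}

# Apply the same relabelling to both datasets so their levels stay aligned.
seen   <- prefix_target(seen)
unseen <- prefix_target(unseen)

print(levels(seen$y))
print(levels(unseen$y))
```

One caveat with this sketch: as.factor derives the levels from the values present, so if a dataset happens to lack one of the classes, its levels will differ from the other dataset's. Passing explicit levels to factor() would avoid that.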