Open rknimmakayala opened 5 years ago
Hi rknimmakayala, could you try rerunning with both models and model_labels passed as lists, like this:
models = list("xgb"), model_labels = list("xgboost"),
If this does not help, please include some more info, such as head(scores_and_ntiles), or provide the error message from aggregate_over_ntiles(scores_and_ntiles).
I ran with only 1 model now.
scores_and_ntiles <- prepare_scores_and_ntiles(datasets=list("seen","unseen"), dataset_labels = list("train data","test data"), models = "xgb", model_labels = "xgboost", target_column="y" )
plot_input <- plotting_scope(prepared_input = scores_and_ntiles, select_model_label = "xgboost", select_dataset_label = "test data")
head(scores_and_ntiles)
  model_label dataset_label y_true   prob_p0   prob_p1 ntl_p0 ntl_p1
1     xgboost    train data      0 0.3865758 0.6134242      9      2
2     xgboost    train data      0 0.6179351 0.3820650      6      5
3     xgboost    train data      1 0.5497741 0.4502260      7      4
4     xgboost    train data      0 0.2749349 0.7250651     10      1
5     xgboost    train data      0 0.1463084 0.8536916     10      1
6     xgboost    train data      0 0.5573502 0.4426498      7      4
The p in ntl_p0 and ntl_p1 is unexpected. Some extra checks to pinpoint the issue:
head(seen)
head(unseen)
unique(seen$y)
levels(seen$y)
unique(unseen$y)
levels(unseen$y)
Hi rknimmakayala, could you provide the details from the extra checks above? Which version of modelplotr are you using, and which model wrapper (caret?)
Apologies for the delayed response. The dataset has 128 variables, so I am skipping head(seen) and head(unseen) for now. The modelplotr version is 1.0.0, the model wrapper is h2o, and the h2o version is 3.26.0.6. levels(seen$y) and levels(unseen$y) both return "0" and "1".
OK, thanks for the extra info. The issue is that when h2o is used with a numeric target variable, it puts a 'p' in front of the value in the column names of the prediction result, whereas it does not do this when the target variable is a character.
(For details, run: colnames(h2o.predict(xgb, train))[-1])
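To make the mismatch concrete, here is a minimal base R sketch that needs no h2o session. The vector `pred_cols` is a stand-in for what `colnames(h2o.predict(...))` is described as returning in this thread (a predicted-class column followed by `p0` and `p1`); it is an assumption for illustration, not captured output.

```r
# Stand-in values, assumed for illustration (not real h2o output).
target_levels <- c("0", "1")               # levels(seen$y) as reported in this thread
pred_cols     <- c("predict", "p0", "p1")  # assumed column names of the h2o prediction frame

# Drop the predicted-class column and keep the per-class probability columns.
prob_cols <- pred_cols[-1]

# Columns that do not correspond to any target level.
unmatched <- setdiff(prob_cols, target_levels)
if (length(unmatched) > 0) {
  message("Probability columns not matching target levels: ",
          paste(unmatched, collapse = ", "))
}

# In this toy case, stripping the leading "p" recovers the original levels.
recovered <- sub("^p", "", prob_cols)
```

This would explain why the prepared output contains columns such as prob_p0 and ntl_p0 instead of names built from the levels "0" and "1".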
We'll cover this in modelplotr as soon as possible. For now, you can work around the issue by adding an X in front of the numeric value, e.g.:
train <- train %>% mutate(y=as.factor(paste0('X',y)))
Hope this works for you, thanks for sharing your issue.
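For anyone not using dplyr, the same workaround can be sketched in base R. The data frames and the helper `prefix_target()` below are hypothetical stand-ins, not part of modelplotr or h2o. The sketch assumes the relabeling should be applied to every dataset that is later passed to prepare_scores_and_ntiles(), and that the model is retrained on the relabeled target.

```r
# Toy data frames standing in for the real train/test sets (hypothetical).
train <- data.frame(x = c(0.2, 0.7, 0.4, 0.9), y = c(0, 1, 0, 1))
test  <- data.frame(x = c(0.1, 0.8),           y = c(1, 0))

# Hypothetical helper: prefix the target values so the factor levels
# no longer look numeric ("0" becomes "X0", "1" becomes "X1").
prefix_target <- function(df, target = "y", prefix = "X") {
  df[[target]] <- factor(paste0(prefix, df[[target]]))
  df
}

# Apply the same relabeling to both datasets so their levels stay aligned.
train <- prefix_target(train)
test  <- prefix_target(test)

levels(train$y)
levels(test$y)
```

If one of the datasets might be missing a class entirely, passing an explicit `levels` argument to `factor()` would keep the levels identical across datasets.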
Hello, I am running this code and I see the error shown above:
library(modelplotr)
scores_and_ntiles <- prepare_scores_and_ntiles(datasets=list("seen","unseen"), dataset_labels = list("train data","test data"), models = "xgb", model_labels = "xgboost", target_column="y" )
plot_input <- plotting_scope(prepared_input = scores_and_ntiles, select_model_label = "xgboost", select_dataset_label = "test data")