evalclass / precrec

An R library for accurate and fast calculations of Precision-Recall and ROC curves
https://evalclass.github.io/precrec
GNU General Public License v3.0

Error: assert_that: missing values present in assertion #11

Closed thaocad closed 3 years ago

thaocad commented 5 years ago

Hi, I ran into the following error message when calling the evalmod() function:

Error: assert_that: missing values present in assertion
12. stop("assert_that: missing values present in assertion", call. = FALSE) at assert-that.r#88
11. check_result(res) at assert-that.r#72
10. see_if(..., env = env, msg = msg) at assert-that.r#50
9. assertthat::assert_that(is.atomic(pb[["specificity"]]), is.vector(pb[["specificity"]]), is.numeric(pb[["specificity"]]), pb[["specificity"]][1] == 1, pb[["specificity"]][n] == 0) at pl4_calc_measures.R#106
8. .validate.pevals(s3obj) at etc_utils_validate_obj.R#4
7. .validate(s3obj) at pl4_calc_measures.R#47
6. calc_measures(cdat) at pl2_pipeline_main_rocprc.R#17
5. FUN(X[[i]], ...)
4. lapply(seq_along(mdat), plfunc) at pl2_pipeline_main_rocprc.R#20
3. .pl_main_rocprc(mdat, model_type, dataset_type, class_name_pf, calc_avg = calc_avg, cb_alpha = cb_alpha, raw_curves = raw_curves, x_bins = x_bins) at pl1_pipeline_main.R#21
2. pl_main(mdat, mode = new_mode, calc_avg = calc_avg, cb_alpha = cb_alpha, raw_curves = raw_curves, x_bins = x_bins, na_worst = new_na_worst, ties_method = new_ties_method, validate = FALSE) at main_evalmod.R#332
1. evalmod(scores = predsToPlot, labels = labelsToPlot)

Can you please help? Did I miss anything? Thank you very much. Thao

takayasaito commented 5 years ago

It seems that one of the validation checks returns NA, but I cannot tell much more without the actual score and label data. Is it possible for you to attach predsToPlot and labelsToPlot to your comment? I think the maximum file size you can attach is around 25MB. GitHub accepts .zip and .gz file types, so the size shouldn't be a problem.

pwwang commented 3 years ago

I had a very similar issue. It is not caused by NAs in the input. It happens when the labels (of one fold) are all of one level (i.e. all TRUE or all FALSE). In my case, one of the folds had all instances labeled as FALSE, which produced this line in the error message:

11. assertthat::assert_that(is.atomic(pb[["sensitivity"]]), is.vector(pb[["sensitivity"]]), 
  .     is.numeric(pb[["sensitivity"]]), pb[["sensitivity"]][1] == 
  .         0, pb[["sensitivity"]][n] == 1)

It happens mostly when you have a small sample size and you do n-fold cross-validation.
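One way to catch this before calling evalmod() is to count the distinct label values in each fold. The following is a minimal base R sketch, not part of precrec: the helper name single_class_folds and the toy labels and folds vectors are made up for illustration.

```r
# Hypothetical helper (not part of precrec): return the IDs of folds whose
# labels contain fewer than two distinct classes.
single_class_folds <- function(labels, folds) {
  # Count distinct non-missing label values within each fold.
  n_classes <- vapply(
    split(labels, folds),
    function(x) length(unique(x[!is.na(x)])),
    integer(1)
  )
  names(n_classes)[n_classes < 2]
}

# Toy data: 12 instances in 3 folds; fold 3 contains only FALSE labels.
labels <- c(TRUE, FALSE, TRUE, FALSE,
            TRUE, TRUE, FALSE, FALSE,
            FALSE, FALSE, FALSE, FALSE)
folds <- rep(1:3, each = 4)

bad_folds <- single_class_folds(labels, folds)
print(bad_folds)
```

Folds reported here cannot yield ROC or Precision-Recall curves, so they would need to be dropped or re-split (for example with stratified sampling) before evaluation.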

I understand that specificity or 1-sensitivity could be a denominator in the metric calculations, but I think the message is misleading.

I guess maybe we could add a delta to the denominator and some warning messages for this situation, or separate that assert statement and give different messages for different checks.
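To illustrate the second idea, here is a rough base R sketch of what separate checks with separate messages might look like. It is not precrec's actual validation code: the function name check_specificity and the message wording are hypothetical, and only the conditions themselves are taken from the assertion shown in the traceback.

```r
# Hypothetical validator (not precrec's code): each condition fails with
# its own message instead of one combined assertion.
check_specificity <- function(sp) {
  if (!is.numeric(sp) || !is.vector(sp) || length(sp) == 0) {
    stop("specificity must be a non-empty numeric vector", call. = FALSE)
  }
  if (anyNA(sp)) {
    # NA here suggests the dataset has no negative labels (N = 0).
    stop("specificity contains NA; the dataset may have no negative labels",
         call. = FALSE)
  }
  n <- length(sp)
  if (sp[1] != 1 || sp[n] != 0) {
    stop("specificity must start at 1 and end at 0", call. = FALSE)
  }
  invisible(TRUE)
}

# A single-class dataset now produces a message that names the likely cause.
msg <- tryCatch(check_specificity(c(NA_real_, NA_real_)),
                error = function(e) conditionMessage(e))
print(msg)
```

Checking for NA before comparing the end points matters, because a comparison against NA would itself return NA rather than TRUE or FALSE.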

takayasaito commented 3 years ago

OK. I have to admit it is a terrible error message. The main procedure returns NA values for specificity and sensitivity respectively when N = 0 and P = 0. NA values are considered as 'undefined' or missing values in R.

Specificity and sensitivity must be 'undefined' in this case since we cannot evaluate a classifier when one of the classes is absent. In other words, when N = 0 or P = 0 they have to be NA; they cannot be assigned any real value, including 0 or 1.
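A small base R illustration of why the original assertion failed, using the textbook definition of specificity rather than precrec's internal code (the specificity function below is only for this example):

```r
# Textbook definition: specificity = TN / (TN + FP), where TN + FP = N.
specificity <- function(tn, fp) tn / (tn + fp)

# N = 0: there are no negatives, so both counts are zero.
sp <- specificity(tn = 0, fp = 0)
print(is.na(sp))   # 0/0 is NaN, which R treats as a missing value

# Comparing a missing value with a number gives NA, not TRUE or FALSE.
cmp <- sp == 1
print(is.na(cmp))
```

A check such as pb[["specificity"]][1] == 1 therefore evaluates to NA, and that is what assert_that reported as "missing values present in assertion".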

I have modified the validation part to output improved error messages when a single-class dataset is provided. The next version of precrec will be released on CRAN at the beginning of January next year, after the CRAN submission team gets back from their winter holiday.

> library(precrec)

> # N = 0
> samp_n0 <- create_sim_samples(1, 10, 0, "random")
> evalmod(mmdata(samp_n0$scores, samp_n0$labels))

 Error: Curves cannot be calculated. Only a single class (positive) found in dataset (modname: m1, dsid: 1). 

> # P = 0
> samp_p0 <- create_sim_samples(1, 0, 10, "random")
> evalmod(mmdata(samp_p0$scores, samp_p0$labels))

 Error: Curves cannot be calculated. Only a single class (negative) found in dataset (modname: m1, dsid: 1).

pwwang commented 3 years ago

Makes sense. Thank you!

takayasaito commented 3 years ago

Precrec v0.12 is on CRAN now.