Hi, thanks for letting me know!
I could reproduce the error. It occurs because `tidyr::nest_` now drops additional class values when it nests a `data.frame`. My `roc` function returns a `data.frame` with the additional class `roc_cutpointr` that got lost after nesting, thus the respective error message.
The dev version of `nest` still preserves additional classes when nesting a `tibble`, please see the example below. It doesn't seem to make a difference whether I use `nest()` or `nest_()`.
I don't depend on the behavior of `tidyr <= 0.8.2` (probably my `roc()` should return a `tibble` anyway), but the new behavior seems unexpected to me. Thanks again for the detailed report.
#
# With tidyr 0.8.2 from CRAN
#
library(tidyr)
# Nesting with a data.frame that has an additional class
my_object <- iris
class(my_object) <- c(class(my_object), "my_class")
nested_object <- nest_(my_object, key_col = "foo")
purrr::map(nested_object$foo, class)
# [1] "data.frame" "my_class"
# Nesting with a tibble that has an additional class
my_object <- tibble::as_tibble(iris)
class(my_object) <- c(class(my_object), "my_class")
nested_object <- nest_(my_object, key_col = "foo")
purrr::map(nested_object$foo, class)
# [1] "tbl_df" "tbl" "data.frame" "my_class"
#
# With tidyr 0.8.2.9000 from Github
#
library(tidyr)
# Nesting with a data.frame that has an additional class
my_object <- iris
class(my_object) <- c(class(my_object), "my_class")
nested_object <- nest_(my_object, key_col = "foo")
purrr::map(nested_object$foo, class)
# [1] "tbl_df" "tbl" "data.frame"
# So now it's a tibble and has lost my_class
# Nesting with a tibble that has an additional class
my_object <- tibble::as_tibble(iris)
class(my_object) <- c(class(my_object), "my_class")
nested_object <- nest_(my_object, key_col = "foo")
purrr::map(nested_object$foo, class)
# [1] "tbl_df" "tbl" "data.frame" "my_class"
# Still a tibble and has preserved my_class
Ah, the problem is that `nest()` now always returns tibbles because the print method for data frames containing list columns is not very good. I'll change it to apply the transformation only when the input is a data frame, not a subclass, which should resolve this problem.
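For illustration, the idea of converting only plain data frames can be sketched as follows. This is a minimal sketch, not tidyr's actual implementation; `to_tibble_if_plain` is a hypothetical helper name introduced here.

```r
# Hypothetical helper (an assumption for illustration, not tidyr code):
# convert to a tibble only when the input is exactly a data.frame,
# and leave subclassed inputs untouched so their extra classes survive.
to_tibble_if_plain <- function(x) {
  if (identical(class(x), "data.frame")) {
    tibble::as_tibble(x)
  } else {
    x
  }
}

plain <- iris
subclassed <- iris
class(subclassed) <- c(class(subclassed), "my_class")

class(to_tibble_if_plain(plain))
# [1] "tbl_df"     "tbl"        "data.frame"
class(to_tibble_if_plain(subclassed))
# [1] "data.frame" "my_class"
```

The check uses `identical(class(x), "data.frame")` rather than `is.data.frame(x)`, because the latter is also true for subclasses and would not distinguish the two cases.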
Ok, that fixed the failure. I still see a different failure, but that seems less likely to be related to tidyr:
checking tests ...
ERROR
Running the tests in ‘tests/testthat.R’ failed.
Last 13 lines of output:
`print\(scp\)` does not match "accuracy_oob 0.8201".
Actual value: "Method: oc_youden_normal \\nPredictor: dsi \\nOutcome: suicide \\nDirection: >= \\nSubgroups: female, male \\nNr\. of bootstraps: 10 \\n\\nSubgroup: female \\n-------------------------------------------------------------------------------- \\n optimal_cutpoint accuracy acc sensitivity specificity AUC n_pos n_neg\\n 2\.4778 0\.8954 0\.8954 0\.8148 0\.9014 0\.9446 27 365\\n\\nCutpoint 2\.47775393352595:\\n observation\\nprediction yes no\\n yes 22 36\\n no 5 329\\n\\n\\nPredictor summary: \\n Min\. 5% 1st Qu\. Median Mean 3rd Qu\. 95% Max\. SD\\n 0 0 0 0 0\.8393 1 5 10 1\.7452\\n\\nPredictor summary per class: \\n Min\. 5% 1st Qu\. Median Mean 3rd Qu\. 95% Max SD\\nno 0 0\.0 0 0 0\.5479 0 4 10 1\.3181\\nyes 0 1\.3 4 5 4\.7778 6 7 9 2\.0444\\n\\nBootstrap summary: \\n# A tibble: 13 x 10\\n Variable Min\. `5%` `1st Qu\.` Median Mean `3rd Qu\.` `95%` Max\. SD\\n <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>\\n 1 optimal_cutp… 2\.18 2\.23 2\.33 2\.43 2\.47 2\.51 2\.83 2\.94 0\.218 \\n 2 AUC_b 0\.941 0\.943 0\.950 0\.964 0\.960 0\.967 0\.974 0\.976 0\.0119\\n 3 AUC_oob 0\.894 0\.894 0\.912 0\.924 0\.925 0\.939 0\.955 0\.956 0\.0222\\n 4 accuracy_b 0\.860 0\.871 0\.888 0\.908 0\.904 0\.923 0\.927 0\.929 0\.0226\\n 5 accuracy_oob 0\.820 0\.838 0\.873 0\.876 0\.880 0\.901 0\.912 0\.914 0\.0278\\n 6 acc_b 0\.860 0\.871 0\.888 0\.908 0\.904 0\.923 0\.927 0\.929 0\.0226\\n 7 acc_oob 0\.820 0\.838 0\.873 0\.876 0\.880 0\.901 0\.912 0\.914 0\.0278\\n 8 sensitivity_b 0\.708 0\.737 0\.779 0\.823 0\.826 0\.851 0\.940 0\.954 0\.0728\\n 9 sensitivity_… 0\.625 0\.644 0\.762 0\.809 0\.800 0\.872 0\.913 0\.923 0\.0971\\n10 specificity_b 0\.870 0\.875 0\.894 0\.915 0\.909 0\.927 0\.931 0\.932 0\.0223\\n11 specificity_… 0\.835 0\.845 0\.876 0\.880 0\.886 0\.912 0\.921 0\.922 0\.0283\\n12 kappa_b 0\.321 0\.329 0\.423 0\.509 0\.485 0\.562 0\.590 0\.610 0\.0995\\n13 kappa_oob 0\.305 0\.324 0\.368 0\.420 0\.444 0\.511 0\.608 0\.631 0\.106 \\n\\nSubgroup: male 
\\n-------------------------------------------------------------------------------- \\n optimal_cutpoint accuracy acc sensitivity specificity AUC n_pos n_neg\\n 3\.1723 0\.8643 0\.8643 0\.6667 0\.8779 0\.8617 9 131\\n\\nCutpoint 3\.17225507835137:\\n observation\\nprediction yes no\\n yes 6 16\\n no 3 115\\n\\n\\nPredictor summary: \\n Min\. 5% 1st Qu\. Median Mean 3rd Qu\. 95% Max\. SD\\n 0 0 0 0 1\.15 1 6 11 2\.1151\\n\\nPredictor summary per class: \\n Min\. 5% 1st Qu\. Median Mean 3rd Qu\. 95% Max SD\\nno 0 0\.0 0 0 0\.8702 1 5\.0 6 1\.6286\\nyes 0 0\.4 3 4 5\.2222 8 10\.6 11 3\.8333\\n\\nBootstrap summary: \\n# A tibble: 13 x 10\\n Variable Min\. `5%` `1st Qu\.` Median Mean `3rd Qu\.` `95%` Max\. SD\\n <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>\\n 1 optimal_cutp… 2\.82 2\.84 2\.92 3\.27 3\.26 3\.55 3\.82 3\.90 0\.387 \\n 2 AUC_b 0\.758 0\.787 0\.825 0\.879 0\.871 0\.904 0\.959 0\.968 0\.0641\\n 3 AUC_oob 0\.631 0\.691 0\.792 0\.885 0\.859 0\.943 0\.972 0\.977 0\.109 \\n 4 accuracy_b 0\.807 0\.814 0\.834 0\.864 0\.852 0\.871 0\.871 0\.871 0\.0243\\n 5 accuracy_oob 0\.822 0\.823 0\.839 0\.871 0\.866 0\.896 0\.905 0\.906 0\.0327\\n 6 acc_b 0\.807 0\.814 0\.834 0\.864 0\.852 0\.871 0\.871 0\.871 0\.0243\\n 7 acc_oob 0\.822 0\.823 0\.839 0\.871 0\.866 0\.896 0\.905 0\.906 0\.0327\\n 8 sensitivity_b 0\.556 0\.582 0\.667 0\.703 0\.735 0\.794 0\.936 1 0\.129 \\n 9 sensitivity_… 0\.333 0\.363 0\.5 0\.667 0\.707 1 1 1 0\.272 \\n10 specificity_b 0\.817 0\.825 0\.846 0\.867 0\.862 0\.875 0\.892 0\.898 0\.0246\\n11 specificity_… 0\.818 0\.826 0\.853 0\.887 0\.877 0\.898 0\.917 0\.918 0\.0342\\n12 kappa_b 0\.210 0\.220 0\.243 0\.338 0\.319 0\.380 0\.407 0\.411 0\.0757\\n13 kappa_oob 0\.118 0\.145 0\.208 0\.306 0\.310 0\.398 0\.497 0\.570 0\.139 "
── 3. Failure: summary is printed correctly (@test-cutpointr.R#1211) ───────────────────────────────────────────
`print\(scp\)` does not match "accuracy_oob 0.8163".
Actual value: "Method: oc_youden_normal \\nPredictor: x \\nOutcome: class \\nDirection: >= \\nSubgroups: female, male \\nNr\. of bootstraps: 10 \\n\\nSubgroup: female \\n-------------------------------------------------------------------------------- \\n optimal_cutpoint accuracy acc sensitivity specificity AUC n_pos n_neg\\n 2\.4778 0\.8954 0\.8954 0\.8148 0\.9014 0\.9446 27 365\\n\\nCutpoint 2\.47775393352595:\\n observation\\nprediction yes no\\n yes 22 36\\n no 5 329\\n\\n\\nPredictor summary: \\n Min\. 5% 1st Qu\. Median Mean 3rd Qu\. 95% Max\. SD\\n 0 0 0 0 0\.8393 1 5 10 1\.7452\\n\\nPredictor summary per class: \\n Min\. 5% 1st Qu\. Median Mean 3rd Qu\. 95% Max SD\\nno 0 0\.0 0 0 0\.5479 0 4 10 1\.3181\\nyes 0 1\.3 4 5 4\.7778 6 7 9 2\.0444\\n\\nBootstrap summary: \\n# A tibble: 13 x 10\\n Variable Min\. `5%` `1st Qu\.` Median Mean `3rd Qu\.` `95%` Max\. SD\\n <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>\\n 1 optimal_cutp… 2\.02 2\.12 2\.32 2\.40 2\.40 2\.54 2\.62 2\.66 0\.185 \\n 2 AUC_b 0\.907 0\.910 0\.92 0\.950 0\.940 0\.958 0\.965 0\.966 0\.0227\\n 3 AUC_oob 0\.898 0\.905 0\.931 0\.953 0\.947 0\.968 0\.978 0\.980 0\.0274\\n 4 accuracy_b 0\.878 0\.878 0\.895 0\.902 0\.900 0\.908 0\.916 0\.921 0\.0138\\n 5 accuracy_oob 0\.865 0\.868 0\.879 0\.888 0\.891 0\.906 0\.914 0\.917 0\.0176\\n 6 acc_b 0\.878 0\.878 0\.895 0\.902 0\.900 0\.908 0\.916 0\.921 0\.0138\\n 7 acc_oob 0\.865 0\.868 0\.879 0\.888 0\.891 0\.906 0\.914 0\.917 0\.0176\\n 8 sensitivity_b 0\.66 0\.689 0\.759 0\.786 0\.796 0\.849 0\.896 0\.917 0\.076 \\n 9 sensitivity_… 0\.7 0\.712 0\.8 0\.847 0\.861 0\.972 1 1 0\.112 \\n10 specificity_b 0\.878 0\.881 0\.901 0\.913 0\.910 0\.922 0\.934 0\.939 0\.019 \\n11 specificity_… 0\.864 0\.867 0\.882 0\.892 0\.895 0\.909 0\.925 0\.926 0\.0216\\n12 kappa_b 0\.362 0\.410 0\.475 0\.528 0\.514 0\.566 0\.582 0\.585 0\.0692\\n13 kappa_oob 0\.160 0\.214 0\.391 0\.420 0\.404 0\.475 0\.524 0\.539 0\.112 \\n\\nSubgroup: male 
\\n-------------------------------------------------------------------------------- \\n optimal_cutpoint accuracy acc sensitivity specificity AUC n_pos n_neg\\n 3\.1723 0\.8643 0\.8643 0\.6667 0\.8779 0\.8617 9 131\\n\\nCutpoint 3\.17225507835137:\\n observation\\nprediction yes no\\n yes 6 16\\n no 3 115\\n\\n\\nPredictor summary: \\n Min\. 5% 1st Qu\. Median Mean 3rd Qu\. 95% Max\. SD\\n 0 0 0 0 1\.15 1 6 11 2\.1151\\n\\nPredictor summary per class: \\n Min\. 5% 1st Qu\. Median Mean 3rd Qu\. 95% Max SD\\nno 0 0\.0 0 0 0\.8702 1 5\.0 6 1\.6286\\nyes 0 0\.4 3 4 5\.2222 8 10\.6 11 3\.8333\\n\\nBootstrap summary: \\n# A tibble: 13 x 10\\n Variable Min\. `5%` `1st Qu\.` Median Mean `3rd Qu\.` `95%` Max\. SD\\n <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>\\n 1 optimal_cutp… 2\.14 2\.26 2\.93 3\.05 2\.97 3\.28 3\.35 3\.36 0\.403 \\n 2 AUC_b 0\.738 0\.760 0\.823 0\.848 0\.852 0\.904 0\.925 0\.929 0\.0611\\n 3 AUC_oob 0\.806 0\.815 0\.838 0\.901 0\.899 0\.958 0\.990 1 0\.0688\\n 4 accuracy_b 0\.8 0\.8 0\.848 0\.868 0\.854 0\.871 0\.875 0\.879 0\.0298\\n 5 accuracy_oob 0\.816 0\.820 0\.835 0\.87 0\.862 0\.877 0\.899 0\.917 0\.031 \\n 6 acc_b 0\.8 0\.8 0\.848 0\.868 0\.854 0\.871 0\.875 0\.879 0\.0298\\n 7 acc_oob 0\.816 0\.820 0\.835 0\.87 0\.862 0\.877 0\.899 0\.917 0\.031 \\n 8 sensitivity_b 0\.333 0\.376 0\.542 0\.690 0\.656 0\.744 0\.9 1 0\.192 \\n 9 sensitivity_… 0\.5 0\.545 0\.617 0\.8 0\.777 0\.95 1 1 0\.183 \\n10 specificity_b 0\.806 0\.807 0\.865 0\.876 0\.864 0\.879 0\.894 0\.903 0\.0316\\n11 specificity_… 0\.808 0\.823 0\.852 0\.874 0\.870 0\.886 0\.909 0\.909 0\.031 \\n12 kappa_b 0\.133 0\.135 0\.154 0\.264 0\.264 0\.364 0\.416 0\.436 0\.116 \\n13 kappa_oob 0\.140 0\.192 0\.318 0\.448 0\.405 0\.493 0\.575 0\.625 0\.143 "
══ testthat results ════════════════════════════════════════════════════════════════════════════════════════════
OK: 369 SKIPPED: 0 FAILED: 3
1. Failure: summary is printed correctly (@test-cutpointr.R#1179)
2. Failure: summary is printed correctly (@test-cutpointr.R#1195)
3. Failure: summary is printed correctly (@test-cutpointr.R#1211)
Error: testthat unit tests failed
Execution halted
I see. I could also reproduce that. This error is still related to the update of `tidyr`:
I have data frames with bootstrap results that I nest. The summary function simply prints them. Since these are regular data frames they get converted to tibbles now and then some of the numbers get rounded, so they don't match the expected values from the test.
So this is not really an error, but rather a design decision, I guess. If data frames get converted to tibbles now, I could for example do a `print.data.frame` to get the previous output without tibble's rounding (I always rounded to 4 digits) or rewrite the test.
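As a rough illustration of the first option: converting the tibble back to a plain data frame before printing uses base R's formatting instead of tibble's abbreviated display. The `boot` object and its value are invented for this sketch and are not taken from cutpointr.

```r
# Made-up one-row bootstrap summary (an assumption for illustration only).
boot <- tibble::tibble(Variable = "accuracy_oob", Min. = 0.81632)

# Round to 4 digits, as described above.
boot$Min. <- round(boot$Min., 4)

# Printing the tibble directly may abbreviate the number;
# printing it as a plain data frame keeps all 4 digits.
out <- capture.output(print(as.data.frame(boot)))
cat(out, sep = "\n")
```

This keeps the printed output stable, but any test built on it still depends on how the object is formatted.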
Yeah, I'd say that's something you should fix with the test — testing printed output tends to be quite fragile, and ideally you would be testing the behaviour of your code, not tidyr's code.
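A test along those lines might compare the computed numbers directly instead of matching printed text. The following is only a sketch: `boot_summary` and `min_of()` are hypothetical stand-ins, not objects or functions from cutpointr.

```r
library(testthat)

# Hypothetical stand-in for a bootstrap summary (values invented for the sketch).
boot_summary <- data.frame(
  Variable = c("accuracy_b", "accuracy_oob"),
  Min. = c(0.8776, 0.8163),
  stringsAsFactors = FALSE
)

# Hypothetical helper: look up the minimum for one metric.
min_of <- function(summary_df, variable) {
  summary_df$Min.[summary_df$Variable == variable]
}

test_that("minimum out-of-bag accuracy has the expected value", {
  # Compare numbers with a tolerance instead of matching printed output,
  # so the test does not depend on how data frames or tibbles are printed.
  expect_equal(min_of(boot_summary, "accuracy_oob"), 0.8163, tolerance = 1e-4)
})
```

Because the comparison is numeric, it stays valid whether the nested results come back as data frames or tibbles.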
OK, I'll probably just rewrite the test (and maybe some of the functions) and push updates to Github and CRAN very soon. Thanks for the quick fix!
Thanks! I'll probably push tidyr to CRAN today (because it's causing problems with the latest dplyr), but I'll let them know that you're working on a fix.
cutpointr 0.7.6 is on CRAN and now passes the CRAN checks.
When I check cutpointr with the dev version of tidyr, I see:
checking examples ... ERROR
checking tests ...
checking re-building of vignette outputs ... WARNING
Would you mind looking into this for me? It's possible that I've accidentally changed the tidyr API in some way, but the changes are small and cutpointr is the only CRAN package that shows problems.