Closed romainfrancois closed 2 years ago
@lionel- is there some alternative way to test for "this error originated from dplyr" ?
I don't think so. For which errors did we remove this class?
All of them. We used to mark errors with class = "dplyr_error"
but then we decided not to.
Can't we use cnd$trace$namespace
to figure out that abort()
was called from within dplyr
? That may be risky I guess.
library(dplyr, warn.conflicts = FALSE)
withCallingHandlers(summarise(mtcars, stop("boom")), error = function(cnd) {
print(cnd$trace$namespace)
})
#> [1] "base" "dplyr" "dplyr" "dplyr" "base" "dplyr" "base" "dplyr" NA
#> [10] "base"
#> Error: Problem while computing `..1 = stop("boom")`.
Created on 2021-11-17 by the reprex package (v2.0.1.9000)
Can't we use cnd$trace$namespace to figure out that abort() was called from within dplyr ? That may be risky I guess.
Yeah, that's not something we'd want to do or recommend. But in general there should be no need for testing that an error originates from a particular package.
For instance in this particular case I don't understand what's being tested because the expected error class is so general.
I suppose the idea of expect_error(count_gaps(tsbl, .name = NULL), class = "dplyr_error")
was testing that it's not an error from tsibble
but from dplyr
.
There are a few packages that do expect_error(class = "dplyr_error")
Yea, it's intended to test if the error is from {dplyr}.
When will the new version of {dplyr} be pushed to CRAN?
I can see what the test is doing but I don't understand the intent of the test. What dplyr error is expected? Maybe this should be turned into a snapshot test instead?
In general, I'm conservative about adding error classes because they instantaneously become part of the public interface and may be depended upon, as we see here. (And for a while in the past testthat encouraged users to add class =
arguments for classed errors, which worsened the issue.) For the same reason, we generally shouldn't remove classes. How many packages are depending on "dplyr_error"
?
How many packages are depending on "dplyr_error"?
Just a few. I agree that moving to snapshot tests will be more useful. We could add them back, but given that's just a few, I believe package maintainers can be convinced.
Also, I'm not even sure that all the errors from dplyr
used to have the class. Would it make some sense to have abort()
keep top_env(caller_env())
somewhere ? But you seem to rule this out here:
But in general there should be no need for testing that an error originates from a particular package.
@DavisVaughan might have views on this since another package using the class is yardstick
: https://github.com/tidymodels/yardstick/pull/239
Just a few. I agree that moving to snapshot tests will be more useful. We could add them back, but given that's just a few, I believe package maintainers can be convinced.
Sounds good.
Would it make some sense to have abort() keep top_env(caller_env()) somewhere ?
There used to be an rlang issue about keeping track of the package in error conditions but we never went through with it. This is probably not needed and this sort of public properties creates unexpected dependencies. For instance it might make it more difficult to move error generation to a backend package.
I think it was useful in yardstick to capture the text output of the error message, but I think testing for the error class explicitly was a mistake because yardstick doesn't own the "dplyr_error"
class.
I'm going to just make mine into snapshot tests, and I think that would be the most robust thing to do here too
ping @earowang. Any chance for a release with this fix ? We aim to release dplyr
early next week.
Thx. Will submit the new version to CRAN soon.
We're in the process of releasing dplyr 1.0.8, which will stop adding the
dplyr_error
class to errors. Please consider this pull request.