Open hadley opened 8 months ago
hmm... I'm very hesitant on this one. It might not be documented that conditions must be lists, but it's widely assumed.
@t-kalinowski does reticulate need reference semantics on the condition? Could you store an environment in the condition instead?
There are a couple of tricky questions to answer in reticulate if the R condition is derived from a Python Exception, but is not a bona-fide Python object.
Right now, you can catch a raised Python exception as an R condition, and resignal it.
cond <- tryCatch(
py_run_string("raise ZeroDivisionError()"),
python.builtin.ZeroDivisionError = function(cond) cond)
)
stop(cond)
If we convert the caught cond
to an R list like this, tricky questions are <???>
structure(
class = c(<???>, "error", "condition"),
list(message = <...>, call = <...>,
py_exception = structure(<environment>
class = c("python.builtin.ZeroDivisionError",
"python.builtin.ArithmeticError",
"python.builtin.Exception",
"python.builtin.BaseException",
"python.builtin.object",
<???>))
)
)
From the users perspective,
cond
list a Python object? (safe to use in all places where we accept a python object?).py_exception
environment equivalent to the cond
list? Does the py_exception
environment also have S3 class `"error" "condition"? I'd lean towards answering "yes" for both questions, keeping this an invisible implementation detail.
Another tricky question: what about an exception that was never raised?
exception <- py_eval("ZeroDivisionError()")
Is exception
an environment or a list? Does it also inherit from error
and condition
? (Can it be raised with stop()
?) Most likely we make this a list, for consistency with raised exceptions.
This all would require a non-negligible refactoring in reticulate; we would have to do normalization behind the scenes at all Python entry points. Note, we already do some normalization since we have already have two R types that can be Python objects (environments and closures), so adding a third isn't too bad, but still not trivial.
I can see how that would be tricky to change now. But the alternative of having to consider environment conditions in other parts of our code seems worrying as well, unless we decide conditions should only be manipulated at R level or by calling the S3 API. Which might be fine, but would require a review of the rlang C API and maybe the vctrs internals (which also uses the rlang API and has a bunch of error handling at C level).
Regarding the design questions you brought up:
It makes sense to have two classes, one public inheriting from condition
and one private which inherits from environment
and wrapped by the former. A non-raised exception would still be constructed with both the public and private objects.
A condition class doesn't need to have message
and call
fields, it just needs to implement conditionMessage()
and conditionCall()
.
A condition class doesn't need to have message and call fields, it just needs to implement conditionMessage() and conditionCall().
In the initial implementation in reticulate, I only implemented conditionMessage()
and conditionCall()
for exceptions, but quickly ran into errors when other code (I forget where exactly, maybe rlang?), was calling x$message
instead of conditionMessage()
.
It makes sense to have two classes,...
Sorry, I don't fully understand. Are you suggesting that both the inner environment and the outer list share the same S3 class? I.e.,
class <- c("python.builtin.ZeroDivisionError",
"python.builtin.ArithmeticError",
"python.builtin.Exception",
"python.builtin.BaseException",
"python.builtin.object",
"error", "condition")
structure(
class = class,
list(message = <...>, call = <...>,
py_exception = structure(<environment>
class = class))
)
What do you think about attaching the python exception ref environment as an attribute of the R condition, instead of as a list element.
Ok, I have a tentative fix in reticulate.
A recent refactor of py_to_r()/r_to_py()
and the PyObjectRef
class in reticulate made this change much easier than it would have been otherwise.
This should be a mostly invisible change from a users perspective, so long as they were not previously inspecting typeof(<exception>)
. We'll see if any bugs or changes in edge cases shake out after the change.
Note, we're attaching the R environment w/ the extptr as an attribute of the R condition list object. In initial testing it didn't seem that rlang or purrr has any issues with that.
Good news! Out of curiosity, why not make it a condition field?
Convenience mostly. An attribute is how Python callables keep a reference to the environment with the external pointer; making conditions consistent with that keeps the reticulate internals simpler and lets the objects share some common code paths.
Also, the condition "namespace" can be crowded, since it includes everything from the Python Exception instance object, in addition to the R call
and message
. Keeping the Python refenv as an attribute will minimize the (admittedly low) risk for namespace collisions.
e.g. https://github.com/tidyverse/purrr/issues/1104