r-lib / rlang

Low-level API for programming with R
https://rlang.r-lib.org
Other
491 stars 131 forks source link

Support reticulate conditions #1664

Open hadley opened 8 months ago

hadley commented 8 months ago

e.g. https://github.com/tidyverse/purrr/issues/1104

lionel- commented 1 month ago

hmm... I'm very hesitant on this one. It might not be documented that conditions must be lists, but it's widely assumed.

@t-kalinowski does reticulate need reference semantics on the condition? Could you store an environment in the condition instead?

t-kalinowski commented 1 month ago

There are a couple of tricky questions to answer in reticulate if the R condition is derived from a Python Exception, but is not a bona-fide Python object.

Right now, you can catch a raised Python exception as an R condition, and resignal it.

cond <- tryCatch(
  py_run_string("raise ZeroDivisionError()"),
  python.builtin.ZeroDivisionError = function(cond) cond)
)

stop(cond)

If we convert the caught cond to an R list like this, tricky questions are <???>

structure(
  class = c(<???>, "error", "condition"),
  list(message = <...>, call = <...>,
       py_exception = structure(<environment>
                                class = c("python.builtin.ZeroDivisionError", 
                                          "python.builtin.ArithmeticError", 
                                          "python.builtin.Exception", 
                                          "python.builtin.BaseException", 
                                          "python.builtin.object", 
                                          <???>))
       )
  )

From the users perspective,

I'd lean towards answering "yes" for both questions, keeping this an invisible implementation detail.

Another tricky question: what about an exception that was never raised?

exception <- py_eval("ZeroDivisionError()")

Is exception an environment or a list? Does it also inherit from error and condition? (Can it be raised with stop()?) Most likely we make this a list, for consistency with raised exceptions.

This all would require a non-negligible refactoring in reticulate; we would have to do normalization behind the scenes at all Python entry points. Note, we already do some normalization since we have already have two R types that can be Python objects (environments and closures), so adding a third isn't too bad, but still not trivial.

lionel- commented 1 month ago

I can see how that would be tricky to change now. But the alternative of having to consider environment conditions in other parts of our code seems worrying as well, unless we decide conditions should only be manipulated at R level or by calling the S3 API. Which might be fine, but would require a review of the rlang C API and maybe the vctrs internals (which also uses the rlang API and has a bunch of error handling at C level).

Regarding the design questions you brought up:

t-kalinowski commented 1 month ago

A condition class doesn't need to have message and call fields, it just needs to implement conditionMessage() and conditionCall().

In the initial implementation in reticulate, I only implemented conditionMessage() and conditionCall() for exceptions, but quickly ran into errors when other code (I forget where exactly, maybe rlang?), was calling x$message instead of conditionMessage().

It makes sense to have two classes,...

Sorry, I don't fully understand. Are you suggesting that both the inner environment and the outer list share the same S3 class? I.e.,

class <-  c("python.builtin.ZeroDivisionError", 
            "python.builtin.ArithmeticError", 
            "python.builtin.Exception", 
            "python.builtin.BaseException", 
            "python.builtin.object", 
            "error", "condition")
structure(
  class = class,
  list(message = <...>, call = <...>,
       py_exception = structure(<environment>
                                class = class))
)
t-kalinowski commented 1 month ago

What do you think about attaching the python exception ref environment as an attribute of the R condition, instead of as a list element.

t-kalinowski commented 1 month ago

Ok, I have a tentative fix in reticulate.

A recent refactor of py_to_r()/r_to_py() and the PyObjectRef class in reticulate made this change much easier than it would have been otherwise.

This should be a mostly invisible change from a users perspective, so long as they were not previously inspecting typeof(<exception>). We'll see if any bugs or changes in edge cases shake out after the change.

Note, we're attaching the R environment w/ the extptr as an attribute of the R condition list object. In initial testing it didn't seem that rlang or purrr has any issues with that.

lionel- commented 1 month ago

Good news! Out of curiosity, why not make it a condition field?

t-kalinowski commented 1 month ago

Convenience mostly. An attribute is how Python callables keep a reference to the environment with the external pointer; making conditions consistent with that keeps the reticulate internals simpler and lets the objects share some common code paths.

Also, the condition "namespace" can be crowded, since it includes everything from the Python Exception instance object, in addition to the R call and message. Keeping the Python refenv as an attribute will minimize the (admittedly low) risk for namespace collisions.