r-devel / r-project-sprint-2023

Material for the R project sprint
https://contributor.r-project.org/r-project-sprint-2023/
17 stars 3 forks source link

Refine exception (error) signaling in the R codebase #62

Open gmbecker opened 1 year ago

gmbecker commented 1 year ago

Endorsed by Luke

Review use of stop() and warning() to see if/where signaling structured (ie specially classed) errors would be beneficial.

MichaelChirico commented 1 year ago

@ltierney and I discussed this briefly yesterday. One idea that came up is to look for usages of tryCatch() where a handler makes a call to grep() or grepl() on the error message -- this is both (1) bad practice (because of message translation) and (2) potentially a good signal of where people would benefit from a classed condition.

I suggested we could "easily" find these with a static analysis. I've written the following XPath (for searching R ASTs generated by {xmlparsedata}):

//SYMBOL_FUNCTION_CALL[text() = 'tryCatch']
  /parent::expr
  /following-sibling::SYMBOL_SUB
  /following-sibling::expr[1][FUNCTION or OP-LAMBDA]
  /expr[last()]
  //SYMBOL_FUNCTION_CALL[text() = 'grep' or text() = 'grepl']
  /parent::expr
  /following-sibling::expr[STR_CONST]

First I ran this inside Google, consisting of internal user code and a "peculiar" snapshot of CRAN+Bioconductor (O(1K) packages). Out of 1588 files using tryCatch(), 32 hit the above XPath. Here are some examples from CRAN:

This needs refinement:

  1. It's hard to tell who's generating the messages being tested for. We could exclude matches if they don't match any msgid in the base .pot files.
  2. It should be run on "real CRAN". E.g. here are roughly 3,500 files worth checking: https://github.com/search?q=org%3Acran+-path%3Atests+tryCatch+grep+function&type=code