r-lib / xml2

Bindings to libxml2
https://xml2.r-lib.org/

xml_descendants() as akin to xml_child()/xml_children() and equivalent to xml_find_all(., "descendant::*") #409

Open MichaelChirico opened 11 months ago

Follow-up to #8.

Consider trying to find the minimal indentation of expressions in an R file (using the AST via {xmlparsedata}). I'll use dbplyr/tests/testthat/helper-src.R for illustration:

library(xmlparsedata)
library(xml2)

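xml <- "https://raw.githubusercontent.com/tidyverse/dbplyr/1e48cfaf795f9101792567bc68ce7d24a19db9d5/tests/testthat/helper-src.R" |>
  # keep.source = TRUE so the parse data is available outside interactive sessions too
  parse(keep.source = TRUE) |>
  xml_parse_data() |>
  read_xml()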
# Objective: comment out calls to test_register_con(), except the RSQLite one, at the correct indentation
xpath <- "
//SYMBOL_FUNCTION_CALL[text() = 'test_register_con']
    /parent::expr
    /parent::expr[not(.//SYMBOL_PACKAGE[text() = 'RSQLite'])]
"

exprs <- xml_find_all(xml, xpath)

The part about indentation requires examining the col1 attribute for all descendant nodes of exprs. Here's how we can do this currently:

col1s <- vapply(
  exprs,
  \(expr) min(as.integer(xml_attr(xml_find_all(expr, ".//*"), "col1"))),
  integer(1L)
)

If we have xml_descendants(), this can be a tad cleaner + more readable:

col1s <- vapply(
  exprs,
  \(expr) min(as.integer(xml_attr(xml_descendants(expr), "col1"))),
  integer(1L)
)
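
For reference, here is a minimal sketch of what such a helper could look like if it were only a thin wrapper around the XPath in the title. The name xml_descendants() is the hypothetical function proposed here, not something xml2 exports as far as I know, and the inline XML is a made-up stand-in for real parse data:

```r
library(xml2)

# Hypothetical helper (not part of xml2): every descendant element of `x`,
# i.e. the same nodes as xml_find_all(x, ".//*").
xml_descendants <- function(x) {
  xml_find_all(x, "descendant::*")
}

# Toy document standing in for the output of xml_parse_data()
doc <- read_xml(
  '<expr col1="3"><expr col1="5"><SYMBOL col1="5"/></expr><NUM_CONST col1="9"/></expr>'
)

desc <- xml_descendants(doc)

# Smallest starting column among the descendants (the root itself is excluded)
min_col1 <- min(as.integer(xml_attr(desc, "col1")))
```

A real implementation would presumably live alongside xml_children() and might skip the XPath engine entirely; the wrapper above only pins down the intended semantics.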