Appsilon / box.linters

lintr-compatible linters for box modules in R
https://appsilon.github.io/box.linters/

[LINT_BUG]: Linter does not see {object}s in strings #96

Closed radbasa closed 1 month ago

radbasa commented 1 month ago

box.linters version

0.9.0.9004

Sample source code to lint

box::use(
  glue[glue],
)

value <- 3

glue("This {value} should be valid.")

Lint command used

lintr::lint(
  code,
  linters = lintr::linters_with_defaults(defaults = box.linters::box_default_linters)
)

Lint result

<text>:6:1: warning: [unused_declared_object_linter] Declared function/object unused.
value <- 3
^~~~~

Expected result

value should not lint. It is used in the glue() statement as {value}.

radbasa commented 1 month ago

XML node for the glue call:

<expr line1="5" col1="3" line2="5" col2="45" start="233" end="275">
  <expr line1="5" col1="3" line2="5" col2="12" start="233" end="242">
    <SYMBOL_PACKAGE line1="5" col1="3" line2="5" col2="6" start="233" end="236">glue</SYMBOL_PACKAGE>
    <NS_GET line1="5" col1="7" line2="5" col2="8" start="237" end="238">::</NS_GET>
    <SYMBOL_FUNCTION_CALL line1="5" col1="9" line2="5" col2="12" start="239" end="242">glue</SYMBOL_FUNCTION_CALL>
  </expr>
  <OP-LEFT-PAREN line1="5" col1="13" line2="5" col2="13" start="243" end="243">(</OP-LEFT-PAREN>
  <expr line1="5" col1="14" line2="5" col2="44" start="244" end="274">
    <STR_CONST line1="5" col1="14" line2="5" col2="44" start="244" end="274">"This {value} should be valid."</STR_CONST>
  </expr>
  <OP-RIGHT-PAREN line1="5" col1="45" line2="5" col2="45" start="275" end="275">)</OP-RIGHT-PAREN>
</expr>

rhino::log$debug() will be similar. The {value} is in a string constant.

lintr::object_usage_linter() can only handle glue::glue(). We'll need a fully custom handler.
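To illustrate why a stock linter misses {value}, here is a minimal sketch using only base R's utils::getParseData() (an assumption made for illustration; the thread itself works with the XML form of the same parse data). The interpolated name only appears inside a STR_CONST token, never as a SYMBOL.

```r
# Base R only: inspect the parse data for the glue() call from the sample.
code <- 'glue("This {value} should be valid.")'
parsed <- parse(text = code, keep.source = TRUE)
tokens <- utils::getParseData(parsed)

# Keep terminal tokens only (drops the wrapping "expr" rows).
terminals <- tokens[tokens$terminal, c("token", "text")]
print(terminals)

# The placeholder lives inside the string constant ...
string_constants <- terminals$text[terminals$token == "STR_CONST"]
# ... and there is no SYMBOL token for `value` at all.
symbols <- terminals$text[terminals$token == "SYMBOL"]
```

Any usage check that only walks SYMBOL tokens will therefore report value as unused.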

radbasa commented 1 month ago

{cli} and {logger} depend on {glue} to parse objects in string templates.

Took a look inside {glue} to see if we can re-use internal functions. There are only two real glue functions: glue() and glue_data(), and glue() just calls glue_data(). All other glue_*() functions are glue transformers passed to glue() or glue_data(). glue_data() performs its magic by calling a C function, glue_. There is no intermediate step.

radbasa commented 1 month ago

The trick now is how to extract the text between { and } inside string constants:

code_1 <- "This {value} should {value_b} be valid."

code_2 <- "This {{value}} should not be extracted. Literal braces."

code_3 <- "This {
{
  value_a
  value_b
}
}
is valid.
"

code_4 <- "This { {
  value_a
  value_b
} }
is valid.
"

code_5 <- "This {
{
  value_a
  func(value_b)
}
}
should be valid.
"

Step 1. Extract text between { and }

all_text_between_braces <- stringr::str_match_all(code, "(\\{(?:\\{??[^\\{]*?\\}))")

Regex source

The text we want is in the second column of each result matrix, [, 2].
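A minimal sketch of Step 1, rewritten with base R's gregexpr()/regmatches() and PCRE so it runs without stringr. The helper name extract_braced() is hypothetical; the pattern is the one above with the outer capture group dropped, since regmatches() returns the full match.

```r
# Same pattern as in Step 1, without the outer capture group.
pattern <- "\\{(?:\\{??[^\\{]*?\\})"

# Hypothetical helper: all brace-delimited fragments in one string.
extract_braced <- function(x) {
  regmatches(x, gregexpr(pattern, x, perl = TRUE))[[1]]
}

code_1 <- "This {value} should {value_b} be valid."
code_2 <- "This {{value}} should not be extracted. Literal braces."

extract_braced(code_1)  # "{value}" "{value_b}"
extract_braced(code_2)  # "{{value}" (note the unbalanced braces)
```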

Step 2. Literal braces in glue strings should not be treated as R objects or code. The rest should be treated as R code.

code_2 returns {{value}. Notice the unbalanced braces.

Step 3. Parse extracted text as code.

xmlparsedata::xml_parse_data(parse(text = all_text_between_braces[[1]][, 2], keep.source = TRUE))

The parse() call inside will throw an error on unbalanced braces such as the {{value} extracted from code_2. We just catch this and ignore all text between {{ }}.
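The catch-and-ignore idea can be sketched as follows. This assumes base R's utils::getParseData() as a stand-in for xmlparsedata::xml_parse_data(), and parse_fragment() is a hypothetical helper name, not part of box.linters.

```r
# Hypothetical helper: parse one extracted fragment. Returns the parse
# data as a data frame, or NULL when the fragment is not valid R code
# (for example the unbalanced "{{value}" left over from literal braces).
parse_fragment <- function(fragment) {
  tryCatch(
    utils::getParseData(parse(text = fragment, keep.source = TRUE)),
    error = function(e) NULL
  )
}

valid <- parse_fragment("{value}")     # data frame of tokens
literal <- parse_fragment("{{value}")  # NULL, the parse error is swallowed
```

Returning NULL instead of re-raising keeps the decision about literal braces in one place: callers skip anything that did not parse.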

Step 4. We now have code in AST XML form. This we can work with.
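Putting the four steps together, a rough end-to-end sketch could look like this. It is not the box.linters implementation: objects_in_string() is a hypothetical helper, and it reads token names from base R parse data rather than querying the XML.

```r
# Hypothetical helper: names referenced inside glue-style placeholders.
objects_in_string <- function(x) {
  pattern <- "\\{(?:\\{??[^\\{]*?\\})"
  fragments <- regmatches(x, gregexpr(pattern, x, perl = TRUE))[[1]]
  used <- character(0)
  for (fragment in fragments) {
    tokens <- tryCatch(
      utils::getParseData(parse(text = fragment, keep.source = TRUE)),
      error = function(e) NULL
    )
    if (is.null(tokens)) next  # unparseable, e.g. literal "{{value}"
    keep <- tokens$token %in% c("SYMBOL", "SYMBOL_FUNCTION_CALL")
    used <- c(used, tokens$text[keep])
  }
  unique(used)
}

code_2 <- "This {{value}} should not be extracted. Literal braces."
code_5 <- "This {\n{\n  value_a\n  func(value_b)\n}\n}\nshould be valid.\n"

objects_in_string(code_2)  # nothing: the fragment fails to parse
objects_in_string(code_5)  # value_a, func and value_b
```

A linter could then treat every returned name as a use of the matching declared object or function.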