ropensci-review-tools / pkgstats

Historical statistics of every R package ever
https://docs.ropensci.org/pkgstats/
17 stars 1 forks source link

Counts of external calls may be inaccurate #61

Closed pawelru closed 2 months ago

pawelru commented 2 months ago

I have found the calculation of external calls quite inaccurate. Please find below a few examples.

Reading the source code I am aware that it's probably an imperfection of external dependent ctags. It might do well other things (like accounting internal vars) but for counting dependent packages I have found it not detecting occurrences which results in undercounting totals or not including some at all.

It's difficult to ask you to fix it and I'm just sharing my findings with you. Maybe you can consider alternative ways of getting what's needed? I'm aware of relatively new treesitter but I personally have no experience in it. Maybe checkglobals show here?

Checking a randomly selected teal repo that I'm working on:

(where x is defined as per the vignette)

I'm aware of some differences, e.g. you include R/ dir only while checkglobals look into all dirs (incl. tests, vignettes etc.) but nevertheless the latter seems to be more accurate to me. A few examples using imperfect regexp on my local clone just to show you why I think that the values are incorrect:

R//teal_lockfile.R: renv::snapshot( R//teal_lockfile.R: # type = is taken from renv::settings$snapshot.type()


- `shinyjs`
    - `checkglobals` reports 8 unique
    - `pkgstats` reports 2 unique
    - GH search (includes also occurences in docs) - [link](https://github.com/search?q=repo%3Ainsightsengineering%2Fteal+shinyjs%3A%3A+path%3A%2F%5ER%5C%2F%2F&type=code)
    - local search in `R/`

❯ grep -r "^[^#].*shinyjs::" R/

R//utils.R: shinyjs::runjs(script) # function does not return anything R//include_css_js.R: shinyjs::runjs(paste0(readLines(system.file("js", file, package = "teal", mustWork = TRUE)), collapse = "\n")) R//include_css_js.R: shinyjs::useShinyjs(), R//include_css_js.R: shinyjs::hidden(icon("fas fa-gear")), # add hidden icon to load font-awesome css for icons R//module_snapshot_manager.R: shinyjs::click(id = "teal-wunder_bar-show_snapshot_manager", asis = TRUE) R//module_init_data.R: shinyjs::showLog() R//module_init_data.R: shinyjs::disable(selector = sprintf(".teal-body:has('#%s') .nav li a", session$ns("content"))) R//module_init_data.R: shinyjs::enable(selector = sprintf(".teal-body:has('#%s') .nav li a", session$ns("content"))) R//module_init_data.R: shinyjs::hide( R//module_init_data.R: shinyjs::runjs(

mpadge commented 2 months ago

@pawelru First of all please note you are referring throughout to open-source software maintained willingly by nice people trying to do nice things. Asserting in an issue title that a package or function "is inaccurate*, and then claiming as your opening sentence,

I have found the calculation of external calls quite inaccurate

is neither nice nor respectful. I am tempted to close this issue for this reason alone, but fortunately for you am intrigued by what your report, and think it is worth further consideration here. In future, please do try to be more respectful towards voluntary open-source maintainers.


Some points at the outset:

  1. Counting function calls from static analysis is never perfectly accurate. The checkglobals package asserts that it "approximately detect[s] global and imported functions."
  2. pkgstats considers code from any and all possible locations, including /inst and /vignettes directories. contrary to what you say above.
  3. The general process by which these things are tagged is referred to as object-reference tagging. The only tags of interest here are actual function calls; there may be instances of references to objects which have the names of functions, but for which those references are not actual function calls. These should not be tagged. This distinction is made in more detail below.
  4. ctags has been maintained for years, with over 10,000 commits, and years of very detailed debugging and refining. (I have also extensively used tree-sitter, which I am confident would generate similar results to ctags, yet is less suited to the task here.)

On to your comparisons:

teal.reporter

The 4 lines tagged by checkglobals are:

# in teal_reporter.R
TealReportCard <- R6::R6Class( # nolint: object_name.
  classname = "TealReportCard",
  inherit = teal.reporter::ReportCard,...
# in teal_reporter.R
TealSlicesBlock <- R6::R6Class( # nolint: object_name_linter.
  classname = "TealSlicesBlock",
  inherit = teal.reporter:::TextBlock, ...
# in init.R
  reporter <- teal.reporter::Reporter$new()$set_id(attr(filter, "app_id"))
# in reporter_previewer_module.R
  checkmate::assert_true(all(names(server_args) %in% names(formals(teal.reporter::reporter_previewer_srv))))

Of these four, pkgstats only tags the single instance in init.R, which is clearly a call to a function. The first two references identified by checkglobals are only inheritance templates for an R6 class. These are not function calls, and aspects of the ineheritance may or may not even be called upon during code execution. ctags appropriately does not identify these as function calls. The final instance, in reporter_previewer_module.R, is also clearly not an actual call to a teal.reporter function, rather just a line used to extract the formal arguments to that function. Again, ctags appropriately excludes this, and ultimately from these four only identifies one as a function call. I entirely agree with that.

renv

This case is more interesting. The use of renv is in teal_lockfile.R:

    renv_logs <- utils::capture.output(
      renv::snapshot(
        lockfile = lockfile_path,
        prompt = FALSE,
        force = TRUE
        # type = is taken from renv::settings$snapshot.type()
      )
    )

ctags in this case entirely fails to detect the renv::snapshot call, yet picks up all other lines. I'm not sure why this may be, and may either represent a genuine tagging omission from ctags, or it may be that it appropriately identifies that function as being called there only for the side-effect of capturing output? Even I am unsure whether or not I would consider that a genuine function call.

shinyjs

checkglobals detects 8 calls to shinyjs, while pkgstats only detects two. These two are:

# in module_snapshot_manager.R
        shinyjs::click(id = "teal-wunder_bar-show_snapshot_manager", asis = TRUE)
# in module_init_data.R
    if (getOption("teal.show_js_log", default = FALSE)) {
      shinyjs::showLog()
    }

Those are obviously direct function calls, and should be tagged as such, which they are. The additional checkglobals output not tagged by ctags includes function calls from module_init_data.R and include_css_js.R. The latter file includes two shinyjs function names in this definition:

include_teal_css_js <- function() {
  tagList(
    shinyjs::useShinyjs(),
    include_css_files(),
    # init.js is executed from the server
    include_js_files(except = "init.js"),
    shinyjs::hidden(icon("fas fa-gear")), # add hidden icon to load font-awesome css for icons
  )
}

These are not tagged by ctags, which I suspect is appropriate here, as they are merely passed as arguments to a tagList() function. Those arguments then may or may not actually be called, so it is appropriate for a static code analyser to exclude those from identified function calls. The other call in that file is in these lines:

run_js_files <- function(files) {
  checkmate::assert_character(files, min.len = 1, any.missing = FALSE)
  lapply(files, function(file) {
    shinyjs::runjs(paste0(readLines(system.file("js", file, package = "teal", mustWork = TRUE)), collapse = "\n"))
  })
  invisible(NULL)
}

ctags does then entirely miss that definition, and only tags the call to lapply, without parsing any of the content of what is applied to each file. So in that case I would agree with you that ctags, and therefore pkgstats, does miss one external call there. I'll keep this issue open to investigate that further.


More generally, I hope that this has made you aware that identifying function calls is not a trivial exercise. Note in general that my tendency would be to place greater trust in anything that provides lower rather than higher estimates of numbers of function calls. Especially in a dynamically-typed and functional language like R, many references to functions will generally not be function calls, and should not be tagged as such. ctags here does indeed do a better job of identifying such instances than checkglobals. I'll report back on the one example above, which should also give insight into other discrepancies.

pawelru commented 2 months ago

Thanks @mpadge for a prompt and very detailed response - I really appreciate that. Especially that you actually went beyond into the referenced codebase and explain individual examples. Very much appreciated.

Let me begin by saying that I never intended to offend anyone in my previous message. My aim was to simply share some feedback on things that I found concerning. I believe that the constructive feedback is essential, especially in the context of open-sourcing. By this, I mean reports, not criticism. I am sorry if my words were misunderstood.

I totally agree with you that this is not a trivial problem. My background is that I am looking for a measures of a dependency of a package. My first idea was to measure the usage of the objects and I am currently experimenting with different tools. Your message makes me realise that this (ctags) is meant for something different - i.e. counting external function calls which in some cases makes a big difference - e.g. whether or not to include foo::bar. On that note, I think that ctags results are correct. As you explained in details. I was just looking for something different and unintentionally misunderstood the results.

mpadge commented 2 months ago

No worries @pawelru, and thank you for your considered response here. Good to hear that you generally understand what I was trying to set out in the detailed response above. The occasional missing function calls from ctags are nevertheless something I've been grappling with for a long time here, so i'll keep this issue open to dig a bit further here.

mpadge commented 2 months ago

@pawelru I'm going to close this now, but will continue to address the issue via #4, where I've already pinged you. Thanks for finally giving me the incentive to pursue that!

mpadge commented 2 months ago

@pawelru I've implemented tree-sitter parsing of external calls in #62, which currently gives these results for your package:

path <- "/<local>/<path>/<to>/teal"
has_tabs <- FALSE
pkg_name <- basename (path)

# The current pkgstats approach with 'ctags':
tags_r <- withr::with_dir (path, get_ctags ("R", has_tabs))
ex <- pkgstats:::external_call_network (tags_r, path, pkg_name)
ex <- ex [which (!ex$package == "base"), ]
nrow (ex)
#> [1] 495

# New tree-sitter parsing from code in #62:
tags_treesitter <- get_treesitter_tags (path)
ex <- pkgstats:::external_call_network (tags_treesitter, path, pkg_name)
ex <- ex [which (!ex$package == "base"), ]
nrow (ex)
#> [1] 395

ex <- data.frame (checkglobals::check_pkg (path))
ex <- ex [which (!ex$package == "base"), ]
nrow (ex)
#> [1] 159

Created on 2024-09-13 with reprex v2.1.1

That suggests current approach here is actually okay. The tree-sitter values may be presumed to provide a "true" reference. I'll investigate further over in #62

pawelru commented 2 months ago

Thank you for follow ups on this and apologies for a delayed answer. I was on a short break and right afterwards had to focus on something different. Now I am back into this and it looking very very promising. Thanks again. Even though the totals are somewhat similar I actually have found some differences. Please find below some top picks. For each of these, I have found treesitter to be closer to the truth.

Please find below a few examples that I was able to check quickly. I'm using the updated local copy of a referenced repo so the overall numbers might be slightly different to yours but the difference is rather minimal (e.g. I have 511 and you got 495). I'm yet to check it against checkglobals. This was my starting point - to check the top picks

r$> ex_r |> group_by(package, call) |> summarise(N = n()) |> arrange(-N)
`summarise()` has grouped output by 'package'. You can override using the `.groups` argument.
# A tibble: 269 × 3
# Groups:   package [22]
   package call        N
   <chr>   <chr>   <int>
 1 base    list       69
 2 base    c          67
 3 utils   data       56
 4 teal    modules    51
 5 base    lapply     36
 6 base    names      27
 7 base    args       25
 8 base    sprintf    21
 9 base    class      20
10 base    if         20
# ℹ 259 more rows
# ℹ Use `print(n = ...)` to see more rows
r$> ex_t |> group_by(package, call) |> summarise(N = n()) |> arrange(-N)
`summarise()` has grouped output by 'package'. You can override using the `.groups` argument.
# A tibble: 343 × 3
# Groups:   package [26]
   package call          N
   <chr>   <chr>     <int>
 1 base    c           103
 2 base    sprintf      73
 3 base    length       63
 4 base    list         62
 5 logger  log_debug    57
 6 base    names        55
 7 base    inherits     48
 8 base    lapply       47
 9 base    is.null      41
10 base    attr         36
# ℹ 333 more rows
# ℹ Use `print(n = ...)` to see more rows

These are just top diffs that I checked. Even though the sum is comparable - there are actually some differences - both positive and negative. Now I don't want you to go into the details - that's not the point of my message. What I just wanted to say here is that based on that brief analysis, the treesitter approach looks very very promising!

mpadge commented 1 month ago

Thanks @pawelru for pursuing this further. I've implemented a full treesitter function tagger in currently experiment package, pkgsimil (in even more efficient form). You just need to install that package, and run:

pkgsimil::pkgsimil_tag_fns(path)

Just note that things are likely to change quickly in that package, including function names and potentially name of package. But the underlying algorithm works really well, and identifies every single function call.

That will eventually find it's way over here, but doing so will require rebuilding our entire CRAN database to update function call data. So that will have to wait a while ...