ropensci-review-tools / pkgstats

Historical statistics of every R package ever
https://docs.ropensci.org/pkgstats/

Ditch cloc and use new C++ whitespace fn #8

Closed by mpadge 3 years ago

mpadge commented 3 years ago

The whitespace fn effectively does the same thing, plus a bit more, and is way faster (roughly 100x by median time in the benchmark below):

path <- here::here () # in a package
bench::mark (s1 <- cloc_stats (path),
             s2 <- whitespace_stats (path),
             check = FALSE)
#> # A tibble: 2 x 6
#>   expression                        min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>                   <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 s1 <- cloc_stats(path)       410.53ms 415.19ms      2.41        NA     0   
#> 2 s2 <- whitespace_stats(path)   3.77ms   3.83ms    251.          NA     2.07

Created on 2021-06-11 by the reprex package (v2.0.0.9000)

Then we don't have to worry about the non-CRAN status of cloc, so this package could go straight to CRAN :rocket:
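
For readers unfamiliar with what a whitespace-counting pass involves, here is a minimal sketch of the general idea in C++. It is not the actual pkgstats implementation: the `LineStats` struct, its field names, and the `count_lines` function are all hypothetical, and the real function reports more than this.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical result type; field names are illustrative only and are
// not taken from pkgstats.
struct LineStats {
    std::size_t total = 0;      // all lines seen
    std::size_t blank = 0;      // lines that are empty or whitespace-only
    std::size_t leading_ws = 0; // leading whitespace chars on non-blank lines
};

// Scan a block of text line by line and tally the counts above.
// '\r' is treated as whitespace so that empty lines in CRLF files
// still count as blank.
inline LineStats count_lines(const std::string &text) {
    LineStats s;
    std::istringstream in(text);
    std::string line;
    while (std::getline(in, line)) {
        ++s.total;
        const std::size_t first = line.find_first_not_of(" \t\r");
        if (first == std::string::npos) {
            ++s.blank;              // nothing but whitespace on this line
        } else {
            s.leading_ws += first;  // index of first non-whitespace char
        }
    }
    return s;
}
```

A single pass like this needs no external process or language detection, which is one plausible reason a compiled counter can be much faster than shelling out to a separate tool.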

mpadge commented 3 years ago

cloc uses this language definition-by-extension scheme, and identifies comments via the schema that starts here.
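
As a rough illustration of a definition-by-extension scheme, the sketch below maps a file extension to a line-comment prefix and uses it to classify lines. The table contents and the names `comment_prefixes`, `file_extension`, and `is_comment_line` are assumptions made up for this example; cloc's real definitions cover far more languages and also handle block comments, which are ignored here.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical, deliberately tiny table: file extension -> line-comment
// prefix. Not taken from cloc, tokei, or scc.
inline const std::map<std::string, std::string> &comment_prefixes() {
    static const std::map<std::string, std::string> table = {
        {"R", "#"}, {"py", "#"}, {"cpp", "//"}, {"c", "//"}, {"h", "//"}};
    return table;
}

// Return the text after the last '.' in a path, or "" if there is no dot.
// (A dot in a directory name would confuse this; fine for a sketch.)
inline std::string file_extension(const std::string &path) {
    const std::size_t dot = path.find_last_of('.');
    return dot == std::string::npos ? "" : path.substr(dot + 1);
}

// True when the first non-whitespace characters of `line` are the
// line-comment prefix registered for the file's extension.
inline bool is_comment_line(const std::string &path, const std::string &line) {
    const auto it = comment_prefixes().find(file_extension(path));
    if (it == comment_prefixes().end()) return false;  // unknown language
    const std::size_t first = line.find_first_not_of(" \t");
    if (first == std::string::npos) return false;      // blank line
    return line.compare(first, it->second.size(), it->second) == 0;
}
```

Under these assumptions, `is_comment_line("R/foo.R", "  # note")` is true, while a trailing comment after code, as in `"int x; // c"`, is not counted as a comment line.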

mpadge commented 3 years ago

TODO: Replace internal R/file-types-dict.R with this JSON equivalent from tokei, or this almost equivalent one from scc.