Closed assignUser closed 2 years ago
Checking code duplication is in principle a good idea, but it can only safely be assessed through subjective, personal review. This package is a really good example: it is full of functions which all follow the same basic template, yet which all have to be implemented independently to provide the individual plug-in tests. Running {dupree} gives this:
setwd ("/data/mega/code/repos/ropensci-review-tools/pkgcheck")
library (dupree)
dupree_package ()
#> # A tibble: 48 × 7
#> file_a file_b block_a block_b line_a line_b score
#> <chr> <chr> <int> <int> <int> <int> <dbl>
#> 1 ./R/github.R ./R/github… 21 34 99 128 0.756
#> 2 ./R/pkgcheck-methods.R ./R/summar… 19 14 293 63 0.689
#> 3 ./R/pkgcheck-methods.R ./R/pkgche… 17 18 224 252 0.590
#> 4 ./R/check-covr.R ./R/check-… 1 7 2 29 0.484
#> 5 ./R/checks-goodpractice.R ./R/checks… 42 44 277 393 0.474
#> 6 ./R/pkgcheck-fn.R ./R/summar… 34 14 215 63 0.453
#> 7 ./R/check-pkgname-available.R ./R/checks… 7 14 29 96 0.420
#> 8 ./R/check-fns-have-exs.R ./R/checks… 9 14 46 96 0.410
#> 9 ./R/check-fns-have-exs.R ./R/check-… 9 16 46 29 0.403
#> 10 ./R/checks-goodpractice.R ./R/checks… 43 44 312 393 0.382
#> # … with 38 more rows
Created on 2022-01-21 by the reprex package (v2.0.1.9000)
The first result has over 70% duplication (a score of 0.756) for the code blocks starting at these two lines: https://github.com/ropensci-review-tools/pkgcheck/blob/60958ce33dd82f4a8a69229afd5160304599d3d5/R/github.R#L99 and https://github.com/ropensci-review-tools/pkgcheck/blob/60958ce33dd82f4a8a69229afd5160304599d3d5/R/github.R#L128 Any way of deriving whole-package metrics from those scores would rate this package's duplication as very high, yet in this case that is expected and perfectly okay. Code duplication always requires careful, subjective judgement. But thanks for the suggestion regardless.
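To make the point about whole-package metrics concrete, here is a minimal sketch using only the ten scores visible in the {dupree} output above. The two summaries (a mean score and a count above a 0.5 cut-off) are hypothetical examples of such a metric, not anything {dupree} or pkgcheck computes, and the 0.5 threshold is an arbitrary assumption.

```r
# Scores copied by hand from the first ten rows of the dupree output above;
# the remaining 38 rows are not shown there, so they are not included here.
scores <- c (0.756, 0.689, 0.590, 0.484, 0.474,
             0.453, 0.420, 0.410, 0.403, 0.382)

# Hypothetical whole-package summaries (assumptions for illustration only):
mean_score <- mean (scores)       # average similarity of the top ten pairs
n_high <- sum (scores > 0.5)      # pairs above an arbitrary 0.5 cut-off

print (mean_score)
print (n_high)
```

Either number would flag the package, even though the underlying similarity comes from deliberately templated check functions, which is why the scores still need to be read by a person.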
{dupree} works quite well, though I am not sure if it is still actively maintained.