Closed maelle closed 3 years ago
also when there is no dependency of a given type could it be a NULL instead of a NA?
Short answer: No and no. These things have to be returned as single character strings to (easily) enable data from multiple packages to be combined with simple rbind
. That could be done with I(as.list(...))
, but (1) gets messy; and (2) with 120,000 rows and now > 90 columns, the increase in memory footprint would be considerable. For the same reason of rbind
-ability, no values can be empty (NULL
or character(0)
), but have to have non-zero size, hence NA
is necessary. Does that make sense?
yep, so this parsing should happen in pkgcheck, correct?
Yes, correct. Another reason they're like that is because it's the direct output of data.frame(read.dcf("DESCRIPTION"))$Imports)
, and pkgstats
tries to get information out as efficiently as possible, with no unnecessary post-processing. This is the kind of price such post-processing has to pay:
bench::mark (
read.dcf("DESCRIPTION")[1,"Imports"],
data.frame(read.dcf("DESCRIPTION"))$Imports,
desc::desc_get_deps("."),
check = FALSE)
#> # A tibble: 3 × 6
#> expression min median `itr/sec`
#> <bch:expr> <bch:tm> <bch:tm> <dbl>
#> 1 read.dcf("DESCRIPTION")[1, "Imports"] 245.75µs 272.87µs 3513.
#> 2 data.frame(read.dcf("DESCRIPTION"))$Imports 365.14µs 402.42µs 2269.
#> 3 desc::desc_get_deps(".") 5.43ms 6.78ms 134.
#> # … with 2 more variables: mem_alloc <bch:byt>, gc/sec <dbl>
Created on 2021-09-24 by the reprex package (v2.0.1.9000)
Extracting in current form is over 1,000 times more efficient!
Moved to pkgcheck
because dependencies can indeed be parsed here. The main object returned by pkgcheck
currently does not have a list of dependencies, but that would be both good and important to include. @maelle My suggestion will be to add an extra item to the main pgkcheck()
output called "description"
, that has parsed values of the main pkgstats()$desc
output:
library (pkgstats)
path <- "/<path>/<to>/lme4_1.1-26.tar.gz"
s <- pkgstats (path)
s$desc
#> package version date license urls
#> 1 lme4 1.1-26 2020-12-01 00:50:10 GPL (>= 2) https://github.com/lme4/lme4/
#> bugs aut ctb fnd rev ths trl depends
#> 1 https://github.com/lme4/lme4/issues 4 9 0 0 0 0 Matrix
#> imports
#> 1 graphics, grid, splines, utils, parallel, MASS, lattice, boot, nlme, minqa, nloptr, statmod
#> suggests
#> 1 knitr, rmarkdown, PKPDmodels, MEMSS, testthat, ggplot2, mlmRev, optimx, gamm4, pbkrtest, HSAUR3, numDeriv, car, dfoptim, mgcv
#> linking_to
#> 1 Rcpp, RcppEigen
Created on 2021-09-28 by the reprex package (v2.0.1.9000)
That info is not part of any actual check, but is pretty important meta-information that can and should be stored with the pkgcheck
output regardless.
Or it could be a desc object? (not sure it's a good suggestion at all).
pkgcheck()$desc
now gives the same result as desc::desc_get_deps()
, but without the version numbers, so just 2 columns of type
and package
. This information is ignored in summary
and print
methods, but is in the return object for anyone who wants access to it.
I.e. each column would be not a string but instead a list to make it easier to retrieve the vector of deps.