pharmaR / riskmetric

Metrics to evaluate the risk of R packages
https://pharmar.github.io/riskmetric/

pkg_score() results differ for pkg_install() and pkg_source() #315

Open paulie-of-punskas opened 12 months ago

paulie-of-punskas commented 12 months ago

Hello. I have noticed that pkg_score() returns different results when run on a pkg_source() ref versus a pkg_install() ref. I tested this with riskmetric 0.2.3 on the askpass 1.1, dplyr 1.0.5 and openssl 1.4.3 packages. Shouldn't the results be equal? If not, which reference should be used for assessing package risk?

Reproducible code:

# === load riskmetric library
library("riskmetric")
library("magrittr")

# === create testing environment
dir.create(paste0(tempdir(), "/test_riskmetric"))
dir.create(paste0(tempdir(), "/test_riskmetric/source"))
dir.create(paste0(tempdir(), "/test_riskmetric/library"))
dir.create(paste0(tempdir(), "/test_riskmetric/downloads"))

# === download and unpack files
pkgs <- c("askpass", "dplyr", "openssl")
download.packages(pkgs, destdir = paste0(tempdir(), "/test_riskmetric/downloads"))

lapply(list.files(paste0(tempdir(), "/test_riskmetric/downloads"), full.names = TRUE),
       untar, 
       exdir = paste0(tempdir(), "/test_riskmetric/source"))

# === install packages
install.packages(pkgs, lib = paste0(tempdir(), "/test_riskmetric/library"))

# === get scores
dplyr_src <- pkg_ref(x = paste0(tempdir(), "/test_riskmetric/source/dplyr")) %>% 
  pkg_assess(assessments = riskmetric::all_assessments()) %>% 
  pkg_score() %>%
  unlist()

dplyr_lib <- pkg_ref(x = "dplyr", source = "pkg_install", lib.loc = paste0(tempdir(), "/test_riskmetric/library")) %>% 
  pkg_assess(assessments = riskmetric::all_assessments()) %>% 
  pkg_score() %>% 
  unlist()

Below you can see the differences in metrics: [screenshot of the per-metric scores from both refs]
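The same comparison can also be printed as text from the two vectors created above; here is a sketch (NA marks a metric that only one of the two sources returned):

# === sketch: line the two score vectors up per metric
all_metrics <- union(names(dplyr_src), names(dplyr_lib))
data.frame(
  metric      = all_metrics,
  pkg_source  = dplyr_src[all_metrics],
  pkg_install = dplyr_lib[all_metrics],
  row.names   = NULL
)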

Thanks and greetings.

emilliman5 commented 11 months ago

The results should not be equal. As for which reference to use, that depends on you and your use case.

1) Not all assessments/metrics are available for all ref sources. This is by design, as not all information is available from every source (e.g. unit tests are not available in installed packages, so there is no way to run code coverage for an installed package). That said, we are working toward implementing as many assessments/metrics for as many sources as possible as the package matures. We are even discussing/designing chaining sources together to create as complete a score as possible. (A rough way to see which assessments fail for a given ref is sketched after this list.)

2) There are small discrepancies in scores when computing from different sources. So far these have been negligible, so we have backlogged this issue for now. Between the source code and the installed package there are some things R does to "compile" the package that I haven't yet fully investigated.
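For example, to see which assessments could not be computed for a given ref, one rough check is to look for error-classed results in the assessment list. This is only a sketch: the "pkg_metric_error" class name is an assumption about how riskmetric tags failed assessments, so confirm it against your installed version.

# === sketch: flag assessments that errored for the installed-library ref
# "pkg_metric_error" is assumed to be the class riskmetric attaches to
# failed assessments -- verify against your riskmetric version
lib_assess <- pkg_ref(x = "dplyr", source = "pkg_install",
                      lib.loc = paste0(tempdir(), "/test_riskmetric/library")) %>%
  pkg_assess(assessments = riskmetric::all_assessments())

sapply(lib_assess, inherits, what = "pkg_metric_error")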