andyofsmeg closed this issue 3 years ago
Copying over my reply from our email exchange (with some slight edits for further detail):
This is something we're aware of and working on. It's really only ever an issue when you're assessing risk on a remote package (e.g. on CRAN, without the source code or an installed version of the package available as the focus of the assessment).
The short answer is that we'll probably have different ways of aggregating results across operating systems, which might give an indication of how OS-agnostic the package is as a criterion for robustness. This is effectively a separate metric ("Does R CMD check pass on my OS?" vs "Is R CMD check expected to pass on an arbitrary OS?").
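To make the distinction between the two questions concrete, here is a minimal sketch in Python (not riskmetric's R API). The flavor names, the status labels, the function names, and the choice to count only "OK" as passing are all assumptions made for illustration.

```python
# Hypothetical check results keyed by OS flavor; names and statuses are made up.
results = {
    "linux-x86_64": "OK",
    "windows-x86_64": "WARNING",
    "macos-arm64": "ERROR",
}


def passes_on(flavor, results):
    """'Does R CMD check pass on my OS?': look at a single flavor only."""
    return results.get(flavor) == "OK"


def os_agnostic_score(results):
    """'Is R CMD check expected to pass on an arbitrary OS?':
    the fraction of flavors whose status is OK (an assumed scoring rule)."""
    if not results:
        return 0.0
    return sum(status == "OK" for status in results.values()) / len(results)


print(passes_on("linux-x86_64", results))
print(os_agnostic_score(results))
```

The first function answers the question for one known environment, while the second summarizes all flavors into a single number. A package can pass on your OS and still score poorly on the second.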
The long answer is that you want to be more concrete when assessing risk. Although we provide the option to assess risk based only on info from the web, it's not ideal. Instead, it's preferable to install the package on similar OS/hardware to perform the assessment (better still if you have the full source code available). This way you can be sure that you're assessing risk as it applies to the OS/hardware that you intend to use. In this situation, the concern disappears, since you know precisely which software & hardware stack is relevant, and riskmetric will (soon) be able to run the full R CMD check as part of the assessment, capturing its output and using that to construct more fine-grained metrics for evaluation.
See @emilliman5's recent work and related discussions around this exact issue in #137
Correct, PR #170 addresses this request. For CRAN and BioC remotes, we tally the Errors, Warnings, OKs, and Notes across each repository's set of OS flavors (11 for CRAN and 3 for BioC). Furthermore, to Doug's second point, we have also implemented (#169) a local R CMD check assessment, this time tallying the number of Errors, Warnings, and Notes from devtools::check.
The R CMD check metric PRs have been merged into the master branch, so I think we can close this issue.
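For readers who want a feel for the tallying described above, here is a rough sketch in Python. It is not the code from #170 or #169. The status labels, the `tally_statuses` helper, and the sample data are hypothetical.

```python
from collections import Counter

# Assumed status labels; the real repositories may spell these differently.
STATUSES = ("ERROR", "WARNING", "NOTE", "OK")


def tally_statuses(flavor_statuses):
    """Count how many OS flavors reported each check status."""
    counts = Counter(status.upper() for status in flavor_statuses)
    # Report every known status, including those with a count of zero.
    return {status: counts.get(status, 0) for status in STATUSES}


# Made-up statuses for six flavors of one package.
example = ["OK", "OK", "NOTE", "WARNING", "ERROR", "OK"]
print(tally_statuses(example))
# prints {'ERROR': 1, 'WARNING': 1, 'NOTE': 1, 'OK': 3}
```

Keeping the zero counts in the result means every package yields the same set of fields, which makes the tallies easier to compare across packages.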
When a package is on CRAN, its check status can differ by operating system, e.g. passing on some, warning on others, and failing on a few. Is there a way of capturing this in the risk score?
Question posed at R/Pharma by Jonathan Sidi