pharmaR / regulatory-r-repo-wg

Package consensus for regulated industries
https://pharmar.github.io/regulatory-r-repo-wg

R package validation framework approach #29

Open kkmann opened 1 year ago

kkmann commented 1 year ago

I found this approach interesting as well for creating documented evidence for individual packages https://cloud.rstudio.com/resources/rstudioglobal-2021/r-package-validation-framework/.

Repo looks a bit dead though: https://github.com/phuse-org/valtools

Tagging @thebioengineer in case he is interested in sharing some experiences.

borgmaan commented 1 year ago

I helped build an internal system quite similar to valtools that leveraged many of the key concepts from the R Package Validation Group's whitepaper. Automatically generated reports that compile dev environment details, software requirements, test cases & scripts, and execution results were really useful for internal package validation.

I think the requirements & matched tests will be the tricky piece to scale this out to a broad set of externally developed packages.
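To make the requirement/test-linking point concrete, here is a minimal base-R sketch (not the internal system described above; all names are hypothetical) of a traceability matrix joining requirements to the tests that evidence them:

```r
# Hypothetical requirements for a function under validation
requirements <- data.frame(
  req_id      = c("REQ-1", "REQ-2"),
  description = c("mean() returns the arithmetic mean",
                  "mean() drops NAs when na.rm = TRUE"),
  stringsAsFactors = FALSE
)

# Each test records which requirement it covers and whether it passed
tests <- data.frame(
  test_id = c("T-1", "T-2"),
  req_id  = c("REQ-1", "REQ-2"),
  passed  = c(isTRUE(all.equal(mean(c(1, 2, 3)), 2)),
              isTRUE(all.equal(mean(c(1, NA), na.rm = TRUE), 1))),
  stringsAsFactors = FALSE
)

# The traceability matrix: which requirements have passing evidence
trace <- merge(requirements, tests, by = "req_id")
trace[, c("req_id", "test_id", "passed")]
```

Maintaining this mapping by hand is exactly the part that does not scale to externally developed packages, which is why automated linking (as covtracer attempts) is attractive.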

I agree some mechanism for auto-generating evidence is ideal.

This is another approach I have seen to auto-generating reports. It looks like @dgkf is a contributor there and may have some additional insights (especially around the req/test linking and their work with covtracer).

dgkf commented 1 year ago

I'm always happy to talk about covtracer :smile: But before we get to that, I think this is one consideration where it's especially important to make sure that we're addressing a regulator-driven need.

I have a feeling that the traceability matrix may be a holdover from processes that were used historically for quality assurance of bespoke in-house analytic systems. If we migrate to publicly available packages that are much easier to inspect, I wonder whether this need still exists. In its most minimal form, perhaps it's fine if we just intuitively know that "behaving as described" is the requirement of any R package, and that adequate testing is needed.

It would be great to get some regulator insight into how they see the value of such documentation, especially if it comes with high computational and maintenance overhead.

thebioengineer commented 1 year ago

Hey All, happy to talk about package validation approaches, etc. I wrote valtools while I was at my previous employer, to support the approach we were taking there and the R Package Validation White Paper. I have since moved on, but I think the ideas behind it are still relevant: providing evidence that the package behaves as expected should be the approach, and that evidence is why a package should be trusted. How you go about doing that, and to what level, depends on what your organization has committed to doing to generate that proof of trust.

kkmann commented 1 year ago

Love the work around covtracer and thevalidatoR, although the core issue is audience: I look at this through a statistician's eyes, and I am not so much interested in unit testing as in structural testing. If we want a human element in this, we need to make it at least bearable, if not enjoyable, to read structural-testing documentation. I am thinking along the lines of unit-test coverage + manual inspection of a subset of unit tests (basic QC to avoid coverage hacking) + structural testing more akin to vignettes with testthat expectations (a vignette-first approach). If I understand it correctly, the valtools approach is to add meta-information to testthat expectations and render reports from that (a test-first approach). @thebioengineer did you ever get in touch with the people behind testthat to explore options for adding these annotation features to testthat? Also tagging @fpahlke (rpact) and @JimRogersMetrum.
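A minimal sketch of what a vignette-first chunk could look like (the example data and narrative are invented for illustration; this is not valtools code): the prose documents the expected behaviour, and testthat expectations interleaved with it mean that rendering the vignette is itself a structural test run.

```r
library(testthat)

# Narrative (as it would read in the vignette): "filtering keeps
# only the rows whose value exceeds the threshold of 10".
dat  <- data.frame(id = 1:4, value = c(5, 12, 8, 20))
kept <- dat[dat$value > 10, ]

# The expectations double as documented evidence: if behaviour
# drifts, the vignette fails to render and no evidence is produced.
expect_equal(kept$id, c(2L, 4L))
expect_true(all(kept$value > 10))
```

In an R Markdown vignette these lines would sit in an evaluated chunk, so the rendered document shows the narrative, the code, and the (passing) expectations side by side.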

kkmann commented 1 year ago

another great resource https://youtu.be/JkKJojVYBXM

kkmann commented 1 year ago

and https://resources.rstudio.com/resources/rstudioglobal-2021/monitoring-health-and-impact-of-open-source-projects/