pharmaR / regulatory-r-repo-wg

Package consensus for regulated industries
https://pharmar.github.io/regulatory-r-repo-wg

Technical framework #28

Status: Open · kkmann opened this issue 1 year ago

kkmann commented 1 year ago

Dear all,

I wanted to open a space to consolidate the discussion on the technical framework a bit.

  1. This repo's name suggests that some sort of 'repository' is the aim. I worry about this being a fairly complex solution to the problem of providing evidence of the adequacy of (statistical) packages. This solution would also, to some extent, depend on results from the R Consortium repositories WG, since we would likely want to implement AT LEAST CRAN standards. If we want to go down this route, https://ropensci.org/r-universe/ is a very interesting initiative and we might want to connect with them.
  2. We could also try a more or less 'serverless' approach by just curating a list of R packages (in pak or renv format?) with associated validation reports (sources and reports built via GitHub Actions) in a simple git repository. I implemented something like this for a single package ages ago here: https://github.com/kkmann/adoptr-validation-report (don't judge x) ). A rough sketch of how such evidence could be collected per package follows this list.
  3. Does it make sense to think about integration with Posit Package Manager already? (Especially pinned, curated CRAN sources: https://docs.posit.co/rspm/admin/repositories/#sources, https://docs.posit.co/rspm/admin/appendix/source-details/#curated-cran-source.) Maybe running an instance of rspm could be another way of providing a CRAN-like repo.
  4. Another way to go is a really tightly integrated service like Bioconductor.
  5. Comments on validR https://www.mango-solutions.com/products/validr/
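
To make (2) a bit more concrete, here is a purely illustrative sketch of a CI script (e.g. run via GitHub Actions) that walks a curated plain-text list of packages and collects basic validation evidence for each one. The file names, the curated list, and the report format are made up; only rcmdcheck and covr are real packages.

```r
# Illustrative sketch only: collect R CMD check results and test coverage
# for each package in a curated list, and store them as per-package reports.
library(rcmdcheck)
library(covr)

curated <- readLines("curated-packages.txt")   # one package name per line (made-up file)
dir.create("reports", showWarnings = FALSE)

for (pkg in curated) {
  # fetch the current source tarball from the configured CRAN mirror
  tarball <- download.packages(pkg, destdir = tempdir(), type = "source")[, 2]

  # run R CMD check on the tarball and keep the outcome
  check <- rcmdcheck::rcmdcheck(tarball, args = "--as-cran", error_on = "never")

  # unpack the sources and measure unit test coverage
  untar(tarball, exdir = tempdir())
  cov <- covr::package_coverage(file.path(tempdir(), pkg))

  evidence <- list(
    package  = pkg,
    errors   = length(check$errors),
    warnings = length(check$warnings),
    notes    = length(check$notes),
    coverage = covr::percent_coverage(cov)
  )
  saveRDS(evidence, file.path("reports", paste0(pkg, ".rds")))
}
```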
borgmaan commented 1 year ago

Building on the spirit of 5) above, another commercial solution that may warrant comments/comparison is MPN Pro, which includes:

Comprehensive package documentation on approved package sets within the Metworx environment aligned with a release schedule approved by client.

Metrum Package Network

dgkf commented 1 year ago

This repo's name suggests that some sort of 'repository' is the aim. I worry about this being a fairly complex solution [...] We could also try a more or less 'serverless' approach by just curating a list of R packages

I think this will be a good conversation for when we meet with the R Validation Hub Exec.

Regarding a curated cohort of packages, I think this approach has historically hit a few roadblocks:

  1. Each company might have their own thresholds for quality, which may depend on the profile/phase/risk of a study.
  2. You quickly end up in a situation where you need to endorse a style of programming (eg, the "tidyverse" way vs the "base" way is a classic example, though decisions would need to be made any time there are competing implementations or packages that overlap with slightly different use cases).
  3. It's hard to avoid appearances of favoritism (eg, I think there are 4 or 5 different packages for making submission tables, many as a product of an individual pharma org)

Not to say it can't be a first POC, but those hurdles are why the project launched with the anticipation of providing something less opinionated.

RE: Repo services (ppm, MPN, R-Universe)

I think we could leverage any of these to start. My inclination is toward R-Universe, especially for prototyping, just because it makes it easy to separate assessment and distribution.
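
For illustration only (the universe URL below is hypothetical; nothing has been set up): consuming such a repo would not require any special tooling on the user's side, because the assessment happens upstream of distribution.

```r
# Hypothetical: if the working group published a curated universe, users could
# opt in by prepending it to their repos option; packages not in the curated
# set would still fall through to CRAN.
options(repos = c(
  curated = "https://pharmar.r-universe.dev",   # made-up / illustrative URL
  CRAN    = "https://cloud.r-project.org"
))
install.packages("mmrm")   # resolved from the curated universe if available there
```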

Once we have a POC where we feel we have adequately defined how we want to communicate quality, then I think next steps would be to work with the R Consortium Repositories WG to understand whether these capabilities are in scope for what they're trying to deliver.

kkmann commented 1 year ago

@dgkf, interesting. How can it be harder to agree on a list of packages than to provision an entire repository? To have any chance of solving the "validation issue", the repository would also have to have entry hurdles of some sort, so by definition it is also a curated list of packages (just with infrastructure to provision them).

Unless it is only about providing a 'risk score' that each user then thresholds on. But that would again lead to a very fractured approach and to the problem of setting said threshold. This seems very close to the approach initially proposed by riskmetric. In that case, I would again wonder why a repository is necessary - one would only need an API like metacran to query metrics and use that in, e.g., a custom rspm setup to filter CRAN.

My main concern with such 100% technical/automated approaches is that they do not really guarantee a high degree of confidence in the correctness of more intricate stats packages. Everyone can get 100% unit-test coverage, but that does not mean the results are correct.
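For context, the fully automated route being questioned here is roughly the workflow riskmetric already documents. A sketch follows; the package names are arbitrary examples, and how (or whether) to collapse the per-metric scores into a single threshold is exactly the open question raised above.

```r
# Roughly the riskmetric workflow: derive metric-based scores for a set of
# packages (here, base packages that are always installed).
library(riskmetric)

scores <- pkg_ref(c("utils", "tools", "stats")) |>
  pkg_assess() |>
  pkg_score()

scores
# Each row holds per-metric scores for one package. A company-specific
# threshold on some aggregate of these could be used to filter a CRAN mirror
# (e.g. in a custom rspm setup), but passing such metrics says little about
# the statistical correctness of the results.
```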

I would hope that we can exclude as many "style"-related issues as possible and also allow competing solutions to the same problem. My understanding is that we want to make it a breeze for regulated companies to demonstrate quality of packages in accordance with the regulatory requirements. As long as there is someone willing to peer review (assuming that ends up in the entry criteria), it does not hurt to have 10 table generation frameworks or multiple mmrm packages in the collection/repo. I would avoid intentionally opinionating this (it makes sense for pharmaverse, but not so much here). Selection would rather happen via the (time) cost of passing the entry hurdles and thus, indirectly, via interest.

borgmaan commented 1 year ago

My understanding is that we want to make it a breeze for regulated companies to demonstrate quality of packages in accordance with the regulatory requirements. As long as there is someone willing to peer review (assuming that ends up in the entry criteria), it does not hurt to have 10 table generation frameworks or multiple mmrm packages in the collection/repo.

I think this touches on many points that I align with. I also think it brings to the forefront a key decision -- should the process include human-in-the-loop peer review? As @kkmann mentions, passing a series of checks does not indicate accuracy. I have previously heard this discussed as the difference between qualification and validation (apologies if this is off... I am still new to this area).

Are we hoping to provide a qualified bundle of packages that pass a series of quality-related automated tests/checks? Or are we hoping to also make some claims about the accuracy of those packages? Or somewhere in between/both? I think any and all are good goals, but what is our focus here?

dgkf commented 1 year ago

We might be putting the cart before the horse here. Ultimately I don't think we should be deciding, for example, whether human-in-the-loop peer review of statistical methods is a necessary precondition.

Instead of speculating about what the right approach is, I think we should be considering how we get answers to these questions.

I think any and all are good goals, but what is our focus here?

Is the goal different across these suggestions? As I'm reading it, the goal feels rather uniform - to provide trusted software for R submissions.

It might feel disjointed because "trust" is a hard thing to nail down, and we might have different opinions about what and how evidence of trust is provided. With that in mind, I see the first actionable steps as being to engage health authorities with a few of these concepts to understand what they're looking for. Let's not get too bogged down in the "how" until we have a better read on the "what" and the "why".

kkmann commented 1 year ago

First of all, great that we are getting the discussion going :)

To me, it would be very much in scope to come up with a minimal consensus on what most stakeholders (incl. reg) deem necessary to establish quality. Ofc, everyone is free to add on top of that. If we can come up with a completely automated way of satisfying the requirements - great, if not, also ok. In order to get the ball rolling, it might still be a good idea to work out a few concepts and highlight their respective trade-offs - exactly to keep the solution space wide in the beginning.

yannfeat commented 1 year ago

It might feel disjointed because "trust" is a hard thing to nail down, and we might have different opinions about what and how evidence of trust is provided. With that in mind, I see the first actionable steps as being to engage health authorities with a few of these concepts to understand what they're looking for. Let's not get too bogged down in the "how" until we have a better read on the "what" and the "why".

Regarding the "what", if I have correctly understood our visions, they all involve defining a framework to validate R packages at some point. The definition of our technical framework could thus start with the definition of a set of rules to satisfy different levels of validation (unit testing, integration testing, system testing, acceptance testing), depending on the purpose of the package (eg. producing regulatory tables, reporting adverse events...) and defined constraints (CDISC compliant? confirmatory study? ...)?

To me, it would be very much in scope to come up with a minimal consensus on what most stakeholders (incl. reg) deem necessary to establish quality. Ofc, everyone is free to add on top of that. If we can come up with a completely automated way of satisfying the requirements - great, if not, also ok. In order to get the ball rolling, it might still be a good idea to work out a few concepts and highlight their respective trade-offs - exactly to keep the solution space wide in the beginning.

Our goal would be to facilitate each level of validation as much as possible for sponsors, with acceptance by regulatory agencies as the main focus (the "why"). The framework could be written with this in mind. Then we would decide on ways to answer this question, whether it is a CRAN-like platform, a CI/CD methodology...