r-multiverse / help

Discussions, issues, and feedback for R-multiverse
https://r-multiverse.org

A snapshot-based model for production #78

Closed. wlandau closed this issue 1 week ago.

wlandau commented 3 months ago

In a meeting with @shikokuchuo and @jeroen today, @jeroen came up with a breakthrough model for production in R-multiverse. Here's how it goes:

Community

We keep the existing community universe at https://community.r-multiverse.org, where the latest releases of packages are guaranteed to be available. Uses "branch": "*release" in packages.json.
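For illustration, a Community entry in packages.json might look like the following. The package name and URL are made up, and the field layout assumes the R-universe registry format:

```json
[
  {
    "package": "example.package",
    "url": "https://github.com/example-owner/example.package",
    "branch": "*release"
  }
]
```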

Staging

We have a second universe at https://staging.r-multiverse.org where packages are staged for production but not necessarily in production yet. Staging uses "branch": SPECIFIC_REMOTE_SHA in packages.json to select specific versions of packages. Packages are automatically promoted from Community to Staging based on lightweight automated checks on the package metadata, e.g. the DESCRIPTION file has no Remotes: packages and the version numbers of releases increase monotonically.
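A hypothetical Staging counterpart of the same entry, pinned to a specific commit (all values made up):

```json
[
  {
    "package": "example.package",
    "url": "https://github.com/example-owner/example.package",
    "branch": "4f9c2e71a0b8d3c5e6f7a8b9c0d1e2f3a4b5c6d7"
  }
]
```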

Production

Production is not its own universe. Rather, it is a snapshot of the Staging universe. Quarterly snapshots of Staging will be downloaded from the R-universe snapshot API at https://staging.r-multiverse.org/apis and hosted from Netlify. These snapshots will include the sources and binaries of all the packages which pass R CMD check on R-release and R-oldrelease in the Staging universe. Users will be able to download packages from the latest snapshot using install.packages(repos = "https://production.r-multiverse.org"). Past snapshots will be archived in GitHub Container Registry.
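A sketch of what installation could look like for users; the package name is made up, and CRAN is shown as the fallback for dependencies not hosted by R-multiverse:

```r
# Hypothetical usage: install from the latest Production snapshot, falling
# back to CRAN for dependencies that R-multiverse does not host.
install.packages(
  "example.package",  # made-up package name
  repos = c(
    "https://production.r-multiverse.org",
    "https://cloud.r-project.org"
  )
)
```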

Phases

Since production will only update quarterly, maintainers will get few chances to get packages into production. To increase the chance of success, we will institute a "freeze" phase a month before each snapshot gets published. During the freeze, R-multiverse automation will avoid promoting new package versions to Staging except for packages which are failing checks. In other words, packages that are already healthy are frozen. Maintainers who really need to update them anyway will be able to manually submit pull requests to https://github.com/r-multiverse/staging to update the "branch" field of packages.json.


shikokuchuo commented 3 months ago

These snapshots will include the sources and binaries of all the packages which pass R CMD check on R-release and R-oldrelease in the Staging universe.

It gets a bit tricky if we consider reverse depends. If a strong dependency of a passing package is failing checks, I assume we still allow the passing package to enter Production? The justification would be that we allow CRAN dependencies (i.e. we are not replicating the entire dependency tree in Multiverse), and those would not be checked anyway. The failing package would also still be in Community.

wlandau commented 3 months ago

It gets a bit tricky if we consider reverse depends. If a strong dependency of a passing package is failing checks, I assume we still allow the passing package to enter Production?

I was actually thinking the opposite: require all dependencies to pass. (NB multiverse.internals already records an issue if an upstream dependency fails.) At first glance, anything less would seem to weaken guarantees for users.

Some sort of compromise could be helpful, but it's not obvious what would work. If we allowed packages whose dependencies fail, then we would have to get those packages from community, staging, or the previous snapshot, which could cause new bugs or incompatibilities.

Maybe instead we could approach this problem by making it as easy as possible to pass checks in staging. After all, the ideal case is to have as few failures as possible. This could mean retrying failed checks during the freeze, making it easy to reproduce checks in development, etc.

The justification would be that we allow CRAN dependencies (i.e. we are not replicating the entire dependency tree in Multiverse)

CRAN is a different model, but I think it's arguably still production.

The failing package would also still be in Community.

This would also be part of the compromise if we go with the strict route and remove packages whose dependencies fail.

shikokuchuo commented 3 months ago

I'm seeing 2 potential directions:

1. Complete distribution

All dependencies are pulled. This can be automated from the CRAN / Bioc mirror. A contribution PR would override any existing package that pulls from the mirror.

The snapshot image would then be self contained and no external repositories would be set when installing.

This is probably what it takes to make it truly useful as a ‘validated’ snapshot.

2. Collection of packages

Packages that pass checks. Installing packages would depend on CRAN / BioC [and Community].

The following take on revdeps is optional (we could take a strict view instead), but it seems most consistent with the 'listing of packages' approach:

[ On revdeps: checks will inevitably differ between Multiverse and CRAN, sometimes just because of the grace periods available on CRAN, or because certain packages are ‘too big to fail’ on CRAN. A package can fail checks in staging and remain on CRAN. There is then this fundamental inconsistency if we remove such packages, just because they exist on Multiverse.

Instead it is possible not to remove revdeps as long as they pass their own checks. Installing these packages will then require installing the failing dependency from Community. The fact that such a dependency was pulled from Community can be tracked via the installed package metadata.

We could include something in our multitools package to make this easy to check. Perhaps this could even be a library() drop-in that just checks that something is from Production before loading it. This would then be consistent with, say, a Debian model where installing a system package can install a bunch of dependencies, but the package you request to install is marked 'manual' and the others 'automatic'. ]

On the other hand, if taking a strict approach, the key risk here is availability - packages moving in and out of Production too frequently can provide a poor experience for both users and maintainers. We'd have to spend time thinking on policies, and this seems likely to involve more manual intervention to create each release.


We should spend some time thinking on these (or variants).

(1) seems to be the natural conclusion of where to take things but I wonder if @jeroen was also thinking down these lines. This would automatically lead to the repo getting bigger again and R-universe would need to be supportive.

wlandau commented 3 months ago

The snapshot image would then be self contained and no external repositories would be set when installing.

We can assume in most cases that users will just get non-R-multiverse dependencies from CRAN. I think this is fine because it agrees with how testing already works in Staging. In other words, even though they really have a hybrid of CRAN and R-multiverse packages, users should get the same experience we tested for.

I don't think snapshots need to be complete, and I don't think they should try to be self-contained. It is normal to have an environment with multiple package repositories, even in a validated/qualified environment in a highly regulated industry.

This is probably what it takes to make it truly useful as a ‘validated’ snapshot.

https://www.pharmar.org is worth a look here, particularly https://www.pharmar.org/regulations/ and https://www.pharmar.org/white-paper/. FDA's Glossary of Computer System Software Development Terminology apparently says:

Validation: Establishing documented evidence which provides a high degree of assurance (accuracy) that a specific process consistently (reproducibility) produces a product meeting its predetermined specifications (traceability) and quality attributes.

And the ICH E9 guidance on Statistical Principles for Clinical Trials says this about "Integrity of Data and Computer Software Validity":

The computer software used for data management and statistical analysis should be reliable, and documentation of appropriate software testing procedures should be available.

For us, this sounds much more like being able to trust the packages we do include, rather than trying to include everything.

Instead it is possible not to remove revdeps as long as they pass their own checks. Installing these packages will then require installing the failing dependency from Community. The fact that such a dependency was pulled from Community can be tracked via the installed package metadata.

As you have noted before, if a revdep succeeds and a dependency fails, we should assume that the revdep is avoiding all the broken features of that dependency. In an ideal world, we should let the revdep use the dependency, but block the user from doing so. Maybe your proposed alternative to library() in multitools approximates this. On the other hand, if we did that, then we start to get into install_safe() territory from #6, which we abandoned mostly because it does not easily plug into the workflow of a naive user.

shikokuchuo commented 3 months ago

I don't think snapshots need to be complete, and I don't think they should try to be self-contained. It is normal to have an environment with multiple package repositories, even in a validated/qualified environment in a highly regulated industry.

To respond to this first point, we definitely can make use of multiple package repositories and we shouldn't need to physically bundle everything in a Multiverse distribution. But in that case, we need to make it very transparent how to retrieve, from the CRAN / Bioc mirror, the actual versions of packages used in the checks for any particular Multiverse package. Not needed for Multiverse packages, as they will simply be part of the release. This would then seem to satisfy the 'reproducibility' criterion of validation.

wlandau commented 3 months ago

I am rethinking what I said in https://github.com/r-multiverse/help/issues/78#issuecomment-2252725776.

I still don't think it makes sense to include all dependencies in the snapshot, e.g. ones from CRAN. As we said early on, it's not realistic to capture the full dependency tree of every package all the way down to core packages like Matrix. So we're always going to be in a situation where dependencies are left out of snapshots and entrusted to other repositories. In this sense, if we keep revdeps whose dependencies fail in Staging, then it won't be so different from where R-multiverse already sits in the package ecosystem. So I'm starting to think we can keep those revdeps and maybe multitools can help users understand where their packages come from.

wlandau commented 3 months ago

On the other hand, maybe omitting failing dependencies from Staging is different from conditioning on CRAN: sometimes Staging packages won't be on CRAN. Need more time to consider.

shikokuchuo commented 3 months ago

I need more time to come to a conclusion about revdeps. But in the R Validation Hub paper:

One of the core concepts presented in this paper is that Imports are not typically loaded by users and need not therefore be directly risk-assessed.

I think this is a similar concept to what I was talking about. The assumption is that those dependencies pulled from the Community universe are only imports and not directly loaded by a user.

But I want to be 100% clear that the multitools functionality I proposed in this regard is limited to just checking if a package was installed from community.r-multiverse.org (it's in the package metadata anyway) - and nothing more than this. I take back the suggestion about a library() drop-in (amended in the original comment) as I don't believe it is a helpful direction and it can lead to confusion about our intentions.
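A minimal sketch of that check, assuming R-universe-style installs record their repository URL in the installed DESCRIPTION metadata (this is not the actual multitools API, just an illustration):

```r
# Hypothetical helper, NOT the real multitools function: report whether an
# installed package was downloaded from the Community universe, based on the
# Repository field recorded in its DESCRIPTION metadata.
installed_from_community <- function(pkg) {
  desc <- suppressWarnings(utils::packageDescription(pkg))
  if (!inherits(desc, "packageDescription")) return(FALSE)  # not installed
  repo <- desc$Repository
  !is.null(repo) && grepl("community.r-multiverse.org", repo, fixed = TRUE)
}
```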

shikokuchuo commented 3 months ago

On the revdeps, I think it's really a matter of interpretation and both are valid from some perspectives.

If we go with the latter option, will the current (or envisaged) issues file be sufficient for us to say just exclude all of these from the release? If I understand correctly, this is created when any dependency is failing. If we can avoid shenanigans such as actually needing to remove packages from staging to re-run checks etc. that would make things much easier.

wlandau commented 3 months ago

However, we actually have more information than CRAN, in that it fails Multiverse checks. Perhaps it depends on a package that has been updated in Multiverse, but not (yet) on CRAN. In that sense, it makes sense to use our checks for consistency within Multiverse rather than just falling back to CRAN just because it is available there (although it is available there and an install would work).

Yes, this consistency will increase how much users can trust production snapshots, and I think it will be extremely valuable. If the recommendation is to prefer R-multiverse when a package is also on CRAN, it will be much simpler for users to navigate to something that "just works" (TM). Like Gabe has said, most users need this, and they will not have the time or expertise to understand compatibility among packages.

If we go with the latter option, will the current (or envisaged) issues file be sufficient for us to say just exclude all of these [revdeps] from the release?

Yes, if a package fails in either universe, then currently this generates an issue in all of its downstream revdeps. An example is https://github.com/r-multiverse/community/blob/main/issues/tidypolars, whose "dependencies": {"polars": []} field shows that polars is failing its checks. This functionality was introduced in https://github.com/r-multiverse/multiverse.internals/pull/26, and it lives in multiverse.internals::issues_dependencies(). So the latter option would happen automatically at this point unless we decide to avoid or ignore issues_dependencies(). Whether we keep revdeps or remove them from production snapshots, the implementation is super easy at this point.
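For reference, the dependency-issue part of such a file would look roughly like this, reconstructed from the tidypolars example above (the real schema may carry more fields):

```json
{
  "dependencies": {
    "polars": []
  }
}
```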

If we can avoid shenanigans such as actually needing to remove packages from staging to re-run checks etc. that would make things much easier.

One of the best parts about all this is that we don't need to remove packages from staging. In fact, the only checks we currently impose from community to staging are ad hoc checks on the DESCRIPTION metadata, e.g. good version numbers and no Remotes:.

shikokuchuo commented 3 months ago

Things will "just work" TM whichever way we approach it - either packages are not in R-multiverse production release in the first place, or they are and dependencies will be drawn from R-multiverse Production first and then CRAN. There is no ambiguity about how this works / could work - or let me know if I've missed something here.

This then raises the question - R-Multiverse production is now a point release, with specific versions guaranteed to work together (validated). This also applies to CRAN dependencies. If however, we operate a CRAN-like repo for the purposes of install.packages(), having CRAN as a fallback necessarily pulls in the latest version rather than the 'validated' version we have recorded. As authors of CRAN packages have no obligations to R-multiverse and we host packages not on CRAN, these could easily break packages in our production release. Btw. I am not even saying that we have to serve a CRAN-like Production repo necessarily - I think the snapshot idea opens up different ways of thinking about things for different target audiences and use cases.

So I think we shouldn't dismiss any of the points raised too quickly. Am open for discussion on this.

wlandau commented 3 months ago

Things will "just work" TM whichever way we approach it - either packages are not in R-multiverse production release in the first place, or they are and dependencies will be drawn from R-multiverse Production first and then CRAN. There is no ambiguity about how this works / could work - or let me know if I've missed something here.

When we include a package in production, we are claiming that tests pass using the dependency chain in Staging + CRAN. If we include a revdep but omit a dependency, users will need to go find that dependency somewhere. If they find a different version than we tested with in Staging, whether from Community or CRAN, this could break the revdep and the guarantee is lost.

If however, we operate a CRAN-like repo for the purposes of install.packages(), having CRAN as a fallback necessarily pulls in the latest version rather than the 'validated' version we have recorded. As authors of CRAN packages have no obligations to R-multiverse and we host packages not on CRAN, these could easily break packages in our production release.

Exactly what I am worried about if we include revdeps of failing dependencies in production.

shikokuchuo commented 3 months ago

Things will "just work" TM whichever way we approach it - either packages are not in R-multiverse production release in the first place, or they are and dependencies will be drawn from R-multiverse Production first and then CRAN. There is no ambiguity about how this works / could work - or let me know if I've missed something here.

When we include a package in production, we are claiming that tests pass using the dependency chain in Staging + CRAN. If we include a revdep but omit a dependency, users will need to go find that dependency somewhere. If they find a different version than we tested with in Staging, whether from Community or CRAN, this could break the revdep and the guarantee is lost.

Right, so there is only ambiguity if we include Community. If we don't do that, as it is not a production repository by definition, then we have Multiverse-production, CRAN, and Bioc in that order, and there is only one way for things to resolve.

If however, we operate a CRAN-like repo for the purposes of install.packages(), having CRAN as a fallback necessarily pulls in the latest version rather than the 'validated' version we have recorded. As authors of CRAN packages have no obligations to R-multiverse and we host packages not on CRAN, these could easily break packages in our production release.

Exactly what I am worried about if we include revdeps of failing dependencies in production.

And not just for revdeps, but for all packages on Multiverse-production.

Stepping back, since revdeps are just a detail here: if we go back to the idea of Multiverse-production being a "listing of packages", then:

  1. This listing (including all dependencies) of all package versions is valuable as a 'validated' snapshot.
  2. We can then provide a CRAN-like repo production.r-multiverse.org for users to install packages from this snapshot. It is at this step that we can take a number of approaches, from maximalist (store and fix all packages) to minimalist (only store packages on R-multiverse).

We need ideas for (2) above. Because, taking the minimalist route where we only store R-multiverse packages, dependency resolution will use CRAN. Take the case where a CRAN dependency is archived the next day after a Multiverse release. That means packages from Multiverse-production will just fail to install.

llrs commented 3 months ago

You could add the cranhaven.org repo. The R repos would end up being production.r-multiverse.org, CRAN, Bioconductor, cranhaven; package installation will depend on filters from available.packages, which pick the latest available version of a package across repositories.

CRANhaven protects against the "sudden missing dependency" problem and gives some cushion until packages are back on CRAN (~50% of them go back, 50% of those within the first 30 days). This shouldn't happen with Bioconductor, which only removes packages every 6 months (around March and November), but it might happen too.
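To make the resolution behavior concrete, the suggested setup would look something like the following (the URLs, especially the cranhaven one, are assumptions). By default available.packages() keeps the highest version of each package across all repositories, and earlier repositories win ties, so order matters:

```r
# Illustrative repos ordering; not an endorsement of any particular setup.
options(repos = c(
  production = "https://production.r-multiverse.org",
  CRAN = "https://cloud.r-project.org",
  BioC = "https://bioconductor.org/packages/release/bioc",
  cranhaven = "https://cranhaven.r-universe.dev"
))
```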

wlandau commented 3 months ago

Right, so there is only ambiguity if we include Community. If we don't do that, as it is not a production repository by definition, then we have Mutiverse-production, CRAN, Bioc in that order, there is only one way for things to resolve.

we can go in a number of approaches, from maximalist (store and fix all packages) to minimalist (only store packages on R-multiverse).

In discussing the possibility of keeping revdeps of failed dependencies, my underlying concern is how the testing environment in Staging may differ from the user's package environment locally. So I guess if we are only bundling R-multiverse Staging packages in the snapshot (which I still think is the pragmatic approach here) then we run that risk for many dependencies regardless of what we do with revdeps.

I wonder if transparency is an achievable middle ground. If we can't snapshot all dependencies, can we at least list them? Maybe accompanying each snapshot could be a metadata list of packages that failed in Staging, along with their versions and URLs. We could even consider a completely separate "Failures" snapshot (need to think of a friendlier name) to make these higher-risk packages available with the exact versions from Staging at the time of the Production snapshot. We could also include metadata lists of all the packages and versions that were on CRAN and Bioconductor at the time of the Production snapshot.
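For example, the accompanying failure metadata could be as simple as a JSON list along these lines (the fields shown are only a guess at what would be useful):

```json
[
  {
    "package": "example.package",
    "version": "1.2.3",
    "url": "https://github.com/example-owner/example.package",
    "issue": "R CMD check failed on R-oldrelease"
  }
]
```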

I think this would allow us to include revdeps of failing dependencies in Production. Production would not be perfect, but it would be fully up front and clear about the known risks. Users in highly regulated, high-stakes environments would then have the power to be extremely careful about where they get their packages and which packages are approved for users to call directly.

shikokuchuo commented 3 months ago

Yes, so I think we're all agreed that bundling all dependencies is not the preferred option if it can be avoided.

In terms of capturing the metadata of all package dependencies, I think this would be worthwhile.

But in terms of providing a production repository, a viable option would seem to be using Posit Public Package Manager (p3m) snapshots rather than CRAN mirrors. The tests from Staging would use the latest daily snapshot, and this would cut off at our release date. Then a user can reproduce perfectly by using the Multiverse snapshot + the p3m snapshot. See https://posit.co/blog/migrating-from-mran-to-posit-package-manager/

wlandau commented 3 months ago

Oh nice! Yeah, I guess P3M already gives us snapshots so we don't have to create any ourselves other than Production (and possibly a separate one with just packages that failed in staging, if you agree this would be useful).

wlandau commented 3 months ago

and possibly a separate one with just packages that failed in staging, if you agree this would be useful

Or, in the "Failures" snapshot, maybe we only include packages with revdeps in production?

shikokuchuo commented 3 months ago

You could add cranhaven.org repo.

@llrs thanks. As I understand it, cranhaven just copies from the R-universe CRAN mirror. As cranhaven only keeps packages around for [5] weeks, versus forever in R-universe, I don't see any advantage in pointing to cranhaven.

In any case, we've moved on to p3m, so it's off-topic for this thread now, but feel free to open a discussion if you want to talk about cranhaven - I'm always open to ideas.

shikokuchuo commented 3 months ago

Oh nice! Yeah, I guess P3M already gives us snapshots so we don't have to create any ourselves other than Production (and possibly a separate one with just packages that failed in staging, if you agree this would be useful).

I'm super glad to have found this. I have some vague recollection of coming across this a long time ago...

For Production, I think we should be consistent and only include metadata for packages that actually make the cut.

We could produce similar metadata for Community, but I think everything should be made consistent with one of the two. Otherwise it's confusing even for us...

shikokuchuo commented 3 months ago

For revdeps I see the possibilities as either of:

  1. Keep packages in Staging with an issues file -> revdeps are tested against the Staging version -> all revdeps are removed for the Production cut (simply all packages with an issues file).

  2. Actively remove packages from Staging as soon as they fail checks -> revdeps are tested against the CRAN p3m version (if it exists) -> all packages in Staging make the Production cut on the cut-off date.

For both, checks for revdeps should be triggered when any package updates in Staging.

The second option may be a bit trickier to orchestrate, but has the advantage of not potentially having swathes of packages move in and out of Production for consecutive releases. Both work in the same way for users (offer the same 'user guarantees').

wlandau commented 3 months ago

For Production, I think we should be consistent and only include metadata for packages that actually make the cut. We could produce similar metadata for Community, but I think everything should be made consistent with one of the two. Otherwise it's confusing even for us...

I guess too much metadata could be confusing for novice users, even if it improves transparency for advanced users.

The second option may be a bit trickier to orchestrate

Even worse, if a package were to vanish from Staging as soon as an error happens, it would be extremely difficult for maintainers to fix issues because the evidence keeps disappearing.

I would much prefer (1): simple to implement, simple to orchestrate, simple for everyone to understand, and with the strongest guarantees on the quality of the package cohort in Production.

wlandau commented 3 months ago

Just so I understand: for p3m, would we have install.packages(repos = c("https://production.r-multiverse.org", "https://packagemanager.posit.co")) for Staging and recommend something like install.packages(repos = c("https://production.r-multiverse.org", "https://packagemanager.posit.co/SNAPSHOT-DATE")) to users?

shikokuchuo commented 3 months ago

I would much prefer (1): simple to implement, simple to orchestrate, simple for everyone to understand, and with the strongest guarantees on the quality of the package cohort in Production.

Ok let's go with that. We can see how well it works in practice in any case - I think that this level of operational decision won't prevent us from making changes down the road if it becomes necessary.

wlandau commented 3 months ago

Awesome!

As I have said before, I think the best way to prevent entire cohorts of packages from getting taken down is to prevent package failures from happening in the first place. The next phase of R-multiverse could focus on this.

wlandau commented 3 months ago

we have install.packages(repos = c("https://production.r-multiverse.org", "https://packagemanager.posit.co")) for Staging

Maybe that part isn't necessary since R-universe already pulls from CRAN.

shikokuchuo commented 3 months ago

we have install.packages(repos = c("https://production.r-multiverse.org", "https://packagemanager.posit.co")) for Staging

Maybe that part isn't necessary since R-universe already pulls from CRAN.

Only difference I can see is that p3m has a slightly delayed version of CRAN - it's a daily cron job I think. Also possibly on rare occasions it may fail to catch up as quickly. So to be 100% sure that what we test is consistent with the snapshot we recommend to users, it would be safer to use Posit. Or we can approximate by using a p3m snapshot of the day after the Multiverse cut-off.
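Under that approach, the recommendation to users might look like the following sketch. It assumes p3m's dated CRAN snapshot URLs of the form https://packagemanager.posit.co/cran/DATE; the package name and the date (standing for the day after the Multiverse cut-off) are made up:

```r
# Pin the p3m CRAN snapshot to the day after the Multiverse cut-off so the
# dependency versions match what Staging was checked against.
install.packages(
  "example.package",  # made-up package name
  repos = c(
    "https://production.r-multiverse.org",
    "https://packagemanager.posit.co/cran/2025-01-02"
  )
)
```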

shikokuchuo commented 3 months ago

So now we've decided how to treat revdeps, I think that could allow us to simplify things even further. See if the below works:

  1. Run all checks from Community.
  2. Version issues and check issues both produce issues files.
  3. Manual job (quarterly) updates the Staging repo with fixed SHAs of all packages without issues.
  4. If no issues with Staging repo, trigger another manual job which copies Staging over to Production.
  5. R-universe snapshot is made of this and saved to an archive/container.
  6. Users download from 'production.r-universe.org', which is the usual universe powered by the GitHub repo.

As our release is snapshotted/archived, this protects against GitHub repos being removed. This is no different to CRAN, which also allows archival at the maintainer's request.

This has the advantage of not needing Netlify or another hosting provider for the production CRAN-like repo. I've looked into this, and I think it requires them to recognise us as a legitimate open source initiative to avoid the paid offerings (we have to fill in a form). I have a concern about using services like this, as a change in management / policy on their end can lead to the service becoming unavailable at short notice.

This way we use 'staging' as a test repo to avoid potential disruption of 'production'. Otherwise I think 'staging' is only really needed for the first model i.e. physically moving packages to/from Community to Staging in order to test dependencies.

EDIT: of course we don’t have to use ‘staging’, we can also deploy straight to production.

wlandau commented 3 months ago

So now we've decided how to treat revdeps, I think that could allow us to simplify things even further. See if the below works:

My main concern is the lack of a "freeze" period leading up to the snapshot. For a snapshot-based model, the freeze on non-failing packages is important to make sure we have a soft landing every quarter, and it is especially important now that we have decided to remove revdeps of failing packages. I don't think we would ever want to freeze Community.

I've looked into this and I think it requires them to recognise us as a legitimate open source initiative to avoid the paid offerings (have to fill in a form). I have a concern about using services like this as a change in management / policy on their end can lead to the service becoming unavailable at short notice.

Maybe this is where we could ask Posit for help? Or as a last resort, maybe a third universe? I know Jeroen was originally resistant to this, but maybe it wouldn't be such an extra burden in a snapshot model that only updates every quarter.

shikokuchuo commented 3 months ago

My main concern is the lack of a "freeze" period leading up to the snapshot. For a snapshot-based model, the freeze on non-failing packages is important to make sure we have a soft landing every quarter, and it is especially important now that we have decided to remove revdeps of failing packages. I don't think we would ever want to freeze Community.

I see, so if I have this right, the plan would be: [1] month prior to the release, freeze Staging - only allow packages with issues to update from Community? That would avoid a random update of some package knocking out a swathe of revdeps just prior to release. I think that makes sense.

shikokuchuo commented 3 months ago

Maybe this is where we could ask Posit for help? Or as a last resort, maybe a third universe? I know Jeroen was originally resistant to this, but maybe it wouldn't be such an extra burden in a snapshot model that only updates every quarter.

If we're freezing Staging prior to release, then that necessarily means we're already using the fixed SHAs for that universe. So we could just copy the packages without issues across to Production on the release date. That would be once a quarter to build the binaries and no need to re-run checks on it or anything like that.

To look at it another way - we only need to create Staging at the point of freeze, and after release we have Production. So Staging can be a time-bound universe if that helps conserve resources.

wlandau commented 3 months ago

I see, so if I have this right, the plan would be: [1] month prior to the release, freeze Staging - only allow packages with issues to update from Community? That would avoid a random update of some package knocking out a swathe of revdeps just prior to release. I think that makes sense.

Exactly. During that month, packages are still free to update in Community, but those in Staging can only update if they have an issue file. As we discussed with Jeroen in our meeting, this prevents surprises prior to release while allowing existing problems to get fixed.

wlandau commented 3 months ago

If we're freezing Staging prior to release, then that necessarily means we're already using the fixed SHAs for that universe. So we could just copy the packages without issues across to Production on the release date. That would be once a quarter to build the binaries and no need to re-run checks on it or anything like that.

I agree.

To look at it another way - we only need to create Staging at the point of freeze, and after release we have Production. So Staging can be a time-bound universe if that helps conserve resources.

Yeah, that could really help. To begin the freeze (which I guess we could call the "staging phase" if we operate this way), we could do a massive update on Staging (bring in all package versions currently in Community). Then spend a month only allowing updates to Staging for failing packages. Then after the Production snapshot is created, either leave Staging alone or take down the Staging universe entirely (whatever Jeroen thinks is most helpful).

shikokuchuo commented 3 months ago

Great! This seems operationally feasible at least. I agree that we can try to optimize as much as possible for R-universe.

As for the actual snapshots we create as an archive: I wonder if it will be enough to include the source files, in the way CRAN does. Of course, with changes in tooling, this doesn't guarantee they'll build at some future point in time. But just for what we currently have in Community, the source files are 90MB whilst the binaries come in at a hefty 1.6GB.

jeroen commented 3 months ago

I think it makes sense to include binaries for mac and windows for r-release only. So basically the snapshot is tailored for a given release of R.

shikokuchuo commented 3 months ago

Right that makes sense - and would bring the equivalent binaries size down to 520MB for the above case.

@jeroen I also hope you're broadly ok with our suggestion above - which is that we would only need Staging for a month four times a year for the "staging phase" - and then we have a Production universe, which only updates 4 times a year (and no need to run checks between those times).

llrs commented 3 months ago

I agree to limit binaries to r-release versions, but why limit binaries to just what CRAN provides? It could provide the Ubuntu binaries for r-release too (I think it would help a lot of Docker and scientific people, as r2u has shown).

shikokuchuo commented 3 months ago

I agree to limit binaries to r-release versions, but why limit binaries to just what CRAN provides? It could provide the Ubuntu binaries for r-release too (I think it would help a lot of Docker and scientific people, as r2u has shown).

We could consider, but Ubuntu builds are only available for Noble on R-universe. I don't think we can guarantee that these will work on older versions. Also the default is for source builds on Linux. I think it's safe to say that users on Linux will be comfortable with compiling things from source, whereas there is not necessarily the same expectation on Windows, where rtools may not even be installed.

wlandau commented 3 months ago

One disadvantage of having a universe for production is the lack of guaranteed availability if a package is removed from GitHub/GitLab. I don't see this as an issue in the initial rollout, but later on we may want to switch to something else. Maybe as the effort gains traction and popularity, it will be easier to find a sustainable minimal-cost way to host Production as a snapshot.

shikokuchuo commented 3 months ago

One disadvantage of having a universe for production is the lack of guaranteed availability if a package is removed from GitHub/GitLab. I don't see this as an issue in the initial rollout, but later on we may want to switch to something else. Maybe as the effort gains traction and popularity, it will be easier to find a sustainable minimal-cost way to host Production as a snapshot.

I actually consider this an advantage. Taking down a repo from GitHub / GitLab is a pretty clear decision that the author no longer wants the content published. Just as we track releases once a contribution has been made, we should also 'track' deletions. Of course we'd have to remove it if a request were made directly to us, according to our policy, but I consider this implicit in taking down the repo.

Also in case it wasn't clear, the intention is still to take a snapshot of each release for archival purposes so we can have versioned releases. We can find the best way to make it available for download by corporates etc. but this will be an easier task than web hosting a CRANlike repo.

wlandau commented 3 months ago

Taking down a repo from GitHub / GitLab is a pretty clear decision that the author no longer wants the content published.

Deletion can happen for careless reasons as well. And what if a repo is moved or renamed, e.g. accepted into https://github.com/ropensci, then replaced with different content? Maintainers may find it surprising that deleting or moving/replacing a development repo would have immediate downstream consequences for Production. Maintainers might not even understand what Production means, and they might not understand that packages in Production should stay available, e.g. for serious regulated environments, except for special exceptions.

I agree that a universe may be our best option for Production in our initial rollout, but I would also like to keep thinking about switching to something more enduring later on. Doesn't have to be solved immediately.

Of course we'd have to remove it if a request were made directly to us, according to our policy, but I consider this implicit in taking down the repo.

In the long run, this seems like the right way to handle all deletions from Production.

wlandau commented 3 months ago

Also in case it wasn't clear, the intention is still to take a snapshot of each release for archival purposes so we can have versioned releases.

It may be a bit harder to retrieve packages from an archive than from the current production, e.g. if an entire container needs to be downloaded.

wlandau commented 3 months ago

Another long-term concern about Production as a universe: if base R is updated during the year and some packages no longer build, Production would no longer be able to host them. And even if they do build, there may be failing R-universe checks on packages we insist on not updating.

jeroen commented 3 months ago

We should do the same as bioconductor, and plan for each version of our production snapshot which version of base R it will target, and make sure to plan things such that the r-multiverse-staging universe tests with that version of R.

shikokuchuo commented 2 months ago

I agree that a universe may be our best option for Production in our initial rollout, but I would also like to keep thinking about switching to something more enduring later on. Doesn't have to be solved immediately.

Of course we'd have to remove it if a request were made directly to us, according to our policy, but I consider this implicit in taking down the repo.

In the long run, this seems like the right way to handle all deletions from Production.

I'm onboard with both suggestions. Agree that Production probably needs to be handled a bit differently to Community.

wlandau commented 1 week ago

We've implemented this.