Open carterbox opened 9 months ago
Thanks Daniel! 🙏 Good point
Are there other bots we would want to add @beckermr ?
@jaimergp what do you think about adding bots here?
This is all we want for now. We'll need to figure out how the server interacts with GitHub apps.
Hm, it might be a gateway for any user to trigger the CI without actually having signed the TOS, so I'm not sure we should authorize bots. It might be a limitation, where an authorized user needs to push an empty commit right after the bot has done its thing.
Interesting that might affect automerge
as well since that is another bot commit
Maybe we can mull on this one some more and see what options we can come up with?
Yep, of course. Not saying this will never be possible.
There might be solutions we can implement at the trigger validation level, but right now, with what we have, since any Github user can trigger a bot command (rerender, add maintainer, update version, etc), adding conda-forge-admin as a valid user is the same as allowing everyone.
We will have to document the automation limitations so people are not super surprised.
PS: I do hope that folks do not rely on automerge for 6h+ builds though hah.
If there was a good way to have users' own authentication used with bots, we might be able to register things like re-rendering commits or automerge commits with users
That said, this may involve work on the conda-forge side, the Quansight side, and maybe talking to GitHub to work through issues that come up
Maybe that would be one path
As to +6hr builds and automerge, interestingly long running builds are almost best managed with automerge. Long running builds are slow to get feedback from and can be forgotten about while doing other work. It was one of the things that prompted the automerge addition in conda-forge
Though with such long running builds we may want to think about scheduling as well (or I guess some things can stay pending for a while)
One more thing to think about I guess 🙂
We'd have to ask users to install a conda app in order to act on their behalf. An easier approach might be to use one of our bots only for certain actions that would trigger this ci and then gate when we use this bot.
Ok cool we have a couple ideas
Are there any others we have not yet considered?
Also remember we need to gate keep things from a resource constraint pov as well, this a single server with a small number of available runners and no scaling ability.
This is good to keep in mind Dharhas
For more context on the bot cases mentioned above, these are a maintainer takes an action where a bot assists them. The maintainer could take these actions themselves. Also the action taken will simply start a set of jobs equivalent to a single commit on a single repo (so no combinatorial explosion)
Hope that helps clarify things. Happy to help answer any questions 🙂
A maintainer who has not agreed to the TOS could automerge a pr by another maintainer who has if we add the bots.
Think that falls under the authorization discussion above (and how we handle that)
Understood Dharhas' concern to be about resource oversubscription, which is orthogonal
Under the current conda-forge model don't see automerge or re-render presenting this sort of risk
A migrator could be a source of risk (unless we regulate how many PRs it opens)
Also a maintainer pushing individual commits in rapid succession
Does that make more sense?
To this end what sort of throttling logic does the CI have? Is autocancellation available (so only the latest change keeps running)?
Is autocancellation available (so only the latest change keeps running)?
Yes, via the conncurrency settings in the GHA workflow.
A migrator could be a source of risk (unless we regulate how many PRs it opens)
Yes, but it's not if we ask maintainers to push an empty commit to trigger the extra CI if they do need it. I am assuming that the GPU server will not be the only CI provider in most feedstocks, so debugging can happen in the regular Azure runners and then the GPU ones are enabled only when certain it will work. This assumption might be wrong, but tbh it might be the only natural way given the current limitation of 4 concurrent GPU jobs.
I think the way that CuPy protects its CI is that jobs always have to be manually started by commenting on the PR with a specific command e.g. /test [type of test]
.
Requiring an authorized user to write conda-forge-admin, please run ci
in a comment for every job that is queued here from a PR would be the most conservative and intuitive method I think. Having to push an empty commit seems like friction?
We've done similar things in RAPIDS and Dask for running CI
It can still cause friction ofc, but I think our goal is merely to minimize friction (not necessarily eliminated given our constraints). To that end agree a comment is less friction
We can probably add this in the GHA workflow directly, and then have Cirun allow the pull request comment trigger. Better than an empty commit, I agree.
To obtain access to the CI server, you must complete the form below:
access/*.json
file.If the conda-forge-admin will be opening PRs/migrations on feedstocks that use this CI, then it should also have permission to use this CI?
https://github.com/conda-forge/libmagma-feedstock/pull/14