iree-org / iree

A retargetable MLIR-based machine learning compiler and runtime toolkit.
http://iree.dev/
Apache License 2.0

Update GCP GitHub Actions runners to use default runner group #17893

Closed: ScottTodd closed this 1 month ago

ScottTodd commented 3 months ago

We were using "self-hosted runner groups" (https://docs.github.com/en/actions/hosting-your-own-runners/managing-self-hosted-runners/managing-access-to-self-hosted-runners-using-groups) to organize our self-hosted runners. We just switched from the GitHub "enterprise" plan back to the "free" tier, where runner groups are not available. As a result, every runner group except "Default" has been disabled.

I manually moved self-hosted runners from the "presubmit" and "postsubmit" runner groups back to the "Default" group to keep them working, but the autoscaling / ephemeral GCP runners were added back to the disabled groups.

Looks like the groups are set in https://github.com/iree-org/iree/blob/main/build_tools/github_actions/runner/gcp/create_templates.sh and also used in https://github.com/iree-org/iree/blob/main/build_tools/github_actions/runner/config/register.sh.

We may want to keep the group labels, or we could just get rid of them entirely. Separate groups allow us to prevent presubmit jobs from starving postsubmit jobs and vice-versa, as well as limit which runners touch unsubmitted code.

yuennancy commented 3 months ago

What's the ask? To change the group to Default in the scripts and keep the labels?

ScottTodd commented 3 months ago

Yeah, that's probably where I'd start.

As we rework CI jobs we might also want to get rid of the distinction between a presubmit runner and a postsubmit runner.
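
Something like this is roughly what I have in mind for that first step. It's a sketch only: `runner_config_args` is a made-up helper, not code from `register.sh`, and I'm assuming the script ends up passing `--runnergroup` and `--labels` to the runner's `config.sh`. The group is pinned to "Default" and the old group name survives as a label.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not taken from register.sh): prints the config.sh
# arguments for a runner of the given type ("presubmit" or "postsubmit").
runner_config_args() {
  local runner_type="$1"
  # Register every runner in the built-in "Default" group, since custom
  # runner groups are not available on the free plan.
  local group="Default"
  # Keep the old group name as a label so workflows can still select
  # presubmit or postsubmit runners through runs-on.
  local labels="${runner_type}"
  echo "--unattended --ephemeral --runnergroup ${group} --labels ${labels}"
}

# Show what would be passed to ./config.sh (--url and --token omitted here).
runner_config_args "presubmit"
```

Workflows that select runners by label would keep working, and only the group assignment changes.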

ScottTodd commented 3 months ago

I guess we could also keep the code unchanged and continue to use the "disabled" groups, assuming GitHub doesn't make them nonfunctional 🤔

ScottTodd commented 1 month ago

Most GCP runners are disabled now (see https://github.com/iree-org/iree/issues/18238). The runner groups are nearly empty (just ARM64 runners in there now).