cilium / cilium

eBPF-based Networking, Security, and Observability
https://cilium.io
Apache License 2.0

Document developer expectations for usage of labels #32269

Open joestringer opened 2 months ago

joestringer commented 2 months ago

There are a few labels that we tend to use in the Cilium development community which have unclear semantics. This task is to document the expectations around their usage and drive consensus amongst developers to ensure consistency.

In particular, it is often unclear whether the following labels represent a request to potentially backport a change or block a release (which requires a response), or a statement that severity has already been evaluated, in which case the relevant parties (rotating backporter, release managers) are expected to adhere to it.

Labels:

joestringer commented 2 months ago

Another example that came up during today's Cilium weekly developer meeting is ready-to-merge. The proposal was to introduce a merge-proposal label that developers can set, which has clearer intent: a developer sets it to request that a PR be merged regardless of the CI results. Then, @cilium/tophat can look for PRs with this label, assess whether each PR is ready to merge, and potentially merge it. This could be useful in cases where Ariane is not yet advanced enough to detect the impact of some changes, for instance because the PR only changes code comments. This would contrast with ready-to-merge, which is set by a bot when all requirements have actually been completed.

joestringer commented 2 weeks ago

Specifically regarding release-blockers, I'm exploring the potential use of GitHub projects to track proposals and status here: https://github.com/orgs/cilium/projects/51/views/1 .

I'm thinking somewhat of a process like the following:

  1. A contributor adds a blocker label to the PR, and it gets added to the project as a "Proposed" blocker.
  2. At any time, if someone is assigned to the issue, we move it to "Assigned".
  3. Periodically or during a community meeting, we can revisit proposed blockers. If we agree it should block the release and there is a volunteer to work on the issue, assign the person, set the target milestone and move it to "Assigned".
  4. If there are more than a few days of inactivity (let's say 3 weekdays), it moves to "Inactive".
  5. Other activity (e.g. a PR is created) moves the issue/PR to "Active".
  6. Once an issue or PR is closed, it is moved to "Done".

We can then also periodically evaluate whether "Inactive" issues should remain in the list of blockers. If an issue is marked as a blocker and yet is not making progress, this is a sign of a discrepancy between the assignee's priorities and the project's priorities, so we should either work with the assignee to understand the impact and timelines, or re-evaluate the assignee or even the blocker status.
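To make the status transitions above concrete, here is a rough Python sketch of how an item's status could be derived. Everything in it is hypothetical and for illustration only: the `Blocker` fields, the `status` function, and the choice to apply the inactivity check only to assigned items are my assumptions, not something the project automation does today. The 3-weekday threshold is the "let's say" value from step 4.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from enum import Enum
from typing import Optional


class Status(Enum):
    PROPOSED = "Proposed"
    ASSIGNED = "Assigned"
    ACTIVE = "Active"
    INACTIVE = "Inactive"
    DONE = "Done"


@dataclass
class Blocker:
    # Hypothetical record; field names are illustrative, not a GitHub API.
    has_blocker_label: bool
    assignee: Optional[str]
    last_activity: date
    closed: bool = False
    activity_since_assignment: bool = False


def weekdays_between(start: date, end: date) -> int:
    """Count Mon-Fri days after `start`, up to and including `end`."""
    count = 0
    day = start
    while day < end:
        day += timedelta(days=1)
        if day.weekday() < 5:  # 0-4 are Monday through Friday
            count += 1
    return count


def status(item: Blocker, today: date, inactivity_limit: int = 3) -> Optional[Status]:
    """Derive the project column for a proposed or accepted blocker."""
    if not item.has_blocker_label:
        return None  # not tracked in the project at all
    if item.closed:
        return Status.DONE  # step 6
    if item.assignee is None:
        return Status.PROPOSED  # step 1
    # Assumption: "more than a few days" means strictly more than the limit.
    if weekdays_between(item.last_activity, today) > inactivity_limit:
        return Status.INACTIVE  # step 4
    if item.activity_since_assignment:
        return Status.ACTIVE  # step 5
    return Status.ASSIGNED  # steps 2 and 3
```

For example, an item assigned on a Monday with no further activity would stay "Assigned" through Thursday and show up as "Inactive" from Friday onward, which is when it would surface in the periodic review.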

I'd also like to integrate severity into this in some way. Perhaps something along the lines of the following:

  1. severity/low: Impact is limited, perhaps only affecting some configurations, or affecting all configurations with acceptable repercussions in a production environment.
  2. severity/medium: May force users to take mitigating actions in common scenarios.
  3. severity/high: Widespread impact to common deployment configurations.
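If we wanted tooling to order or filter blockers by these levels, a minimal sketch could look like the following. The three label names come from the list above; the `Severity` enum and the `highest_severity` helper are hypothetical names for illustration, and taking the highest level when several severity labels are present is an assumption on my part.

```python
from enum import IntEnum
from typing import Iterable, Optional


class Severity(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


# Label names as proposed above; the mapping itself is illustrative.
LABEL_TO_SEVERITY = {
    "severity/low": Severity.LOW,
    "severity/medium": Severity.MEDIUM,
    "severity/high": Severity.HIGH,
}


def highest_severity(labels: Iterable[str]) -> Optional[Severity]:
    """Return the highest severity among the labels, or None if unset."""
    found = [LABEL_TO_SEVERITY[name] for name in labels if name in LABEL_TO_SEVERITY]
    return max(found) if found else None
```

Using an ordered enum means a list of blockers can be sorted with the highest-severity items first, and items with no severity label can be flagged as still needing evaluation.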

The above is a rough proposal for the practical workflow, which we can document further once we've tested it out and gotten into the groove of managing the projects.

While the above focuses on the release-blocker aspect highlighted by the issue description, I think these guidelines may also be relevant for backports. I'm not looking into backports yet, because they have a higher throughput of issues and are generally more automated for now, with fewer maintainer touch-points. After evaluating some process patterns for release blockers, we could look into extending some of them to backports.