Proposal: Reduce the Entropy of TiKV Community Governance

tisonkun commented 3 years ago

Subtasks:

[x] #121
[x] #123
[x] #124
[x] #134

What is the problem our community meet?

Most of our repositories have no clear owners; for example, it is hard to say who has review or commit permission of mur3 by our governance and community information.
Special Interest Group benefits domain-specific discussion, but sucks when coupled with governance; for example, it is hard to say a contributor contributing to TiKV security belongs to which SIG. It is to be blamed for that many of our committer level contributors (e.g., @innerr) don't become committers really.
We have no active exit mechanism with which we take down review or commit permission of member abuses. And thus our community is a bit closed to involve new member, and break down members into SIGs even in one repository for limiting permission which sucks.

What is proposed to do?

Break down the review and commit permission group into teams, one team could own one or more repositories.
Associate every repository with one and one and only one team.
Define the team governance to include explicit exit mechanism, and community governance to build new team.

Details about actions

Break down the review and commit permission group into teams.

A team has its reviewer, committer and maintainer, and owns one or more repositories.

The permission of reviewer and committer is similar with current state but indistinguishable among the team. It means a reviewer of the team can review all pull requests in repositories owned by the team.

The permission of maintainer, based on that of committer, is mainly for exit mechanism and other team level decision making, which will be elaborated later.

Within this proposal, there are 4 teams proposed.

TiKV team
PD team
Prometheus team
Raft team

Associate every repository with one and one and only one team.

Every repository of TiKV org will be associated with one and only one team. Reviewer and committer of that team has the review or commit permission to the repository.

As we build 4 teams above, the relation between team and repository as listed below. Comment if you think it is not the real case.

PD team

pd

Prometheus team

rust-prometheus

Raft team

raft-rs

TiKV team

tikv
client-c
client-py
client-go
client-java
client-rust
client-cpp
agatedb
async-speed-limit
bytes
client-validator
copr-test
crc64fast
crossbeam
fail-rs
grpc-rs
importer
jemalloc
jemallocator
minitrace-go
minitrace-rust
minstant
mock-tikv
mur3
pprof-rs
procinfo-rs
protobuf-build
raft-engine
rocksdb
rusoto
rust-rocksdb
sig-transaction
sysinfo
terraform-tikv-bench
tikv-operator
tikv.github.io
titan
tlaplus-specs
website
yatp

The exceptions are tikv/community and tikv/rfcs.

tikv/rfcs is associated with all teams which means we put rfcs together among teams.
tikv/community is for governance and only the maintainers has permission.

Define the team governance to include explicit exit mechanism, and community governance to build a new team.

A team has its reviewer, committer and maintainer.

A reviewer is able to review pull request for quality and correctness of repositories associated with the team. His/Her LGTM is counted.
A committer is able to both review and commit pull request.
A maintainer, based on the rights of a committer, is able to promote or evict(emeritus member for reference) a reviewer, a committer or a maintainer by majority decision.

The maintainers is able to evict member by majority decision and thus we have an exit mechanism. It is suggested that a team defines its own bylaws to elaborate other team level decision making process.

Besides the teams, the whole TiKV community has its TOC members for evolving the community. The TOC members consist of part of maintainers of teams and are in charge of community level decision making process like build a new team, or arbitrate divergence between team by majority decision.

Where do current things go?

Here is the migration details from current SIG membership to team membership.

migration

sig-scheduling migrates to PD team
sig-raft migrates to Raft team/> sig-engine / sig-coprocessor / sig-transaction migrate to TiKV team

techLeaders and coLeaders migrate to maintainers
committers migrates to committers
reviewers migrates to reviewers
activeContributors are dismissed
specifically, inactive previous TiKV maintainers(@lidaobing, @siddontang, @sunxiaoguang, @winkyao, @fredchenbj) have the opportunity to be an active TiKV maintainers and thus TOC members by asking for it, otherwise they become emeritus TOC members.

Where does SIG concept go?

SIG itself means expert group, which always exists in fact, and serves for better reference to help. However, this concept should never be associated with permission.

Shall we add a new concept / infra named "Team"?

Yes and no. We name it as "Team" for distinguished from "SIG", but the authentication #123 is quite similar as "SIG at repo granularity" or said GitHub teams. There is no much difference.

tisonkun commented 3 years ago

How about leaving just one sig-tikv? (by @sykp241095 from gist)

There is a TiKV Project after this proposal applied. The difference is that we also have PD project and Raft project since they seem quite independent from TiKV projects. Hadoop, ZooKeeper, and Ratis are different project and I think it is similar to TiKV, PD, and Raft. BTW, this comes from the offline discussion with @BusyJay .

tisonkun commented 3 years ago

Are we going to break down SIGs? (by @breeswish from offline)

Nope. We define a community governance struct that project owns repositories, and you might regard it more as SIG combination. The details of projects and the mapping from repository to project is as above.

Rationale

We should decouple community governance from enterprise organizational structure where SIG is born. Because the community governance reflect the community itself and the evolution. SIG Engine exists because there is a component, not because there is a team inside a company working for it.

I do agree theoretically it is possible to maintain OWNERS files to keep community governance reflect the software structure in fine-grained. But it costs too much and I do not think TiKV is gonna need it. We can use a simpler and permissive model, that is, the project model, to avoid the mismatch problem. All TiKV reviewers and committers are equal to react to all events happen in the project. We do never separated ourselves unnecessarily, especially inside a repository.

tisonkun commented 3 years ago

cc @BusyJay @yiwu-arbug @innerr @zhangjinpeng1987 @nolouch

andylokandy commented 3 years ago

cc @hi-rustin

tisonkun commented 3 years ago

If there is no objection on the general proposal, I'll elaborate concrete actions and call for applying.

yiwu-arbug commented 3 years ago

How existing SIG structure works under the new project-based governance? Does SIG specific maintainers and reviewers still being maintained, or transfer to become project maintainers and reviewers?

Rustin170506 commented 3 years ago

and you might regard it more as SIG combination.

If this is actually a combination of SIG, then I would prefer it to follow the concept of SIG from the point of view of infrastructure changes, so that there are minimal changes to the whole infrastructure.

If this part introduces the concept of project, then the community's website will need to reorganized, as both tikv and tidb SIGs are now displayed on the website. We may need to redesign the website to show the concept and architecture of project. The architecture of the bots also needed to be adapted, including ti-community-bot and tichi. This required a huge cost to do the migration.

tisonkun commented 3 years ago

How existing SIG structure works under the new project-based governance? Does SIG specific maintainers and reviewers still being maintained, or transfer to become project maintainers and reviewers?

short answer: "transferred"

@yiwu-arbug you can see the details of migration above for a proposal.

I think the major issue is that SIG concept should be barely special interest group, not associated with permission. As in the Apache Spark community there are expert groups (SIGs) for SparkR / Runtime / Streaming, but all permissions are flatten to Spark project.

With an offline discussion with @winkyao there is one more issue : a number of current maintainers and reviewers are already inactive, we should help them emeritus.

tisonkun commented 3 years ago

and you might regard it more as SIG combination.

If this is actually a combination of SIG, then I would prefer it to follow the concept of SIG from the point of view of infrastructure changes, so that there are minimal changes to the whole infrastructure.

If this part introduces the concept of project, then the community's website will need to reorganized, as both tikv and tidb SIGs are now displayed on the website. We may need to redesign the website to show the concept and architecture of project. The architecture of the bots also needed to be adapted, including ti-community-bot and tichi. This required a huge cost to do the migration.

@hi-rustin Name problem is significant. We can focus the permission migration at first but a subtask for name problem will be created and push to done.

permission migration
bot project name adaption
website updates

Rustin170506 commented 3 years ago

permission migration

bot project name adaption

Technically speaking, the two steps are inseparable. At least on tikv, they are inseparable.

Rustin170506 commented 3 years ago

Name problem is significant.

I remember you saying in the offline discussion that it didn't matter what you called it, so I prefer to follow the concept of SIG rather than introduce a new concept.

tisonkun commented 3 years ago

Name problem is significant.

I remember you saying in the offline discussion that it didn't matter what you called it, so I prefer to follow the concept of SIG rather than introduce a new concept.

Fine. I can see the overhead but still insist that permission rules changed and we'd better avoid override concepts on the same name.

I'd emphasize that project is different from sig.

Member : Project = 1:n
Project : Repository = 1:n

A reviewer / committer has permission to a PR of a repository because he / she is one of members of the project the repository belongs to.

So far, sig works on labels / default sig / default to all sigs. And permission works in addition with trust team / trust member. We don't need too much. They are legacy patches for the wrong permission model.

Rustin170506 commented 3 years ago

A reviewer / committer has permission to a PR of a repository because he / she is one of members of the project the repository belongs to.

Can't we just change the sig's permission logic to this?

Rustin170506 commented 3 years ago

It seems like we are just merging sigs and assigning permissions to those sigs according to the level of the repo?

tisonkun commented 3 years ago

A reviewer / committer has permission to a PR of a repository because he / she is one of members of the project the repository belongs to.

Can't we just change the sig's permission logic to this?

Yes we can. That is what I mean in

>permission migration
bot project name adaption
website updates

I'll remind that TiDB using sig model also and you cannot change at once. That means, base on "They are legacy patches for the wrong permission model.", you are adding new patches.

Rustin170506 commented 3 years ago

That means, base on "They are legacy patches for the wrong permission model.", you are adding new patches.

If it's essentially merging sigs and managing permissions by repo level, then I don't see it as adding a patch, but rather correcting our original faulty permission model according to your logic? Because currently in our community, sig is already tied to permissions.

tisonkun commented 3 years ago

It seems like we are just merging sigs and assigning permissions to those sigs according to the level of the repo?

Definitely, you can see also https://github.com/tikv/community/issues/118#issuecomment-845086247 .

In additionally we may enable emeritus but that is easy because we can add a file which does not participant permission model.

Said

- tikv project
  - member.json
  - emeritus.[ext]

or

// member.json
{
  "emeritus-committers" : [ "..." ],
}

Rustin170506 commented 3 years ago

I don't understand why we need to introduce new concepts if essentially what's going on is merging sigs and adjusting the permission model?

tisonkun commented 3 years ago

I don't understand why we need to introduce new concepts if essentially what's going on is merging sigs and adjusting the permission model?

I don't insist a new name but focus on the permission model and keep member permission configs done inside tikv org (project : repository could be done in ti-community-infra/config and I think it is how ASF does it).

BusyJay commented 3 years ago

If committers and reviewers of SIGs are transferred to TiKV, then there will be no SIGs anymore, is it correct?

If I have write permission in a repository, then I should also have write permission in other repositories within the same project, is it correct?

tisonkun commented 3 years ago

@BusyJay

If committers and reviewers of SIGs are transferred to TiKV, then there will be no SIGs anymore, is it correct?

See "Where do current things go?" section:

SIG itself means expert group, which always exists in fact, and serves for better reference to help. However, this concept should never be associated with permission.

Other community example, Apache Spark community has expert groups (SIGs) in fact for SparkR / Runtime / Streaming, but all permissions are flatten to Spark project.

If I have write permission in a repository, then I should also have write permission in other repositories within the same project, is it correct?

Yes and no. You can follow the term

"I am a committer of TiKV project, then I have write permission to repositories belongs to TiKV project, like tikv/tikv, tikv/client-rust, and so on."

"No" because we don't grant permission directly to a person in repository level.

BusyJay commented 3 years ago

...However, this concept should never be associated with permission.

Then what's point of becoming a committer/reviewer of TiKV that can't write/review any code in TiKV org?

"No" because we don't grant permission directly to a person in repository level.

For repository like rust-prometheus that is maintained by community, how to grant the write permissions for community members not in TiKV project (and any SIGs)?

tisonkun commented 3 years ago

Then what's point of becoming a committer/reviewer of TiKV that can't write/review any code in TiKV org?

Any misunderstanding? If you are a committer of TiKV "project", you have the write permission of all its repositories, including tikv/tikv, tikv/client-rust, and so on. As listed above in details of "TiKV project"

"No" because we don't grant permission directly to a person in repository level.

For repository like rust-prometheus that is maintained by community, how to grant the write permissions for community members not in TiKV project (and any SIGs)?

As listed above in details of "TiKV project", it belongs to TiKV project.

UPDATE: after an offline discussion with @BusyJay , we agree with that current SIGs will be transferred to projects. SIGs won't exist as part of community governance and thus there is no committer / reviewer of a SIG any more.

For "rust-prometheus" part, it might not belong to TiKV project and deserve its own project like Raft project for raft-rs. We will discuss in the subtasks.

tisonkun commented 3 years ago

I will then start the migration subtasks with confirmation from stakeholders. Let me know if you would like to participant in the migration committee and thus help on decision making. Or other stakeholders who should be included.

@BusyJay @yiwu-arbug @zhangjinpeng1987 @nolouch @innerr @hi-rustin @skyzh @breeswish @andylokandy @youjiali1995 @winkyao @sunxiaoguang

Rustin170506 commented 3 years ago

Other community example, Apache Spark community has expert groups (SIGs) in fact for SparkR / Runtime / Streaming, but all permissions are flatten to Spark project.

To add, we originally referenced cncf's sig architecture, and their sig is tightly tied to permissions. For example: https://github.com/kubernetes/community/tree/master/sig-network#subprojects

Rustin170506 commented 3 years ago

SIGs won't exist as part of community governance and thus there is no committer / reviewer of a SIG any more.

To clarify, we will not remove the concept of SIG directly, we will follow the concept of SIG in the implementation. The projects we are talking about now will become the new SIGs, e.g. sig-tikv, sig-raft, sig-pd.

The reasons for this I discussed with @tisonkun

what we are essentially doing is changing the permission model and merging some sigs, so it doesn't matter what the implementation is called. Our sig has been running for a long time and is bound with permissions, so we can stick with the sig concept. Instead of making huge changes.
because our current tidb and tikv communities are governed together, and introducing new concepts would lead to confusion in the community governance. Also the introduction of a new concept itself brings learning and promotion costs.
If new concepts are introduced, the whole community infrastructure will have to be changed, ,including bots, community website, and governance framework. (Introducing the new concept required a re-implementation of almost all permission-related tools and took at least 1-2 months to implement.)

tisonkun commented 3 years ago

Other community example, Apache Spark community has expert groups (SIGs) in fact for SparkR / Runtime / Streaming, but all permissions are flatten to Spark project.

To add, we originally referenced cncf's sig architecture, and their sig is tightly tied to permissions. For example: https://github.com/kubernetes/community/tree/master/sig-network#subprojects

@hi-rustin you can always regard project as a new sig, and we should stop battle on names.

As you can see k8s's subprojects are well formed in repos and subdirs. The repos part is the same as what proposed here. The subdir parts are tightly coupled with how Golang project construct codetree. It is not quite easy to use for general code repositroy.

Rustin170506 commented 3 years ago

@hi-rustin Name problem is significant.

@hi-rustin you can always regard project as a new sig, and we should stop battle on names.

🤣 Okay I will stop battle on names.

innerr commented 3 years ago

LGTM

nolouch commented 3 years ago

I very glad to see this proposal, the permission issues are also bothering me. sometimes one feature may cross the sig or the project, but permission delayed our review work. but I have some question about how to work with the new proposal:

I more agree with @hi-rustin that tidb and tikv communities should be governed together. Such as, how do we do the cross-project works?
And we may need more detailed instructions on how to obtain the corresponding permissions. Am I already a committer of the sig, that is the committer of this project? and How about the newcome?
As the above discussion, will a project be divided into many sub-projects, It's may build some wheels for the project, such as the rust-prometheus above. Will it meet the sig problem again?

skyzh commented 3 years ago

Generally looks good to me. This solves the long-lasting permission issues for us. ~~And I hope that kvproto could later be part of the TiKV project.~~

But this proposal casts doubt on how new contributors engage with our community. Who should they ask for a review after submitting PR? Previously we have sig-copr, so they could directly connect to us. Now in the new model, all of the committers and reviewers belong to the TiKV project, and this makes it unclear to know who has prior experience on a component and who can review a PR.

tisonkun commented 3 years ago

@nolouch

I more agree with @hi-rustin that tidb and tikv communities should be governed together. Such as, how do we do the cross-project works?

We first apply the governance rule on tikv org and popularize it to the tidb community for better collaboration. tidb community suffers from unclear sig-repo permission model also.

We finally reach a place that, for example, you @nolouch are both a committer of pd and tidb and thus has the permission to commit to tikv/pd and pingcap/tidb.

And we may need more detailed instructions on how to obtain the corresponding permissions. Am I already a committer of the sig, that is the committer of this project? and How about the newcome?

Yes. We will elaborate details in subtasks when we reach consensus on the overall direction described here.

For you question, with an offline discussion with @BusyJay and you can see the sentence above, we agree with that we have checkpoint (mandatory process) on "majority decision by maintainers" to promote a reviewer / committer. And every project can has its guideline (non-mandatory principle) for how to become a committer.

Please be aware that sig-scheduling, sig-coprocessor has gone, at least not associated with permission anymore.

As the above discussion, will a project be divided into many sub-projects, It's may build some wheels for the project, such as the rust-prometheus above. Will it meet the sig problem again?

Nope.

The problem of SIG on permission model is we use multiple rules on review / commit permission

sig-label
default sig
default to all sigs
trust member
trust team

After the migration we have only one rule that if and only if you are a member of projects this repository belongs to, you have the corresponding permission to the repository.

tisonkun commented 3 years ago

But this proposal casts doubt on how new contributors engage with our community. Who should they ask for a review after submitting PR? Previously we have sig-copr, so they could directly connect to us. Now in the new model, all of the committers and reviewers belong to the TiKV project, and this makes it unclear to know who has prior experience on a component and who can review a PR.

For auto assign or domain expert issue, we can make use of GitHub CODEOWNERS. And for open source community we always overcome this issue by over communication. We have a number of members and one of members notice a new PR he/she will help to dispatch.

... or if the project agrees on it, it can post an expert map as in scala community, but the map itself is a hint, not associated with permission.

@hi-rustin may have more insights on auto assign or dealing with stale pr.

Rustin170506 commented 3 years ago

The problem of SIG on permission model is we use multiple rules on review / commit permission
1. sig-label

2. default sig

3. default to all sigs

4. trust member

5. trust team
After the migration we have only one rule that if and only if you are a member of projects this repository belongs to, you have the corresponding permission to the repository.

Regarding the trust user and trust team part, I would like to know how the release team will be handled? Is it added directly to each project?

Rustin170506 commented 3 years ago

@hi-rustin may have more insights on auto assign or dealing with stale pr.

Technically we could use CODEOWNERS, but as mentioned above CODEOWNERS are very expensive to maintain.

Another possible solution would be to automatically assign them based on the history of changes to the code, but it may still not be accurate because some people just used to work on certain modules and are no longer responsible for and maintain that part of the code.

Overall, this problem gets worse when permissions are scaled up. Unless we maintain and update the owners file (either GitHub's own or k8s' solution).

But I understand the cost of maintaining CODEOWNERS and k8s owners is about the same.

tisonkun commented 3 years ago

@hi-rustin if someone helps a lot on the release of a project, he/she undoubtedly deserves as a committer.

For blocking merge on releasing branch, I think it is another mechanism and not the main stream of community affairs.

That is, the release manager coordinate the release process and cheery-pick-approved is a tool helps the release manager.

tisonkun commented 3 years ago

Overall, this problem gets worse when permissions are scaled up. Unless we maintain and update the owners file (either GitHub's own or k8s' solution).

No. For now you maintain the membership file or CODEOWNERS file in pingcap/tidb already. It doesn't get worse.

Rustin170506 commented 3 years ago

No. For now you maintain the membership file or CODEOWNERS file in pingcap/tidb already. It doesn't get worse.

I see a lot of this operation in pr, and it doesn't actually work.

Rustin170506 commented 3 years ago

@hi-rustin if someone helps a lot on the release of a project, he/she undoubtedly deserves as a committer.

Got it, so we'll add the core members of the release team to each project.

tisonkun commented 3 years ago

Got it, so we'll add the core members of the release team to each project.

I don't like this sentence. We promote people who helps a lot on the release of a project the project committer.

No "core members of the release team".

nolouch commented 3 years ago

looks good to me!

tisonkun commented 3 years ago

I'm going to break down this proposal as several subtasks now...

tisonkun commented 3 years ago

I'm going to break down this proposal as several subtasks now...

Done as

121
123
124

Audiences of this issue please take a look.

BusyJay commented 3 years ago

@winkyao do we need review from CNCF side?

winkyao commented 3 years ago

@BusyJay Governance review by maintainer is enough, I will look into these discussions later. thanks

zhangjinpeng87 commented 3 years ago

LGTM

sunxiaoguang commented 3 years ago

LGTM

winkyao commented 3 years ago

LGTM

zz-jason commented 3 years ago

+1, the proposal LGTM.

Another problem that hasn't discussed is how to efficiently manage github issues and pull requests among multiple repos for a project. For example, tikv-project has the following repos as described in the proposal, it's too hard to maintain issues pull requests on these repos:

tikv
client-c
client-py
client-go
client-java
client-rust
client-cpp
agatedb
async-speed-limit
bytes
client-validator
copr-test
crc64fast
crossbeam
fail-rs
grpc-rs
importer
jemalloc
jemallocator
minitrace-go
minitrace-rust
minstant
mock-tikv
mur3
pprof-rs
procinfo-rs
protobuf-build
raft-engine
rocksdb
rusoto
rust-prometheus
rust-rocksdb
sig-transaction
sysinfo
terraform-tikv-bench
tikv-operator
tikv.github.io
titan
tlaplus-specs
website
yatp

tisonkun commented 3 years ago

For example, tikv-project has the following repos as described in the proposal, it's too hard to maintain issues pull requests on these repos:

@zz-jason

For pull request, it is not quite hard because pr is created where code changes. The difficult part would be how to coordinate related pull requests.

For issue part, yes. From my experience in Flink community, we use a single JIRA board to trace all issues and dispatch to repos. If we directly follow this pattern, we can close all issue function on satellite repositories and open on tikv/tikv as the issue board.

But let's say it is an issue tracking and code collaboration topic which could be eased by best practice sharing on that domain. It doesn't block this topic.

If you are troubled on this kind of problem, I'd suggest you create a topic on our discussion forum and I'll be glad to discuss deeper.

tikv / community