apache / arrow-site

Mirror of Apache Arrow site

Add GitHub Id to list of committers.yml #338

Open raulcd opened 1 year ago

raulcd commented 1 year ago

The committers.yml file on the arrow-site repository contains information about the PMC and committers for the Arrow project. The current information is:

- name: committer name
  role: VP/PMC/committer
  alias: ASF alias
  affiliation: Work affiliation

We have implemented a PR automation workflow on the Arrow repository to better keep track of the state of PRs. We assumed the alias in the committers file was the GitHub id rather than the ASF id. In general this works because many members use the same ASF id as their GitHub id, but in some cases we still fail to identify committers correctly. I would like to add each committer's GitHub id to the list, similar to contributors.yml:

- name: committer name
  role: VP/PMC/committer
  alias: ASF alias
  githubId: The github ID
  affiliation: Work affiliation
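
A minimal sketch of how the extra field could let the PR automation resolve a GitHub login to a role. It assumes the file has already been parsed into a list of dicts; the sample entries and the helpers `build_github_index` and `role_for` are hypothetical and not taken from the Arrow workflow.

```python
# Hypothetical entries, shaped like the proposed committers.yml records.
COMMITTERS = [
    {"name": "Example One", "role": "PMC", "alias": "exone",
     "githubId": "example-one-gh", "affiliation": "Example Corp"},
    {"name": "Example Two", "role": "committer", "alias": "extwo",
     "affiliation": "Example Org"},  # no githubId recorded yet
]


def build_github_index(committers):
    """Map a lowercased GitHub login to its committer entry."""
    index = {}
    for entry in committers:
        # Prefer the explicit githubId; fall back to the ASF alias, which is
        # the assumption the automation makes today.
        login = entry.get("githubId") or entry.get("alias")
        if login:
            index[login.lower()] = entry
    return index


def role_for(login, index):
    """Return the role for a GitHub login, or None if it is not listed."""
    entry = index.get(login.lower())
    return entry["role"] if entry else None


INDEX = build_github_index(COMMITTERS)
print(role_for("Example-One-GH", INDEX))  # PMC
print(role_for("exone", INDEX))           # None
print(role_for("extwo", INDEX))           # committer
```

The second lookup returns `None` because `exone` is only the ASF alias of an entry whose GitHub id differs, which is the case the alias-only assumption gets wrong.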

raulcd commented 1 year ago

@alamb @kou does this sound reasonable to you?

assignUser commented 1 year ago

Might also be useful for other things to have this mapping available :+1:

alamb commented 1 year ago

> @alamb @kou does this sound reasonable to you?

Yes -- that would be super helpful. Thank you @raulcd

Related to this, I think someone (@bkmgit maybe?) was looking into potentially scraping this information from the ASF phonebook (which has github name and other things)

Committers https://people.apache.org/phonebook.html?unix=arrow

PMC https://people.apache.org/phonebook.html?ctte=arrow

To get the GitHub user data seeded, perhaps you could write a script that scrapes the ASF phonebook and updates committers.yml 🤔

alamb commented 1 year ago

According to https://people.apache.org/phonebook-about.html

All the data is in https://whimsy.apache.org/public/public_ldap_groups.json -- maybe we don't even have to do any scraping (just transform to yml) 🤔
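
A rough sketch of the "just transform to yml" idea. The JSON layout used here (a top-level `groups` object whose values carry a `roster` list) and the group names `arrow` and `arrow-pmc` are assumptions that should be checked against the real file. The sample data is made up, and `roster_to_entries` and `to_yaml` are hypothetical helpers. Only the ASF alias and role are emitted; the other fields would still have to come from somewhere else.

```python
import json

# Made-up sample in the assumed shape of public_ldap_groups.json.
SAMPLE = """
{
  "groups": {
    "arrow": {"roster": ["asfid2", "asfid1"]},
    "arrow-pmc": {"roster": ["asfid1"]}
  }
}
"""


def roster_to_entries(data, committers_group="arrow", pmc_group="arrow-pmc"):
    """Build one entry per ASF id, marking PMC members by group membership."""
    groups = data["groups"]
    pmc = set(groups.get(pmc_group, {}).get("roster", []))
    committers = groups.get(committers_group, {}).get("roster", [])
    return [
        {"role": "PMC" if alias in pmc else "committer", "alias": alias}
        for alias in sorted(committers)
    ]


def to_yaml(entries):
    """Emit the flat records as YAML text without a YAML library."""
    lines = []
    for entry in entries:
        lines.append(f"- role: {entry['role']}")
        lines.append(f"  alias: {entry['alias']}")
    return "\n".join(lines) + "\n"


ENTRIES = roster_to_entries(json.loads(SAMPLE))
print(to_yaml(ENTRIES), end="")
```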

kou commented 1 year ago

Yes. This sounds reasonable.

How about renaming alias to asf and githubId to github?

- name: committer name
  role: VP/PMC/committer
  asf: The ASF ID
  github: The GitHub ID
  affiliation: Work affiliation

(If we want to include ID in the key names, I prefer the asf_id/github_id style over the asfId/githubId style because Jekyll uses the xxx_yyy style.)
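
If the rename goes ahead, existing entries could be migrated mechanically. The following is only a sketch over already-parsed entries; `RENAMES` and `rename_keys` are hypothetical names, and anything that reads the old keys would need the same update.

```python
# Old key -> new key, following the naming suggested above.
RENAMES = {"alias": "asf", "githubId": "github"}


def rename_keys(entry, renames=RENAMES):
    """Return a copy of the entry with keys renamed and field order kept."""
    return {renames.get(key, key): value for key, value in entry.items()}


# Hypothetical entry in the old layout.
OLD = {"name": "Example One", "role": "PMC", "alias": "exone",
       "githubId": "example-one-gh", "affiliation": "Example Corp"}
NEW = rename_keys(OLD)
print(list(NEW))  # ['name', 'role', 'asf', 'github', 'affiliation']
```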

pitrou commented 1 year ago

When doing this, please make sure the check_committers.py script remains functional.

bkmgit commented 1 year ago

The GitHub Id and affiliation are not in the public Apache phonebook or on whimsy.apache.org. Maybe these should be added manually? A reasonable default is to check for a GitHub id matching the Apache ID and leave the affiliation blank/not stated.

bkmgit commented 1 year ago

It seems reasonable to do some matching based on members of the Apache organization on GitHub or on the commit history. Some people may just use GitBox and so may not have a GitHub id.
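
The matching could look something like the sketch below. It assumes the set of GitHub logins has already been collected elsewhere (for example from the commit history), and `propose_github_ids` is a hypothetical helper. A match is only a candidate: the same string can belong to a different person on GitHub, so each one would still need manual confirmation.

```python
def propose_github_ids(asf_ids, known_logins):
    """Suggest a GitHub login for each ASF id that matches one exactly
    (ignoring case); ids without a match map to None."""
    by_lower = {login.lower(): login for login in known_logins}
    return {asf_id: by_lower.get(asf_id.lower()) for asf_id in asf_ids}


# Hypothetical inputs.
ASF_IDS = ["exone", "extwo", "exthree"]
KNOWN_LOGINS = {"ExOne", "someone-else"}

CANDIDATES = propose_github_ids(ASF_IDS, KNOWN_LOGINS)
print(CANDIDATES)  # {'exone': 'ExOne', 'extwo': None, 'exthree': None}
```

Entries left as `None` cover both people whose GitHub id differs from their ASF id and people with no GitHub id at all, so they stay blank until filled in by hand.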