cockroachdb / docs

CockroachDB user documentation
https://cockroachlabs.com/docs
Creative Commons Attribution 4.0 International

Update release notes script to remove names of CRL team member contributors #8030

Open lnhsingh opened 4 years ago

lnhsingh commented 4 years ago

Lauren Singh commented:

In release notes, we don't include the names of contributors who are CRL team members. Currently, removing them from the Contributors section is a manual step. We should probably automate this.

@yingzhuchin mentioned that we might be able to do this reliably since the Security teams recently locked down GitHub permissions and might maintain a source of truth of CRL employees on GitHub. If possible, it would just mean getting that list and feeding it into the script.

cc @knz, who might also have ideas on how to make the script more reliably filter out CRL employees.

Jira Issue: DOC-645

knz commented 4 years ago

Unfortunately, the "security list" is no good: it gives us GitHub usernames and CRL e-mail addresses, but neither is what appears in commit messages (folk use their non-CRL contact info in git commits, and GitHub usernames are not present in the git history anyway).

The filtering is based on the AUTHORS file at the root of the repository. It is supposed to be kept up to date by having new employees add their own info to it. If they don't, someone else has to add them.

In the past, I was manually adding missing folk in there every 6 months.

I could see a world where the doc writer who sees someone mishandled by the script takes the initiative to add the missing data to AUTHORS manually prior to running the script.

The rule is simple:

  1. if someone uses their CRL e-mail address in-clear in the git history, add that as-is to AUTHORS
  2. if someone uses their non-CRL e-mail address in-clear, then add that as-is to AUTHORS, and also add the text "<@cockroachlabs.com>" (no email prefix) to the line.

    This indicates that they are a team member, but does not reveal their CRL address (presumably, they wish to keep it off github)

  3. if someone uses an anonymized address in the git history (that's an option), then add the anonymized address in AUTHORS, and also add <@cockroachlabs.com> as explained above.

Here are three examples:

Aaron Blum <aaron@cockroachlabs.com>
Aditya Maru <adityamaru@gmail.com> <@cockroachlabs.com>
Tim O'Brien <38867162+tim-o@users.noreply.github.com> tim-o <38867162+tim-o@users.noreply.github.com> <@cockroachlabs.com>

What this tells us:

  1. Aaron is a team member and uses his CRL address as-is.
  2. Aditya uses his own address as-is, wants to keep his CRL address private, and is marked as a CRL team member.
  3. Tim wants to keep all e-mail addresses private, and is marked as a CRL team member.
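To make the rule concrete, here is a minimal sketch of how an AUTHORS line could be read under the three cases above. It is not the code in release-notes.py; `parse_authors_line`, `ADDRESS_RE`, and `CRL_DOMAIN` are hypothetical names chosen for illustration, and the sketch assumes each entry sits on one line with every address wrapped in `<...>`.

```python
import re

# Matches every <...> group on an AUTHORS line, including the bare
# "<@cockroachlabs.com>" marker that carries no local part.
ADDRESS_RE = re.compile(r"<([^<>]*)>")
CRL_DOMAIN = "@cockroachlabs.com"  # assumption: the only domain that marks a team member


def parse_authors_line(line):
    """Return (name, emails, is_crl) for one AUTHORS line (hypothetical helper)."""
    addresses = ADDRESS_RE.findall(line)
    # The display name is whatever precedes the first "<".
    name = line.split("<", 1)[0].strip()
    # Rule 1 (a full CRL address) and rules 2-3 (the bare marker) both end
    # in the CRL domain, so a single suffix check covers all three cases.
    is_crl = any(a.lower().endswith(CRL_DOMAIN) for a in addresses)
    # Drop the bare marker: it is not an address that can appear in git history.
    emails = [a for a in addresses if not a.startswith("@")]
    return name, emails, is_crl
```

Run against an entry that lists only non-CRL or anonymized addresses and no `<@cockroachlabs.com>` marker, `is_crl` comes back `False`, so a filter built this way would have no reason to treat that person as a team member.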

Here's an example broken entry:

Ryan Kuo <8740013+taroface@users.noreply.github.com> taroface <ryankuo@gmail.com>

(This entry should also include <@cockroachlabs.com>.)

yzdocs commented 4 years ago

Thanks @knz! Very helpful.

exalate-issue-sync[bot] commented 2 years ago

Ian Evans (ianjevans) commented: When I generated the v21.2.2 release notes I used the --hide-crdb-folk option, but the raw output included Cockroach Labs team members.

- Alex Santamaura (first-time contributor, CockroachDB team member)

You can reproduce it with this command:

/Users/ian/.pyenv/shims/python3 /var/folders/r5/7q2jzh4n0rj5lgcwc47st58m0000gp/T/cockroach/scripts/release-notes.py --from=d14d5f5d3a47b97cae149a43c590708008d2b5d3 --until=af7f257bc39988e6d98d516d7fe7cb842a22ad67 --prod-release --one-line --hide-unambiguous-shas --hide-per-contributor-section --hide-crdb-folk

So something is off with the --hide-crdb-folk option and/or the step that calculates first-time contributors. It clearly should have hidden Alex, David, and Rima. But Jane is also in AUTHORS, and Andrew Werner is on here by his username.

knz commented 2 years ago

It's possible that some of these folk have used e-mail addresses that are not yet in the AUTHORS file. We should use git log to inspect manually what happened for at least one of these names. Otherwise, we can spend some time together to troubleshoot.
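One way to do that inspection is sketched below: list the author identities in the commit range and flag any whose e-mail address does not appear in AUTHORS. This is only a rough aid, not part of release-notes.py. The function names are hypothetical, the `from..until` range is an assumption about how the script's `--from` and `--until` map onto git, and only the author field is checked (not committers or co-author trailers).

```python
import subprocess


def parse_log_identities(log_output):
    """Turn `git log --format='%an <%ae>'` output into a set of (name, email)."""
    identities = set()
    for line in log_output.splitlines():
        line = line.strip()
        if not line.endswith(">") or "<" not in line:
            continue  # skip blank or unexpected lines
        # Split on the last "<" so names containing "<" do not confuse us.
        name, _, email = line[:-1].rpartition("<")
        identities.add((name.strip(), email.strip().lower()))
    return identities


def unknown_identities(identities, authors_text):
    """Return, sorted, the identities whose e-mail is not listed in AUTHORS."""
    known = authors_text.lower()
    return sorted(i for i in identities if "<" + i[1] + ">" not in known)


def commit_identities(repo_dir, from_sha, until_sha):
    """Collect author identities for commits in from_sha..until_sha."""
    out = subprocess.run(
        ["git", "log", "--format=%an <%ae>", f"{from_sha}..{until_sha}"],
        cwd=repo_dir, check=True, capture_output=True, text=True,
    ).stdout
    return parse_log_identities(out)
```

Calling `commit_identities` with a local checkout and the two SHAs from the command above, then passing the result and the contents of AUTHORS to `unknown_identities`, would list the addresses worth a closer look.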