GenericMappingTools / pygmt

A Python interface for the Generic Mapping Tools.
https://www.pygmt.org
BSD 3-Clause "New" or "Revised" License
765 stars 219 forks source link

Discuss and document the Pull Request review process #1113

Open seisman opened 3 years ago

seisman commented 3 years ago

During the release v0.1.0 to v0.2.1, @weiji14 and I were the only active maintainers of the project. We usually submitted PRs, waited for approval from the other one, and then merged PRs immediately after approval.

Now the team is growing quickly, and we have more maintainers and contributors commenting on PRs, so we usually do NOT merge PRs immediately after one approval. The review process helps make the project better, but also becomes longer. I'm not sure if we have reached a consensus about when a PR should be merged and who should push the merge button (currently most PRs were merged by @weiji14 and me). New contributors may also be confused why their PRs are already approved but still not merged.

So I feel we should discuss the PR review process and document it in the contributing/maintenance guides.

Here is the PR review process I think we're loosely following:

Thoughts and comments on the review process?

maxrjones commented 3 years ago

Thanks for starting this conversation! In addition to the aspects you mentioned, I would find it helpful to get some clarity on people's preferences for if/when to request specific reviewers (rather than generally marking 'ready for review'). Only people with write access can request reviewers, so this isn't relevant for all PRs. Still, the review request process for PyGMT confused me at first regarding whether unsolicited reviews on PRs are welcome (hopefully they are). I guess my opinion is that specific reviewer requests are only needed for specialized PRs (e.g., early development on dvc, workflows) and that more general docs/code PRs don't require reviewer requests (especially since they seem to always get reviewed really fast anyways).

The self-assign functionality for issues could be a worthwhile option to mention in the maintenance guide for people with write access. At least I find this really useful for keeping track of my tasking - not sure how others feel about it.

maxrjones commented 3 years ago

Merge the PR if approved by X reviewers or no further comments after X hours

I am skeptical that one rule could apply to all PRs. For example, I would prefer to see more opportunity for people to comment if #379 gets a 'final review call' (order days) since it's a outward facing change, while I would really be fine if #1110 were merged immediately after one approval.

weiji14 commented 3 years ago

who should push the merge button (currently most PRs were merged by @weiji14 and me)

Just on this, I think that people with maintenance permissions should get into the habit of merging in their own Pull Requests, once it is approved and past the 24hr threshold or so.

The self-assign functionality for issues could be a worthwhile option to mention in the maintenance guide for people with write access. At least I find this really useful for keeping track of my tasking - not sure how others feel about it.

This is actually a good idea. It helps people to keep track at https://github.com/pulls/assigned on what they have to do. How about we make it a habit to assign 2 people to each PR (the person who opened it, and one main reviewer). Other people can jump in and give (nice) unsolicited reviews of course, but it will help manage the chaos a bit better.

I guess my opinion is that specific reviewer requests are only needed for specialized PRs (e.g., early development on dvc, workflows) and that more general docs/code PRs don't require reviewer requests (especially since they seem to always get reviewed really fast anyways).

Yes, there are some highly specialized core backend stuff around the GitHub Actions workflow, GMT C API, etc, but I think we should still encourage and train people to review all parts of the PyGMT codebase. Sometimes it's just a matter of asking a question, why does this piece of magic work? This is necessary to make PyGMT more sustainable as a project.

willschlitzer commented 3 years ago

I think it would be a good idea to have some sort of delegation to individuals approving PRs. I know I'm hesitant to merge a PR (of the few I've done) without getting @weiji14 or @seisman to sign off on it first. I think it would be good to delineate the roles for the different maintainers; something like certain maintainers handle documentation-related PRs, others handle wrapping new functions, some handle backend changes, etc. I think providing "ownership" of a certain subset of the PRs is a good way to make more inexperienced contributors comfortable with who is reviewing their work, and also gives "ownership" to the maintainers for a certain set of PRs.

maxrjones commented 3 years ago

a good way to make more inexperienced contributors comfortable with who is reviewing their work

A solution here is to publish the list of people who are trusted by the PyGMT project with the privilege and responsibility of merging pull requests (i.e., maintainers and admins combined) - similar to the numpy team gallery. Anyone can review a pull request, but (at least one) of these people are needed to approve for merging.

maxrjones commented 3 years ago

Here's a couple examples of review guidelines for inspiration:

https://numpy.org/devdocs/dev/reviewer_guidelines.html https://tljh.jupyter.org/en/latest/contributing/code-review.html

Does anyone know of any other examples?

My opinion is that we are reaching the limits of what can be easily digestible in a markdown file and should consider sphinx for governance/maintenance documentation.

weiji14 commented 3 years ago

Lots of good ideas here! There seems to be two main themes flowing through this thread:

  1. What the contributor cares about: The Pull Request review process itself, which is the original topic of this issue, and this should be documented in CONTRIBUTING.md (or a more detailed page like the NumPy example, or as in Xarray).

  2. What the maintainer cares about: Who do we assign to do the code review? GitHub has a way of automating code review assignments which we can make use of, originally developed from https://pullpanda.com/assigner.

I suggest that we split up the task into these two different streams. What do people think? And who wants to be in charge of each sub-task?

core-man commented 3 years ago

The self-assign functionality for issues could be a worthwhile option to mention in the maintenance guide for people with write access. At least I find this really useful for keeping track of my tasking - not sure how others feel about it.

Woow~ I even don't know it. This is a really great functionality (Thanks @meghanrjones ). When I look through the PR and Issue list, I don't know which maintainer or reviewer is reviewing it until I click that PR/Issue. I think this can help PyGMT maintainers/contributors know which PR still needs help. For example, If a PR is assigned to @meghanrjones, we will know she is willing to be in charge of it (sounds like an editor of a scientific journal). Other maintainers can still help this PR, while they can also choose to spend their time for other PRs.

It will be great if this self-assign functionality and "needs review"/"final review call" labels are combined. We will know who is in charge of a PR and if that PR needs more reviewers.

This is actually a good idea. It helps people to keep track at https://github.com/pulls/assigned on what they have to do. How about we make it a habit to assign 2 people to each PR (the person who opened it, and one main reviewer). Other people can jump in and give (nice) unsolicited reviews of course, but it will help manage the chaos a bit better.

Sounds great.


I am skeptical that one rule could apply to all PRs. For example, I would prefer to see more opportunity for people to comment if #379 gets a 'final review call' (order days) since it's a outward facing change, while I would really be fine if #1110 were merged immediately after one approval.

I agree with @meghanrjones. Some simple PRs can be merged if one maintainer approves it and think they don't need additional review.

I think it would be a good idea to have some sort of delegation to individuals approving PRs. I know I'm hesitant to merge a PR (of the few I've done) without getting @weiji14 or @seisman to sign off on it first. I think it would be good to delineate the roles for the different maintainers; something like certain maintainers handle documentation-related PRs, others handle wrapping new functions, some handle backend changes, etc. I think providing "ownership" of a certain subset of the PRs is a good way to make more inexperienced contributors comfortable with who is reviewing their work, and also gives "ownership" to the maintainers for a certain set of PRs.

I think all the current PyGMT maintainers know if one PR can be merged without one more approval. If the maintainer in charge of one PR wants an additional review, they now can use "needs review" or "final review call" labels.

core-man commented 3 years ago

I suggest that we split up the task into these two different streams. What do people think? And who wants to be in charge of each sub-task?

Sounds great~

  1. What the contributor cares about: The Pull Request review process itself, which is the original topic of this issue, and this should be documented in CONTRIBUTING.md (or a more detailed page like the NumPy example, or as in Xarray).

I am still a little confused after I submit a PR. They are also directly related to the above topic about maintainer assignment.

  1. Whether shall I request one or more reviewers? If yes, who (usually there are some suggested reviewers on the right sidebar)? Currently, I will let the PR as a draft and wait, because some maintainers/contributors always are willing to help without request :smile: :smile: :smile:
  2. Should I always make a PR as a draft until it is ready for the final review? Some maintainers will help make it as a draft to save ci. So currently, I will make it a draft myself.
weiji14 commented 3 years ago
  1. What the contributor cares about: The Pull Request review process itself, which is the original topic of this issue, and this should be documented in CONTRIBUTING.md (or a more detailed page like the NumPy example, or as in Xarray).

I am still a little confused after I submit a PR. They are also directly related to the above topic about maintainer assignment.

1. Whether shall I request one or more reviewers? If yes, who (usually there are some suggested reviewers on the right sidebar)? Currently, I will let the PR as a draft and wait, because some maintainers/contributors always are willing to help without request smile smile smile

2. Should I always make a PR as a draft until it is ready for the final review? Some maintainers will help make it as a draft to save ci. So currently, I will make it a draft myself.

@core-man, how about you tackle this 1st task with me (since you're a bit newer and probably closer to this). I'll open up a PR to start (and assign the two of us). Just found a really good resource https://github.com/joho/awesome-code-review, but let's discuss more details in the PR itself (and others are welcome to chip in too).

On the 2nd task, I think @meghanrjones and @seisman should handle it because GitHub's code review assignment settings (https://docs.github.com/en/github/setting-up-and-managing-organizations-and-teams/managing-code-review-assignment-for-your-team) seem to require admin permissions to the GenericMappingTools organization to set up. And I know Meghan had some thoughts on who gets to review what at https://github.com/GenericMappingTools/gmt/discussions/4732#discussioncomment-481109.

maxrjones commented 3 years ago
  1. What the contributor cares about: The Pull Request review process itself, which is the original topic of this issue, and this should be documented in CONTRIBUTING.md (or a more detailed page like the NumPy example, or as in Xarray).

I am still a little confused after I submit a PR. They are also directly related to the above topic about maintainer assignment.

1. Whether shall I request one or more reviewers? If yes, who (usually there are some suggested reviewers on the right sidebar)? Currently, I will let the PR as a draft and wait, because some maintainers/contributors always are willing to help without request smile smile smile

2. Should I always make a PR as a draft until it is ready for the final review? Some maintainers will help make it as a draft to save ci. So currently, I will make it a draft myself.

@core-man, how about you tackle this 1st task with me (since you're a bit newer and probably closer to this). I'll open up a PR to start (and assign the two of us). Just found a really good resource https://github.com/joho/awesome-code-review, but let's discuss more details in the PR itself (and others are welcome to chip in too).

On the 2nd task, I think @meghanrjones and @seisman should handle it because GitHub's code review assignment settings (https://docs.github.com/en/github/setting-up-and-managing-organizations-and-teams/managing-code-review-assignment-for-your-team) seem to require admin permissions to the GenericMappingTools organization to set up. And I know Meghan had some thoughts on who gets to review what at GenericMappingTools/gmt#4732 (reply in thread).

I'm not really clear on what the second task is. Is it to set up automated code review assignments? Are we sure that's helpful here? We don't have all that many maintainers and the amount of time that any given maintainer has available for PyGMT reviews could fluctuate without us or the routing algorithms knowing. My opinion is that maintainers should just assign themselves to PRs that they are wiling to review and take responsibility for, at least for now.

Thanks for sharing that resource - lots of good material there.

maxrjones commented 3 years ago

Ah, I see that these are probably the two tasks:

  1. What the contributor cares about: The Pull Request review process itself, which is the original topic of this issue, and this should be documented in CONTRIBUTING.md (or a more detailed page like the NumPy example, or as in Xarray).

  2. What the maintainer cares about: Who do we assign to do the code review? GitHub has a way of automating code review assignments which we can make use of, originally developed from https://pullpanda.com/assigner.

Sorry, I was a bit confused the quote of @core-man's two questions. If this is right, sounds great and I can work on the second with @seisman.

weiji14 commented 3 years ago

I'm not really clear on what the second task is. Is it to set up automated code review assignments? Are we sure that's helpful here? We don't have all that many maintainers and the amount of time that any given maintainer has available for PyGMT reviews could fluctuate without us or the routing algorithms knowing. My opinion is that maintainers should just assign themselves to PRs that they are wiling to review and take responsibility for, at least for now.

You're right, we're probably not at that stage yet with many (>10) maintainers. I think Will had a point in https://github.com/GenericMappingTools/pygmt/issues/1113#issuecomment-805275155 that we should have some sense of ownership on different parts of the code. But I'll leave it up to you to decide whether to use a manual, semi-automated or fully automated PR reviewer assignment method. Just need to document this assignment process somewhere, and feel free to turn MAINTENANCE.md into a fully fledged Sphinx page while you're at it.

maxrjones commented 3 years ago

I'm not really clear on what the second task is. Is it to set up automated code review assignments? Are we sure that's helpful here? We don't have all that many maintainers and the amount of time that any given maintainer has available for PyGMT reviews could fluctuate without us or the routing algorithms knowing. My opinion is that maintainers should just assign themselves to PRs that they are wiling to review and take responsibility for, at least for now.

You're right, we're probably not at that stage yet with many (>10) maintainers. I think Will had a point in #1113 (comment) that we should have some sense of ownership on different parts of the code. But I'll leave it up to you to decide whether to use a manual, semi-automated or fully automated PR reviewer assignment method. Just need to document this assignment process somewhere, and feel free to turn MAINTENANCE.md into a fully fledged Sphinx page while you're at it.

Sounds good. I could work on this in a couple days after finishing some core GMT stuff (I swear, we're close on 6.2), but if @seisman or anyone else wants to start on documenting this before then that is fine by me.

weiji14 commented 3 years ago

Sounds good. I could work on this in a couple days after finishing some core GMT stuff (I swear, we're close on 6.2), but if @seisman or anyone else wants to start on documenting this before then that is fine by me.

Yes, GMT 6.2 is the top priority, no pressure though.

core-man commented 3 years ago

@core-man, how about you tackle this 1st task with me (since you're a bit newer and probably closer to this). I'll open up a PR to start (and assign the two of us). Just found a really good resource https://github.com/joho/awesome-code-review, but let's discuss more details in the PR itself (and others are welcome to chip in too).

Okay~ :smile:

seisman commented 3 years ago

Lots of great discussions here!

I agree that assignment is a good idea.

I think Will had a point in #1113 (comment) that we should have some sense of ownership on different parts of the code. But I'll leave it up to you to decide whether to use a manual, semi-automated or fully automated PR reviewer assignment method.

I'm in favor of the manual PR reviewer assignment, based on maintainers' own availability and interests.

  • Should I always make a PR as a draft until it is ready for the final review? Some maintainers will help make it as a draft to save ci. So currently, I will make it a draft myself.

Yes, I once marked some PRs as drafts to save CI resources. I think it causes more confusion to contributors and maintainers. People should start a draft PR first, and can mark it as "ready for review" at any time.

michaelgrund commented 3 years ago

I followed the discussion only from the sideline so far, sorry. However, I agree with most given suggestions and answers to make the whole review process easier and more powerful. On the other side, I wouldn't cast everything in stone to allow at least some flexibility.