scikit-learn / enhancement_proposals

Enhancement proposals for scikit-learn: structured discussions and rational for large additions and modifications
https://scikit-learn-enhancement-proposals.readthedocs.io/en/latest
BSD 3-Clause "New" or "Revised" License
47 stars 34 forks source link

slep000, slep workflow #30

Closed adrinjalali closed 2 years ago

adrinjalali commented 4 years ago

Triggered by the discussions we had during the Paris sprint, and and the recent discussions under https://github.com/scikit-learn/enhancement_proposals/pull/17 , I think it's time for us to have a common understanding of how a SLEP workflow should look like.

This PR heavily borrows (and copy/pastes) from PEP1 and NEP0, and adjusts where appropriate.

I believe it's a good start for us to have an easier and more enjoyable experience working on SLEPs.

cc @scikit-learn/core-devs

adrinjalali commented 4 years ago

I think this also supersedes #28.

NicolasHug commented 4 years ago

I sort of have the same comment as Joel on https://github.com/scikit-learn/enhancement_proposals/pull/28#pullrequestreview-350833793, i.e. this seems to be addressing non-issues, for the most part.

The main points seem to be:

We can probably make a much shorter SLEP that addresses these 2?

adrinjalali commented 4 years ago

@glemaitre

Once a SLEP is merged and we asked to the mailing list to vote by linking to a PR changing the status to "Accepted", we might still need to have this SLEP opened to comment in the PR.

I think people could talk about the rest of it still, and if changes are made, then it'd be on the diff. But this is due to the fact that other communities have those discussions on the mailing list, and we're having it on the PR. Numpy for instance, just says "hey, we're accepting this, unless somebody seriously objects in 7 days". We may be having too many discussions compared to other communities :P

adrinjalali commented 4 years ago

I sort of have the same comment as Joel on #28 (review), i.e. this seems to be addressing non-issues, for the most part.

It may be a non-issue for some, but it is for me, and I understand from my conversations with @glemaitre it is for him as well.

We can probably make a much shorter SLEP that addresses these 2?

you can try, this is the best/shortest I could do.

NicolasHug commented 4 years ago

I have two main concerns about the current proposal.

  1. SLEPs are very rarely perfect from the start, and there is often a significant room for improvement on the first draft(s). Things like typos, ambiguous woring, or basic re-structuration for clarity are very common, even after a few iterations (precisely because the document keeps beeing edited). Merging a draft eagerly will put the responsibility on the reviewers to address such (possibly trivial but important) comments by opening PRs. I believe this is instead something that is the PR author's responsibility (see below).

  2. The current proposal leaves the door open for a SLEP author to basically say "I {don't want to, don't feel like, don't know how to} address this comment, you can just open a PR yourself.". I believe this to be detrimental to both productivity and, in some contexts, to mutual trust. This is, in a way, dismissing a reviewer's point of view/concern, and can be perceived as "my time is more valuable than yours". I have seen this kind of comments being made a few times lately, in the SLEP repo and in the main repo. (I can provide refs if needed.) I hope we don't go further in that direction.

The common denominator to both points here is that our bottleneck is reviewer time, not developer time. I fear that this proposal will put more responsibility on reviewers than necessary.


I have the feeling that the discussion so far has been rather defensive (from all sides), here and in other PRs. I'll try to be more vigilant in my comments so they don't get interpreted as such.

I do share the frustration around writing and reviewing SLEPs. I believe however that the key to a smoother process is timely feedback, from both reviewer and author sides. If this opinion is shared, I'd be happy to open discussion for potential workflows, e.g. in the next meeting.

adrinjalali commented 4 years ago

I think that a SLEP should be merged, in draft mode, when the author considers them as ready for discussion and at least on core-dev is willing to champion this merge

I agree. It's just hard to put it in words in a short way, and I think that's why NEP000 has it as such (which this SLEP borrows). PEP1 has a much clearer (and yet similar IMO) process:

The PEP editors review your PR for structure, formatting, and other errors. For a reST-formatted PEP, PEP 12 is provided as a template. It also provides a complete introduction to reST markup that is used in PEPs. Approval criteria are:

  • It sound and complete. The ideas must make technical sense. The editors do not consider whether they seem likely to be accepted.
  • The title accurately describes the content.
  • The PEP's language (spelling, grammar, sentence structure, etc.) and code style (examples should match PEP 8 & PEP 7) should be correct and conformant. The PEP will be checked for formatting (plain text or reStructuredText) by Travis CI, and will not be approved until this passes.

Editors are generally quite lenient about this initial review, expecting that problems will be corrected by the reviewing process. Note: Approval of the PEP is no guarantee that there are no embarrassing mistakes! Correctness is the responsibility of authors and reviewers, not the editors.

I think that discussions happening on the SLEP (whether it is on the mailing list or on a PR) should be consolidated in the SLEP as much as possible. "As much as possible" means that the text of the SLEP should be kept to a reasonable length: an expected reading speed is 200 words per minute, ideally the body of a SLEP should not take longer than 10mn to read.

Agreed. In my mind an issue related to a SLEP is only resolved if the resolution is included in the SLEP. So that would kinda automatically be included. We can include it in the SLEP if you think it'd help/is necessary.

amueller commented 4 years ago

looks good. I like the PEP wording and maybe would include a shortened version of it.

I'm not sure if we should include this here, but for normal PRs it's very rare that someone merges their own PR except for in trivial cases. Should we do the same for SLEPs?

adrinjalali commented 4 years ago

I tried to revise the merge criteria, let me know how it looks.

adrinjalali commented 4 years ago

@ogrisel would you like to merge then? seems like we could call a vote on this one.

NicolasHug commented 2 years ago

Before calling for a vote, I think it could be valuable for this SLEP to describe what the review process looks like once the initial PR of a SLEP has been merged. For example, what should reviewers do if they have a question or a comment about the current text?

Also, SLEP006 has followed the proposed process here and the initial PR was merged early. A year later, this SLEP is still not voted upon and there has been a lot of back and forth between the reviewers and the authors (e.g. https://github.com/scikit-learn/enhancement_proposals/pull/55). Perhaps the authors and reviewers of SLEP006 could provide feedback on what worked and what didn't, so as to better inform the content of SLEP000.

I'll start: https://github.com/scikit-learn/enhancement_proposals/issues/47 is an example of a process that did not work at all: I had some questions and comments about SLEP006, and since the inital PR was already merged there was nowhere to make such comments, so I decided to open an issue. This issue got forgotten about and almost all my questions were left unanswered. I put time and effort in the review. Being ignored didn't feel good. It would be great if we could provide explicit guidelines to avoid such scenarios.

adrinjalali commented 2 years ago

I personally had a good experience with SLEP006's process. It's true that we didn't get to your issue and questions @NicolasHug , and that's probably because a lot of context was lost in writing and the discussions got heated at times, and we kinda left it where it was. I think what worked with @lorentzenchr and @agramfort was that we had video calls and discussed issues. We could form such small workgroups faster and work on the details, and the SLEP champion can take the responsibility of making sure those happen to move things forward.

Otherwise, I'm very happy with the way discussions have been happening in isolated spaces and not everything is discussed in a single giant long PR.

thomasjpfan commented 2 years ago

When a draft gets merged, there will likely be unresolved issues in the review. For SLEP006, @NicolasHug opened an #47 to continue the discussion. For me, I did not look at the issue tracker for unresolved issues on SLEP006.

I think we can write out the discussion workflow:

  1. Merged Draft SLEP
  2. Author of SLEP immediately opens an issue for discussion and links to discussions that were not resolved.
  3. Reviewers can add links to discussions they think were not resolved.
  4. Smaller PRs to edit the SLEP as points on the issue gets resolved.

WDYT?

NicolasHug commented 2 years ago

the discussions got heated at times

I don't think this applies to SLEP006. The only comment I made on SLEP006 prior to opening #47 is https://github.com/scikit-learn/enhancement_proposals/pull/16#issuecomment-539004470, where I'm only asking how I can help.

From my perspective, my questions and comments weren't addressed simply because it's a pain for the SLEP author to address them in an issue like this, without the context of the text (it was a pain to write the issue as well). I surely don't blame you nor Joel for letting it slip through the cracks. This is a bad workflow, and unfortunately I believe that merging early will lead to such bad workflows.

An open PR, while imperfect, at least provides such context. There might be other options to explore though, but it'd be nice to know which ones we'd like to try.

  1. Author of SLEP immediately opens an issue for discussion and links to discussions that were not resolved.

For the reasons mentioned above, I don't think this will lead to pleasant workflows for reviewers and authors.

NicolasHug commented 2 years ago

One workflow that I experimented with in other repos when I needed to review some code that was already merged, is to create a [NOMRG] PR where I inlined my remarks as comments in the code. This kept the comments localized, and most (but not all) of the context was preserved.

Still not great, but better than opening an issue from my experience.

thomasjpfan commented 2 years ago

One workflow that I experimented with in other repos when I needed to review some code that was already merged, is to create a [NOMRG] PR where I inlined my remarks as comments in the code. This kept the comments localized, and most (but not all) of the context was preserved.

I see what you mean. Here is a second proposal:

  1. Merge Draft SLEP
  2. Author of SLEP immediately opens a follow up Pull Request with quotes and links to discussion points that are not resolved.
  3. Reviewers can add their questions and discussions to the new PR.
  4. End up with a more polished SLEP and merge.

Edit: I think this is what happened with SLEP006, I opened https://github.com/scikit-learn/enhancement_proposals/pull/55 to address all the concerns from the original PR and offline discussions and the discussion continued there.

NicolasHug commented 2 years ago

Instead of merging and opening a new PR, could we simply not merge, close the draft PR, and open a new fresh one?

I would be OK with this (modifications from your previous proposal in bold):

  1. Once the SLEP becomes reasonably mature or when the PR has too many comments to keep track of, close the PR.
  2. Author of SLEP immediately opens a follow up Pull Request with review comments containing quotes and links to discussion points that are not resolved. The content of that PR is the whole SLEP, exactly as it was closed in step 1.
  3. Reviewers can add their questions and discussions to the new PR.
  4. Repeat 1-3 as long as needed. End up with a more polished SLEP and merge.
adrinjalali commented 2 years ago

I pretty much opened this PR to make sure we merge the SLEP PRs as fast as we can. I had terrible experience working on a few SLEPs as the author or a co-authors which motivated me to check the PEP and NEP processes. Both of those processes merge the PR before it's ready for acceptance, and further discussion is supposed to happen on other channels, such as a forum, mailing list, or github issues.

I understand you don't want PRs to be merged before a consensus is there and all points are addressed, but that's exactly what's made me not work on SLEPs as I was before.

Also, I don't understand why this process would work for python and numpy, but not scikit-learn.

NicolasHug commented 2 years ago

I had terrible experience working on a few SLEPs as the author

Same. I very much share your frustration here. I have a full proper SLEP that took me days to write, and that I never submitted, because of the pain from previous experiences.

I also had a terrible experience as the reviewer with this new proposed workflow. That reviewer experience just plainly made me stop even caring about SLEP006 (and to be fair, about most other SLEPs as well).

thomasjpfan commented 2 years ago

Author of SLEP immediately opens a follow up Pull Request with review comments containing quotes and links to discussion points that are not resolved. The content of that PR is the whole SLEP, exactly as it was closed in step 1.

I can create a script to automate this migration. It will open the new PR and create the review comments with quotes and links to unresolved discussion points from the previous PR.

In the end, I want to make the SLEP process more enjoyable for everyone.

amueller commented 2 years ago

Maybe we could talk to the NEP / SLEP folks to see how they structure the discussion afterwards? I think it has mostly been on mailing lists, but when I was discussing SLEP 001 here, not sure anyone ever looked, I found the lack of inline citation annoying as well.

I think the main question is, if we merge early, how do we get from there to a place where we can vote on. Basically it's breaking up the SLEP authoring into three parts, pre merge, post merge, post vote, instead of just pre merge and post vote. Hopefully breaking up the process makes it more manageable, but it still means we need to have some idea of how to go from merge to vote.

lorentzenchr commented 2 years ago

I very much share your frustration here. I have a full proper SLEP that took me days to write, and that I never submitted, because of the pain from previous experiences.

@NicolasHug That's sad. I'm sure we would profit from your thoughts if you publish them, one way or the other.

We are misusing github's code development tools for design documents. I'm wondering if, in the end, it's a question about tools. For instance, I like the early merge logic simply because I like to read rendered text instead of source code. Maybe someone can tell from experience?

I hope we find a good solution we all can agree upon as SLEPs are important to bring innovation/improvements on a larger scale.

jnothman commented 2 years ago

Closing and reopening PRs, along with a rendering (or better something like reviewnb.com for Sphinx docs?) sounds like a reasonable solution to some of the issues here. But maybe the point is that we need to clarify the problems rather than using SLEP000 as a bandaid.

NicolasHug commented 2 years ago

I too prefer having the rendered content of the SLEP when reviewing, but I just build this repo's docs locally, it only takes a few seconds

I wouldn't mind having a dedicated section on the website for in-review SLEPs, as long as the open PR still contains the entire content of the SLEP.

glemaitre commented 2 years ago

I agree with both concerns and experience exposed by @adrinjalali and @NicolasHug and I personally do not have a process in mind.

We are misusing github's code development tools for design documents.

I am wondering if @lorentzenchr does not have point there and if a Google Doc with the capacity of having the rendered version, proposing some editing, adding some comments would not be a better tool. I can see that it miss the code part (but some extensions should be available there).

NicolasHug commented 2 years ago

LGTM let's vote

At least 3 core devs have expressed concerns about the current poposal:

There also has been quite a few suggestions of alternative workflows for moving forward, e.g.:

I'm really sorry but until these are addressed in some way, I don't think this SLEP is mature enough to be voted upon.

GaelVaroquaux commented 2 years ago

I'm really sorry but until these are addressed in some way, I don't think this SLEP is mature enough to be voted upon.

The feeling at the call yesterday was in favor of voting.

NicolasHug commented 2 years ago

I wish I could have attended the meeting. Were any of the aforementioned concerns discussed? I could not find anything specific in the meeting notes.

lorentzenchr commented 2 years ago

Maybe, only maybe, we should dare calling for votes more often and see rejecting votes as opportunity for improvement. From my side, I can live with the current status of this SLEP. And we can later change based on further experience (law usually lags behind current developments).

@NicolasHug As Gael said, in the meeting yesterday, we had the feeling that we should vote without specific technical points being discussed (therefore, you won't find anything). All the points you list are valid concerns. If I were the champion of this slep, however, I would not know how to improve the current status based on those concerns.

thomasjpfan commented 2 years ago

After sleeping it over it, I think it is important to try to get consensus on this specific SLEP because it is about SLEP workflow. We can try @NicolasHug's suggestion at https://github.com/scikit-learn/enhancement_proposals/pull/30#issuecomment-982815555 on this PR to see if it improves the workflow.

I can help finish up addressing the concerns there and then push for a vote. Does that sound reasonable?

adrinjalali commented 2 years ago

This PR already has a few approvals, and as we discussed on the meeting, that counts as seconds to my call to vote. Therefore I'm merging this PR, and will call for a vote on the mailing list with an accompanying PR where we vote. As the author of the SLEP, I'm strongly opposed to the changes suggested in https://github.com/scikit-learn/enhancement_proposals/pull/30#issuecomment-982815555, and in fact, as already explained, moving away from that process is the main reason I drafted this SLEP in the first place.

I'm happy to have improvements to this SLEP later, which will definitely happen. As to the main spirit of the suggested process, I think since both PEP and NEP follow the same process, we should try to do the same. If it's worked for them, it can work for us.

NicolasHug commented 2 years ago

I'm strongly opposed to the changes suggested in #30 (comment), and in fact, as already explained, moving away from that process is the main reason I drafted this SLEP in the first place.

There's no "moving away" from the process proposed in https://github.com/scikit-learn/enhancement_proposals/pull/30#issuecomment-982815555: we have never implemented this process before.

We have however implemented the process suggested in this SLEP already (despite it not being merged nor voted upon), and from a personal perspective I can't say that it has been a tremendous success.

Would you mind detailing precisely what parts of https://github.com/scikit-learn/enhancement_proposals/pull/30#issuecomment-982815555 you are against? This SLEP currently describes why our "current" workflow is less than optimal, but I haven't seen any specific feedback on https://github.com/scikit-learn/enhancement_proposals/pull/30#issuecomment-982815555.

As a meta comment, the way we're moving forward here makes me loose confidence in our consensus seeking process.

GaelVaroquaux commented 2 years ago

but I haven't seen any specific feedback on #30 (comment).

Some people have expressed disagreement on this comment.

the way we're moving forward here makes me loose confidence in our consensus seeking process.

Consensus sometimes requires compromise and accepting some things that we disagree with. I remember really struggling with such acceptance a few years ago, but now I feel that the group is stronger than the individual and if some decisions were really bad, many people in a talent group would see them as such. So a persisting disagreement probably reveals a grey zone.

NicolasHug commented 2 years ago

but I haven't seen any specific feedback on #30 (comment).

Some people have expressed disagreement on this comment.

I haven't been able to find anything specific on https://github.com/scikit-learn/enhancement_proposals/pull/30#issuecomment-982815555. I've seen disagreement on our current process (which I share), but it is very different from what is proposed in https://github.com/scikit-learn/enhancement_proposals/pull/30#issuecomment-982815555.

ogrisel commented 2 years ago

@NicolasHug feel free to open a PR to edit the SLEP with what you suggest in: https://github.com/scikit-learn/enhancement_proposals/pull/30#issuecomment-982815555 to try to move forward and have a focused discussion on this.

NicolasHug commented 2 years ago

Thanks for the suggestion @ogrisel .

Before I do that, I would prefer hearing opposing thoughts on https://github.com/scikit-learn/enhancement_proposals/pull/30#issuecomment-982815555 first. There is little value in submitting a PR while we know there are strong disagreements that haven't been explicitly stated yet.

I will also wait until the end of the vote, as I'd prefer not doing this work for nothing.

ogrisel commented 2 years ago

Personally I think the SLEP in its current state is better than no SLEP at all. I don't have an strong feeling on how to improve the github commenting workflow on SLEPs. Maybe the close/reopen strategy can work, but I think that the author should take the time to summarize unresolved discussion points in the description of the new SLEP PR or as inline comments.

ogrisel commented 2 years ago

But also I think it's fine to close/reopen Draft SLEP PRs with a summary of the open discussion points without having to specify this process in details in SLEP 000.

NicolasHug commented 2 years ago

the way we're moving forward here makes me loose confidence in our consensus seeking process.

Consensus sometimes requires compromise and accepting some things that we disagree with. I remember really struggling with such acceptance a few years ago, but now I feel that the group is stronger than the individual and if some decisions were really bad, many people in a talent group would see them as such. So a persisting disagreement probably reveals a grey zone.

Gaël, I'm happy to compromise and I'm OK with accepting things that I disagree with. What is hard to swallow for me here is the lack of acknowledgement on the concerns we raised, and the lack of actionable feedback regarding the alternative proposals. We're just getting a rejection with no details, without being given any opportunity for improvement. A wise man once said make sure everyone feels heard. Sadly, I don't feel heard at all here. And the decision to merge this PR while it was still being actively discussed makes me believe that any potential future efforts will be met with the same indifference.

GaelVaroquaux commented 2 years ago

What is hard to swallow for me here is the lack of acknowledgement on the concerns we raised, and the lack of actionable feedback regarding the alternative proposals. We're just getting a rejection with no details, without being given any opportunity for improvement.

I don't have a stake in this game. My impression at the meeting yesterday was that people felt that the SLEP could be improved, but that it was already an improvement on the existing and wanted to call a vote for this reason.

I did not feel a judgement on the specific concerns, more a wish to iterate. Things take a long time in scikit-learn :)

Also, I feel bad as the flag-bearer here, given that I am trying to express what I understood yesterday.

GaelVaroquaux commented 2 years ago

One last comment: I also resonate with a variety of comments that have been made. However, I'm feeling that the benefits of trying to reach a decision exceed those of further improvements, because these further improvements can come on top.

I also think that with the current tooling there is not silver bullet.