python / devguide

The Python developer's guide
https://devguide.python.org/
Creative Commons Zero v1.0 Universal
1.86k stars 781 forks source link

Explain requirements to convert someone else's patch on b.p.o to a PR #195

Closed Mariatta closed 7 years ago

Mariatta commented 7 years ago

There's a note here that says: You can still upload a patch to bugs.python.org, but the new pull request workflow is preferred.

While patches can still be uploaded to b.p.o, in the end we still ask people to create a PR based on that patch anyway. Maybe this note should be removed?

brettcannon commented 7 years ago

We ask, but people can still say "no" so I think the comment is fine and should stay as we technically can still accept patches the old-fashioned way (which core devs bother with them is another question). It could be made stronger, though, by saying the new workflow "is strongly preferred".

ncoghlan commented 7 years ago

We don't want to make a GitHub account mandatory for contributions, so "post a patch to the issue tracker and ask someone else to submit it as a PR" is still a supported option.

The note should probably state that, and also request that when this is the case, folks explicitly grant permission on the issue tracker for others to take their patch and turn it into a PR - otherwise it's likely to linger indefinitely on the assumption that they're going to turn it into a PR themselves.

Mariatta commented 7 years ago

So what's the 'rule' about signing CLA when someone submitted a patch on the bug tracker, and later asked another person to prepare the PR? Both need to sign the CLA? Or just the original author? Thanks.

louisom commented 7 years ago

Here is a similar situation to @Mariatta describe: In msg293523 Ben says that he/she will not using GitHub because of GitHub's ToS. And brainmay ship his patch from b.p.o to GitHub here, which GitHub consider is a signed-off work (there is two icon above commit history).

DimitrisJim commented 7 years ago

My guess would be they both sign the CLA, that seems like the safest and least complicating route.

ncoghlan commented 7 years ago

Copyright (if any) would accrue to the original patch author, so the main requirement would be for them to have signed it.

However, my guess is that anyone keen enough to be transcribing other people's patches into PRs will also have signed the CLA, so a "both should have signed it" requirement is likely to be effectively the same in practice.

brettcannon commented 7 years ago

The trick with this is the provenance of the code is a mess to track. In the case of python/cpython#30181 it's gotten complicated as @brianmay was nice enough to make sure that Ben Finney got attribution, but then Ben hasn't tied his GitHub account to bugs.python.org and so the CLA bot is saying the code is covered by the CLA (if someone inspects the email then it might match up, but then again that is easily spooked).

But what if Brian had not given Ben attribution? Then would Brian be assuming legal responsibility for the code? But if Brian says in the PR "this code is from Ben Finney" can we take his word for it, check Ben's status with the CLA, and use that as our legal standing? Is that any different then now when people take an old patch, touch it up, and say "here is an updated patch"?

IANAL and thus I'm not comfortable trying to answer any of these questions authoritatively. Perhaps it's safest to simply say that if a patch is uploaded to bugs.python.org then it needs to stay there and we ask people to not open a PR on behalf of someone else who has disappeared or refuses to use GH? In that instance the most useful thing someone could do is review the code and run the test suite to make sure the patch still works.

If @VanL wants to weigh in on this he can (but I also don't know if Van notices GitHub notifications so someone might need to explicitly email him if they want an answer).

terryjreedy commented 7 years ago

The 'confusion' urls are https://bugs.python.org/issue30181 and
https://github.com/python/cpython/pull/1505. The unique messiness is due to someone both 'rejecting' git and using git instead of the tracker to submit PR changes.

Attributions have always been messy because a) commits to the repository typically have multiple authors and b) only a few authors can commit, and c) as far as SVN and Hg were concerned, all lines were credited to the committer (as on an annotated file display). Any disentanglement of a commit depended on manual review of patches attached to an issue, aided by a commit message attribution.

With Git, a) and b) still apply; I am not sure about c). The big difference is squash merging PRs that may have individually attributed comments and commits, rather than single commits with less visible authorship.

I don't understand patches 'staying on bpo'. Once uploaded, they are contributed and we may do as we want, and we now require contributions to go through a PR. Some things we might request or require for transferred patches are:

  1. A standard message header: "bpo-NNNNN patch 'title' from name(username)." The Knight could parse that and check bpo to see if username has signed the CLA.
  2. Avoid mixing authorship by initially committing the unaltered bpo patch. If the commit message were the header above, could the Knight 'see' it?
  3. Limit transfers to committers. The rationale would be that committers are a) expected to be pushing other peoples work (sometimes), and b) more likely to follow picky rules ;-).
brettcannon commented 7 years ago

My "staying on bpo" comment is the idea that only core developers convert patches on bugs.python.org into PRs so we can more clearly track where the code originated from as I inherently trust core devs to know where the code they commit came from versus someone else who has not signed the CLA (this is obviously different than the specific case we're talking about at this point).

terryjreedy commented 7 years ago

So on point 3, I was agreeing without knowing it. The impetus for my comments was https://bugs.python.org/issue21261, with uploaded issue21261.patch by Eduardo Seabra, and https://github.com/python/cpython/pull/1511, by Louie Lu. The initial commit in the PR, de42b33, is a lightly edited version of Eduardo's patch with no credit to the latter. Since it fails all three of my points, I may close it and start over.

Since transferring patches to PRs was discussed on core-mentorship, with instructions for using patch.exe, perhaps you should post a request that non-coredev contributors stop doing this.

louisom commented 7 years ago

So if there have an attached patch on the issue, what kind of edit from this patch should be considered as lightly edit or not lightly edit?

For hypothesis, if someone didn't open or look at the patch on b.p.o, and then sent PR to GitHub which has only slightly different between patch on b.p.o, should it consider as a shipment of the original patch to GitHub?

Mariatta commented 7 years ago

Thanks everyone. Sorry if this question becomes more complicated. I'm fine with mentioning that GitHub PR is strongly preferred in the docs for now.

It seems like we need to clarify and document the workflow of handling b.p.o patches that are not our own, but it's a separate issue from this. I'll ask at core-workflow for further clarification of exactly what the workflow is, and then we can write up the guideline.

Thanks :)

terryjreedy commented 7 years ago

@Mariatta We find complications by experience ;-). We definitely need another issue.

@lulouie Any edit is at least a light edit, and I believe that the initial change to a fresh new branch should be the application of the downloaded patch, followed immediately by a commit with credit in the commit message (see my previous message). There should be no difference in the code, only in the headers. Looking at the patch should be irrelevant. Let's continue this elsewhere.

brettcannon commented 7 years ago

I talked with our lawyer and he said that as long as the long submitter to GitHub has signed the CLA and we verify at least on bugs.python.org that the original creator of the patch signed the CLA then we're fine (we don't have to do any crazy code checking, just make sure everyone has signed the CLA who has participated and may have written code).

Mariatta commented 7 years ago

Thanks for looking into this @brettcannon

So in conclusion, these are what need to be documented:

brettcannon commented 7 years ago

@Mariatta yep, that sounds right! People should also mention in the PR that the patch is from someone else so we can quickly check in b.p.o that the original patch creator has signed the CLA.

Mariatta commented 7 years ago

I don't want this to linger on forever so I will work on this later today.