Open aeiras opened 3 years ago
Hello @aeiras and @scottmarlow.
We found that issue 4-5 month ago when doing 9.1 changes to the tool. And added so all authors and co-authors (up to 10) for commits would be added to the list, even if their changes already are not there commits should be still in a tree. (these lines are where fix is)
If PR was properly merged/squashed those users will appear in a list because they will be in the authored list.
But this case is completely different issue, if you force remove commit from tree or do a soft-reset squash without adding authors for previous commits to footer as:
fix(JK-001): Squash commit title
Commit body
Co-Authored-By: John Doe <a@example.com>
Co-Authored-By: Monty Python <b@example.com>
It will not matter, it's an issue of removing commit history at that point. Good indication that history was tampered with is, PR merge commit should be in the commit list, if you remove some file it will just be another commit. If you revert PR merge it should be also another commit...
The only way I see this could happen is if it was force pushed with overwriting old commit history.
That's why master/main branches have force push protection by default, and it should be on any branches used for stable development.
Conclusion: Tool for committers is pulling all PR and commit authors (with commit co-authors). The only issue is if commit is improperly manually squashed (without co-author contribution), or commit history was forcibly rewritten, but it's more a question of development best practices and contribution, it is hard to fix through just having raw data. (there is no even indication it even existed in that branch)
I wrote a small test for this case with the same graphql code as we use in commit getting bash script. (you can test it on https://docs.github.com/en/graphql/overview/explorer) No @cesarhernandezgt there, but I also don't see that PR merge commit even squashed (or revert of it) in commit list...
query {
repository(name:"jakartaee-tck" owner:"scottmarlow") {
ref(qualifiedName: "tckrefactor_marlow") {
target {
... on Commit {
history(first: 100, since: "2021-08-23T00:00:00+00:00") {
pageInfo {
hasNextPage
endCursor
}
edges {
node {
commitUrl
authors(first:10) {
nodes {
... on GitActor {
user {
login,
name
}
}
}
}
}
}
}
}
}
}
}
}
P.S. PR with multi-author commits fix for a bash tool is https://github.com/jakartaee/jakarta.ee/pull/1194
P.P.S. another solution if it's pair programming or commit created by a group, you need to mention it in footer of commit. One co-author per line
fix(JK-001): Squash commit title
Commit body
Co-Authored-By: John Doe <a@example.com>
Co-Authored-By: Monty Python <b@example.com>
Then github will add co-author to that commit authors, with PR merge squash is even easier all commits authors are added to squash commit authors. (both github and gitlab support co-authored commits)
Thanks for the feedback!
https://github.com/scottmarlow/jakartaee-tck/commit/5366823f416fba58fc6995a46fb67e1ae80fd722 is a squashed change on my topic branch. It currently contains:
…ibm/jbatch/tck/pom.xml parent relative path fix from Cesar Hernandez, move tests to Maven folder structure, add toMaven.sh, add ant, ${project.groupId} correction, move lib package sources to runtime, eliminate cycles in runtime, move GlassFish porting kit implementation to folder that doesn't get built (same for some Persistence files), move com.sun.ts.lib.util source to libutil module
Signed-off-by: Scott Marlow <smarlow@redhat.com>
It sounds like the comment should be updated to include:
Co-Authored by: Cesar Hernandez <chernandez@tomitribe.com>
Note that Cesar's modification was deleted in a subsequent change that deleted the modification. Was it wrong of me to just squash his contribution away? I thought it was okay since I mentioned him in the comment but now I know how to correct the comment to ensure he gets credit.
I don't squash much anymore unless it is to clean up a very messy topic branch like it was in this case. I moved several files to multiple locations to try different things and feel that the unsquashed branch was very difficult to read due to that.
I'm not sure why but I get failures when I add Co-Authored by: Cesar Hernandez <chernandez@tomitribe.com>
during a rebase.
Must be related to E363: pattern uses more memory than 'maxmempattern'
failure I was ignoring, will figure that out.
All set now I think (although not sure why I had to force the push that just had a commit log message change).
So if you want to replace commit it is used with force push, better practice would be feature branches and PRs you can merge into branch and squash. If you work with PRs you don't need to do manual squashes (just do PR squash on merge).
One branch should be prod/main where you merge PRs work as features are finalized. Second dev/stage branch where you test fixes/features iterating over smaller chunks. And for each fix/feature branches so you create PRs to dev branch as needed.
Squashing is useful when you do PRs between branches then it combines all data and you will can easily revert one feature or another via PR revert. But whenever you merge or revert something you should never force push if you want previous commit data to be preserved. Even when something gets merged/reverted both commits ideally stay in commit tree.
Also I reviewed your commit it should be Co-Authored-By:
no spaces in tag and no blank line between it and Signed-off-by:
<type>(<scope>): <subject>
<BLANK LINE>
<body>
<BLANK LINE>
<footer>
Here a bit of info on commit structure, Ideally your commit message should be smth like:
feat: squash of multiple fixes
List of fixes:
- Issue 51
- turning TCK into a multi-dependency maven project
- src/com/ibm/jbatch/tck/pom.xml parent relative path fix from Cesar Hernandez
- move tests to Maven folder structure
- add toMaven.sh
- add ant
- ${project.groupId} correction
- move lib package sources to runtime
- eliminate cycles in runtime
- move GlassFish porting kit implementation to folder that doesn't get built (same for some Persistence files)
- move com.sun.ts.lib.util source to libutil module
Co-Authored-By: Cesar Hernandez <chernandez@tomitribe.com>
Signed-off-by: Scott Marlow <smarlow@redhat.com>
P.S. change as you see fit:
first line should be a conclusion title of what commit includes, ideally less than 50 characters
<blank line>
then commit body, providing all the data you want people to read
<blank line>
footer data providing signatures or additional automation data like Co-Authored-By and Signed-off-by, line by line no blank lines between
Finally when you do PR merge just do squash through UI, it should not have issues with author or any data preservation:
+1 on the format rules above + https://gist.github.com/develar/273e2eb938792cf5f86451fbac2bcd51
+1 on the branching ideas, we are mostly using master (still discussing a new name to be applied at end of Jakarta EE 10 release) branch for development of the next Jakarta EE release. We also started a refactoring
development branch for refactoring our project to be easier to understand in the future (perhaps will finish in time for Jakarta EE 11).
I pushed an update that shows:
commit 993e5fc4e8ad152509f67b23ab50ea204dd5ce0e (HEAD -> tckrefactor_marlow, origin/tckrefactor_marlow)
Author: Scott Marlow <smarlow@redhat.com>
Date: Mon Aug 23 13:23:53 2021 -0400
Issue 51: Turn TCK into a multi-dependency maven project
Co-Authored-By: Cesar Hernandez <chernandez@tomitribe.com>
Signed-off-by: Scott Marlow <smarlow@redhat.com>
I suppose the above doesn't really show how much each co-author did but sounds like the best way.
Perfect now even from UI you have all the authors of that commit:
There are lots of ways of do things, but if you do a squash it's really hard to provide more info on code separation, if you have a PR and you merged it with squash you could mention it in commit body, then people will see who have done what work because PR will still have all commits, but even if it's small change he still provided some input and any contribution should be attributed. You are still main author.
So ideally you do all commits you want separately on your branch and after you finish create single PR which you will squash onto final branch. And then you can use that final branch as you wish.
P.S. I've also tested graphql test code from 8 hours ago, it now also returns both contributors. So all is well, tool works, and now that commit has all the needed data.
During the Sept 8th TCK call, we discussed squash vs not squash and its impact on the Jakarta Releases Contributors' cards history that populates the release page cards.
Thanks to @scottmarlow, we know that TCK items might be impacted by the current clean up. [1] https://github.com/scottmarlow/jakartaee-tck/pull/3 [2] https://github.com/scottmarlow/jakartaee-tck/tree/tckrefactor_marlow_beforesquash [3] https://github.com/scottmarlow/jakartaee-tck/tree/tckrefactor_marlow
Andrii, the Triber behind the development of the contributor's tool, found out the follow Conclusion: Squash is not optimal to commit history with reference to the stats used for the tool.
Quoting Andrii: "the Tool for committers is pulling all PR and commit authors with co-authors. The only issue is if commit is improperly manually squashed (without co-author contribution), or commit history was forcibly rewritten, but it's more a question of development best practices and contribution, it is hard to fix through just having raw data. (there is no even indication it even existed in that branch)"
@Dexmaster, you are most welcomed to expand your findings and Q&A as needed.
Scott, questions, shoot!
Thanks for enabling this discussion :)