jaraco / skeleton

A generic project skeleton for Python projects.
http://blog.jaraco.com/skeleton/
MIT License

The attribution problem solution #102

Open webknjaz opened 9 months ago

webknjaz commented 9 months ago

https://blog.jaraco.com/skeleton/#history-is-forever suggests that the attribution problem is unsolvable, but that's only partially true. While it's not possible to preserve the authorship of each line when squashing, it is possible to retrieve the original authors beforehand and list them via Co-Authored-By trailers. This attributes that one large commit to everyone who contributed to it. It's also possible to use tags instead of branches for archiving (but that's not what I'm going to talk about here).

Several years ago we were migrating a bunch of content out of ansible-core into separate repos. We made a script to rewrite stuff and make those repos according to scenarios. I was also concerned about attribution and researched retrieving the authors. My patch didn't go in, because it was last-minute. But the PR is still around and could be helpful to others facing a similar problem: https://github.com/ansible-community/collection_migration/pull/497. Later on, I also suggested this solution to a Sphinx extension: https://github.com/sphinx-contrib/spelling/issues/60#issuecomment-665287993.

TL;DR: here's how one can extract such a list:

$ git log '--pretty=format:%(trailers:key=Co-Authored-By,separator=%x0A)%x0A%an <%ae>%x0A%cn <%ce>' | sort | uniq

You can then stick everything into the commit message and have GitHub link the commit to all those people.
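For anyone who wants the list in a ready-to-paste form, here is a minimal Python sketch of that last step. It assumes the same format string as the command above, and that git prints matched trailers together with their key (so existing `Co-Authored-By: ...` lines need the key stripped before de-duplication). The names `to_trailers` and `collect_trailers` are made up for illustration and are not part of the skeleton or of the linked PRs.

```python
import subprocess

# Same pretty-format as the shell one-liner: existing co-author trailers,
# then the author identity, then the committer identity, one per line.
LOG_FORMAT = "%(trailers:key=Co-Authored-By,separator=%x0A)%x0A%an <%ae>%x0A%cn <%ce>"
TRAILER_KEY = "co-authored-by:"


def to_trailers(log_output: str) -> list[str]:
    """Turn raw `git log` output into sorted, de-duplicated trailer lines."""
    identities = set()
    for line in log_output.splitlines():
        line = line.strip()
        # Assumption: trailers are printed with their key; drop it
        # case-insensitively so "Name <email>" is all that remains.
        if line.lower().startswith(TRAILER_KEY):
            line = line[len(TRAILER_KEY):].strip()
        if line:  # skip the blank lines left by commits without trailers
            identities.add(line)
    return [f"Co-authored-by: {identity}" for identity in sorted(identities)]


def collect_trailers(repo: str = ".") -> list[str]:
    """Run `git log` in `repo` (hypothetical helper) and format the result."""
    result = subprocess.run(
        ["git", "-C", repo, "log", f"--pretty=format:{LOG_FORMAT}"],
        check=True,
        capture_output=True,
        text=True,
    )
    return to_trailers(result.stdout)
```

Calling `print("\n".join(collect_trailers()))` inside a checkout should print one trailer per contributor, which can go at the end of the squash commit message after a blank line. Committer identities are included on purpose, mirroring the one-liner, so automated committers may show up and might need to be filtered out by hand.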