jw3 / example-daffodil-vscode

A VS Code extension for DFDL with Daffodil
Apache License 2.0

Preserving authorship during move #77

Closed by jw3 2 years ago

jw3 commented 3 years ago

We need to consider preserving authorship as much as possible on the move to the Apache repo.

There is going to be an 'initial import' type code transfer from here to there, so structuring that initial import as multiple commits to the Apache repo is probably the way to go. Some initial commits that can be attributed individually might look like

  1. Initial structure of the extension
  2. Initial import(s) of various pieces of the extension
  3. Initial import of the dapodil backend

Authorship on 1, 2 is blended and can use co-authors and/or have multiple commits for each. On 3 it can be a single author.
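As a rough illustration of the co-author idea, blended authorship can be recorded with `Co-authored-by:` trailers at the end of a commit message, a convention that GitHub recognizes. The sketch below only builds the message text; the function name, author names, and email addresses are hypothetical placeholders, not anything from this repo.

```python
def build_commit_message(subject, body="", co_authors=()):
    """Build a commit message ending in Co-authored-by trailers.

    co_authors is an iterable of (name, email) pairs. All names and
    addresses used with this helper here are hypothetical placeholders.
    """
    parts = [subject]
    if body:
        parts.append(body)
    # Trailers go in one block at the very end, separated from the
    # rest of the message by a blank line.
    trailers = "\n".join(
        f"Co-authored-by: {name} <{email}>" for name, email in co_authors
    )
    if trailers:
        parts.append(trailers)
    return "\n\n".join(parts) + "\n"


if __name__ == "__main__":
    print(build_commit_message(
        "Initial structure of the extension",
        co_authors=[
            ("Author One", "one@example.invalid"),
            ("Author Two", "two@example.invalid"),
        ],
    ))
```

The resulting text could be passed to `git commit -F <file>`, with the primary author set through the usual `--author` option.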

It is important to think this through and retain authorship, or at least a rough approximation, when moving repos and flattening all the work history from the example repos to just a few commits.

mbeckerle commented 3 years ago

When daffodil became an Apache incubator project, the entire git history was preserved and brought over to its Apache repository intact.

I believe this will be a requirement actually. The intellectual property material (license banners, etc.) should be inspected as part of the process of accepting/converting the software to ASF, and this is best done by starting from the repo "as is", i.e., NOT yet overwriting all the file headers with Apache banners.

I am going to start an email thread on private and subsequently dev lists to discuss the cutover requirements for this code base.

jw3 commented 3 years ago

> the entire git history was preserved and brought over to its Apache repository intact

> I believe this will be a requirement actually.

Hmm, interesting thought.

So the history of this repo is that it is a detached fork of https://github.com/microsoft/vscode-mock-debug.git. That MS repo is MIT licensed and provided as an example of how a VS Code extension could be structured.

We forked and began filling in the blanks in this repo to form the Daffodil extension, and while doing so trimmed off some of the example code and other files that came with the example. Shortly after, I regretted not having just sketched a project from scratch, but we were moving along by that point and any change would have been too disruptive.

The solution envisioned was to sketch a new project at the Apache repo and transfer our code from this example repo, eventually archiving this project that contained the sample structure.

I think moving the entire repo is nice for this team because it exactly preserves the contributions. The primary question I have is whether there would be any licensing issue with us taking the commits that came before us in the example and moving them into the Apache repo (adding roughly ten previous contributors).

Or is there some way to reference this project as the origin of the Apache project, thereby linking the original MIT licensed project? Or linking it directly as well?

mbeckerle commented 3 years ago

Probably we need to see what other VSCode extensions in Apache projects have done. For example, https://cwiki.apache.org/confluence/display/NETBEANS/Apache+NetBeans+Extension+for+Visual+Studio+Code is one that is available from the VSCode marketplace. There are almost certainly others too.

mbeckerle commented 3 years ago

So commit 793e7e7e2cb4a303d6be865922dfb80afbca46bc is important. That's the last commit by isidor, a Microsoft contributor. Presumably this was the state of the repo when it was created as a fork?

In theory that commit and all prior commits could be squashed together leaving a history of only the new contributions which would be rebased on top of the squashed Microsoft material. This would change the commit IDs, however, so we need to record 793e7e7e2cb4a303d6be865922dfb80afbca46bc in the commit comments as the point where we forked.

There seem to be exactly these contributors since that point: J Wass, S Dell, A Rosien, R Thomas, and J Mangus.

By my count, even if you took all consecutive commits by the same author and squashed them, you'd still end up with 63 commits since the fork was created. That's far too much work to consider recreating this on the ASF repo as 63 pull requests.
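For reference, the counting described above (collapsing each run of consecutive commits by one author into a single squashed commit) can be sketched as follows. This is only an illustration under assumptions: the input is assumed to be author names in history order, for instance as produced by something like `git log --reverse --format=%an <fork-commit>..HEAD`, and the sample names below are made up rather than taken from this repo's log.

```python
from itertools import groupby


def squash_runs(authors):
    """Collapse consecutive commits by the same author.

    Takes author names in commit order and returns a list of
    (author, number_of_commits_in_run) pairs, one per squashed commit.
    """
    return [(author, len(list(run))) for author, run in groupby(authors)]


def count_squashed_commits(authors):
    """Number of commits left after squashing each same-author run."""
    return len(squash_runs(authors))


if __name__ == "__main__":
    # Hypothetical history, oldest first.
    history = ["A", "A", "B", "A", "C", "C", "C"]
    print(squash_runs(history))             # [('A', 2), ('B', 1), ('A', 1), ('C', 3)]
    print(count_squashed_commits(history))  # 4
```

Note that `groupby` only merges adjacent equal entries, so an author who returns after someone else's commit starts a new run. That is why the count stays high when contributors interleave.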

So this will have to be done as a software grant.

The good news is that means all the existing commits can be preserved as is. The one possible exception is squashing all the pre-fork Microsoft material together, if we decide that's desirable.

mbeckerle commented 3 years ago

Another Apache project using vscode is: https://github.com/apache/openwhisk-vscode-extension

jw3 commented 3 years ago

> In theory that commit and all prior commits could be squashed together

So keep that commit as the original source, but write it with one of us as the author and with the current date?

> leaving a history of only the new contributions which would be rebased on top of the squashed Microsoft material

Yes.

> That's far too much work to consider recreating this on the ASF repo as 63 pull requests.

This issue was describing more of a summarized approach where we approximated our authorship as best we could in a few PRs. But I am liking the sound of your approach more.

> The good news is that means all the existing commits can be preserved as is.

IDs will be rewritten, but otherwise the content is good.
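To illustrate why the IDs get rewritten: in a SHA-1 Git repository, a commit ID is the hash of the commit object, and that object includes the parent commit's ID. Rebasing onto a squashed base therefore changes every descendant ID even when trees, authors, and messages are untouched. The sketch below reproduces that hashing scheme; the tree ID, identity strings, and the all-zero "squashed" parent are placeholder values chosen for the example.

```python
import hashlib


def commit_id(tree, parents, author, committer, message):
    """Compute a Git commit ID (SHA-1 object format) from its fields."""
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]
    lines += [f"author {author}", f"committer {committer}", "", message]
    body = "\n".join(lines).encode("utf-8")
    # Git hashes "<type> <size>\0" followed by the object content.
    header = f"commit {len(body)}".encode("ascii") + b"\x00"
    return hashlib.sha1(header + body).hexdigest()


if __name__ == "__main__":
    # Placeholder values, for illustration only.
    tree = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"
    who = "Example Author <author@example.invalid> 1600000000 +0000"
    msg = "First post-fork commit\n"

    original_parent = "793e7e7e2cb4a303d6be865922dfb80afbca46bc"
    squashed_parent = "0" * 40  # stand-in for the ID of a new squashed commit

    before = commit_id(tree, [original_parent], who, who, msg)
    after = commit_id(tree, [squashed_parent], who, who, msg)
    print(before != after)  # True: same content, different parent, different ID
```

Because each rewritten ID becomes the parent of the next commit, the change propagates through the rest of the history.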

So with all that said, we will end up with a starting history in the vs-code repo that looks like:

  1. Steve's initial commit
  2. Squashed MS commit
  3. ... Our approx 63 commits

Did I interpret that right?

mbeckerle commented 3 years ago

Roughly yeah. It may not be necessary to squash anything actually.

I'm still trying to figure out the review process for this contribution. Don't do anything yet. We'll take it up next week.