idris-lang / Idris2

A purely functional programming language with first class types
https://idris-lang.org/

[ RFC ] Process for moving modules out of the `contrib` package. #2866

Open mattpolzin opened 1 year ago

mattpolzin commented 1 year ago

UPDATES (not part of proposal text):

  1. 2023-02-25 - The license requirement description for third party derivative work was updated based on RFC comments.

Summary / Motivation

A while back, it was decided that the contrib package should be disbanded (via general consensus, conversation at an Idris Developer Meeting, and, perhaps at least as importantly, by Edwin). I won't get into the details of why in this document, but suffice it to say that contrib represents a mix of modules that are (a) important enough to eventually land in base or (b) niche enough to not be worth maintaining as part of the core project. Falling into the second category does not make a module less useful (qualitatively), but it does mean the module is needed less often and possibly has fewer core project contributors who are knowledgeable enough or motivated enough to maintain it.

Therefore, we can begin the process of moving modules out of contrib at will.

The rest of this proposal represents my personal thoughts and intuitions on a process for moving things out of contrib and I would love feedback from anyone interested in shaping that process.

The following wiki page will be a final repository for this information once some semblance of agreement has been reached: [TBD].

The proposal

Criteria & Destination

First, we know that all of the contrib package is intended to be disbanded. Some of it may end up in the base library, other parts of it may end up in packages hosted by the community. @eayus put a lot of work into mapping out proposed destinations (base vs. third party) in the wiki. Although much of that document is not worded very strongly (i.e. "I think..." rather than "it is decided..."), it is the closest thing we have to consensus on destinations for modules at this point in time. The document was aired in Discord and as part of an Idris Developer Meeting -- I don't know personally if it was posted to the mailing list, but I hope it was. At any rate, I propose that this document be a strong indication of preference given the time others have had to voice opinions. I also think it is important that any contributor feels welcome to argue the opposite of what the document currently indicates in public channels (GitHub Issues, Discord, Mailing List) -- the document is not set in stone.

The document leaves some things more up in the air than others, and I propose that anyone who wants to relocate one or more modules but feels uncertain at all about where to put those modules or how to structure a new package to contain those modules should feel free and encouraged to engage the community in further strategizing.

TL;DR: If you agree with the wiki about a module you would like to relocate, proceed with your own interpretation of the best relocation and feel free and encouraged to engage the community in further strategizing via GitHub Issues, Discord, or the Mailing List.

Process

For all contrib module extractions, it is highly recommended that you open a GitHub ticket on the Idris2 compiler project explaining what you are about to do; this adds visibility into your efforts and reduces the likelihood that more than one person is working on the same thing!

base Relocation Process

I propose that for modules being moved into the base package, a single Pull Request implements the following:

  1. The PR description includes a reference to the wiki page mentioned above and explains any further reasoning about the strategy taken for relocating the module.
  2. Code for the module(s) is removed from contrib.
  3. References to the module(s) are removed from the contrib .ipkg file.
  4. Code for the module(s) is added to base. The code is either nearly identical to the code from contrib or the PR description explains the reasoning behind differences. Differences with good motivation are perfectly within bounds.
  5. References to the module(s) are added to the base .ipkg file.
  6. A CHANGELOG entry explains the relocation.

Example: Data.List.HasLength.
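
As a rough illustration of steps 3 and 5 above (not part of the proposal itself), the bookkeeping on the two `.ipkg` files amounts to removing a name from one module list and adding it to another. The sketch below assumes the `modules` field is a plain comma-separated list; the helper names `parse_modules`, `render_modules`, and `relocate` are hypothetical and do not exist in any Idris tooling.

```python
from __future__ import annotations


def parse_modules(field_value: str) -> list[str]:
    """Split the value of a comma-separated `modules` field into module names."""
    return [name.strip() for name in field_value.split(",") if name.strip()]


def render_modules(modules: list[str]) -> str:
    """Render module names back into a `modules = ...` field, one per line."""
    prefix = "modules = "
    return prefix + (",\n" + " " * len(prefix)).join(modules)


def relocate(
    module: str, source: list[str], target: list[str]
) -> tuple[list[str], list[str]]:
    """Remove `module` from `source` and add it to `target` (kept sorted)."""
    if module not in source:
        raise ValueError(f"{module} is not listed in the source package")
    new_source = [name for name in source if name != module]
    new_target = sorted(set(target) | {module})
    return new_source, new_target
```

Calling `relocate("Data.List.HasLength", contrib_modules, base_modules)` returns the two updated lists, which `render_modules` can turn back into field text. Moving the source file itself and writing the CHANGELOG entry remain manual steps.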

Third-Party Relocation Process

First of all, I propose it be required that the relevant module(s) is/are ported to a new repository hosted under the GitHub organization https://github.com/idris-community. This requires reaching out to one of the organization maintainers to create the repository, but you (the person writing the new package) will be given administrative permission over the repository you are creating. Repositories can also be moved to the community group after creation. This allows the community to pick new maintainers for these projects over time even if the original author moves on without time or foresight to find a new maintainer themselves.

It is important to also say that anyone is free to create their own repositories that reproduce portions of the contrib package under their own control and with that there are no rules guiding where the repository is located; this process only applies when a person chooses to submit a PR that removes the relevant module(s) from the contrib package.

I propose that for modules being moved into third-party packages, the following process is followed:

  1. Relevant module(s) is/are ported to the new package prior to opening a PR to remove them from contrib.
  2. New packages must have at least a rudimentary README file and an .ipkg file.
  3. New packages must retain the license text found in the compiler project LICENSE file. The third party package may be offered under a reasonably permissive license (I propose that MIT, BSD-3, and Apache-2.0 all be acceptable) by specifying the new license text in addition to the aforementioned license. Essentially, the derivative work that is the third party package can be offered under a new license but must still include the upstream project license text as required by the Idris 2 LICENSE file.
  4. New packages must be listed on the Idris wiki Third Party Libraries for discoverability.
  5. It is strongly encouraged that new packages support one or more of the community package managers (pack and sirdi have both been actively developed recently; however, to the best of my knowledge, pack is currently getting more use and active maintenance).
  6. It is requested that the person porting modules make an effort to mention the original authors of any ported code in a code comment or the README.
  7. Given the above, a new PR can be opened against the Idris2 repository that removes the relevant module(s) from the contrib package. This includes both the module(s) source code and any references from the contrib.ipkg file.
  8. This PR must have a description that includes a reference to the wiki page mentioned above and explains any further reasoning about the strategy taken for relocating the module.
  9. A CHANGELOG entry explains the relocation.
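
Steps 2 and 3 above are mechanical enough to check before opening the removal PR. The following is only a sketch of such a check, assuming the new package keeps its license in a file named `LICENSE` at the repository root and that the upstream license text is passed in as a string; the function name `check_new_package` is made up for illustration.

```python
from __future__ import annotations

from pathlib import Path


def check_new_package(package_dir: Path, upstream_license: str) -> list[str]:
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []

    # Step 2: a rudimentary README and an .ipkg file must exist.
    names = [entry.name for entry in package_dir.iterdir()]
    if not any(name.lower().startswith("readme") for name in names):
        problems.append("missing README")
    if not any(name.endswith(".ipkg") for name in names):
        problems.append("missing .ipkg file")

    # Step 3: the upstream license text must be retained verbatim
    # (assumption: it lives in a root-level file called LICENSE).
    license_file = package_dir / "LICENSE"
    if not license_file.is_file():
        problems.append("missing LICENSE")
    elif upstream_license.strip() not in license_file.read_text(encoding="utf-8"):
        problems.append("LICENSE does not retain the upstream license text")

    return problems
```

A check like this only confirms that the upstream text is present; whether any additional license is acceptable is still a judgment for reviewers.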

Acquiring an idris-community GitHub Repository

If you do not have the ability to create repositories under the idris-community organization, reach out to Matt Polzin via Discord (matt.polzin#2289) or email (matt.polzin@gmail.com) to request membership in the community.

stefan-hoeck commented 1 year ago

Thanks a lot for your suggestions @mattpolzin . Just a small note from my side: Every time we drop stuff from contrib we potentially break a lot of packages. I therefore suggest that we check the pack collection for the impact these changes will have and try to inform package maintainers beforehand. I can try and help with this.
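
A minimal sketch of what such an impact check could look like, assuming each package from the collection has been checked out locally and only plain `.idr` sources are scanned (literate files and other formats are ignored). The names `imported_modules` and `affected_files` are hypothetical helpers, not part of pack.

```python
from __future__ import annotations

import re
from pathlib import Path

# Matches `import Foo.Bar` and `import public Foo.Bar` at the start of a line.
IMPORT_RE = re.compile(
    r"^[ \t]*import[ \t]+(?:public[ \t]+)?([A-Z][\w.]*)", re.MULTILINE
)


def imported_modules(source: str) -> set[str]:
    """Collect the module names imported by one Idris source file."""
    return set(IMPORT_RE.findall(source))


def affected_files(package_dir: Path, removed: set[str]) -> dict[str, set[str]]:
    """Map each source file in a package to the removed modules it imports."""
    hits = {}
    for path in sorted(package_dir.rglob("*.idr")):
        used = imported_modules(path.read_text(encoding="utf-8")) & removed
        if used:
            hits[str(path.relative_to(package_dir))] = used
    return hits
```

Running `affected_files` over every package with the set of modules about to leave contrib would give a list of maintainers to notify. It is a textual scan only, so it cannot tell whether a package would still build.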

mattpolzin commented 1 year ago

@stefan-hoeck yeah, you're right that these changes are very likely to break packages. Do you suggest that something new be slotted into the process or are you aiming for an informal awareness that packages included in pack should be considered and their authors given notice?

I know it was previously decided that we shouldn't run pack in Idris2 pull requests, but I wonder if we should consider an optional workflow that doesn't run generally but can be run by request (maybe a label could be added to the contrib extraction PRs that prompted pack to run against that branch).

stefan-hoeck commented 1 year ago

> @stefan-hoeck yeah, you're right that these changes are very likely to break packages. Do you suggest that something new be slotted into the process or are you aiming for an informal awareness that packages included in pack should be considered and their authors given notice?

I was aiming for informal awareness and hinting at the possibility that we/I could at least test the pack collection manually in case of deletions from contrib that don't go to base. On the other hand, I still think it would be useful to run certain PRs against the pack collection in general. Perhaps something triggered by a label as you suggest? If I understand correctly, labels can only be set by the project maintainers, right? At least, I never figured out how I could set labels for my own PRs myself.

mattpolzin commented 1 year ago

> If I understand correctly, labels can only be set by the project maintainers, right?

That's my understanding as well. I've always found it confusing, but I suppose there is some logic in protecting the labeling system -- maybe all the more so if a label triggers something in CI.

falsifian commented 1 year ago
> 2. New packages must have at least a rudimentary README file, an `.ipkg` file, and a reasonably permissive license (I propose that MIT, BSD-3, and Apache-2.0 all be acceptable).

Isn't the code in contrib already copyrighted and available under the license in ./LICENSE? IANAL but I don't think you can just take code someone else wrote and claim it's now available under a different license.

I guess I'm proposing to remove the text about licenses above. It should be implicit that if you're copying the code you need to either keep the same license, or make it clear that your new choice of license only applies to new code added after the move.

> First of all, I propose it be required that the relevant module(s) is/are ported to a new repository hosted under the GitHub organization https://github.com/idris-community. This requires reaching out to one of the organization maintainers to create the repository, but you (the person writing the new package) will be given administrative permission over the repository you are creating. Repositories can also be moved to the community group after creation. This allows the community to pick new maintainers for these projects over time even if the original author moves on without time or foresight to find a new maintainer themselves.
>
> It is important to also say that anyone is free to create their own repositories that reproduce portions of the contrib package under their own control and with that there are no rules guiding where the repository is located; this process only applies when a person chooses to submit a PR that removes the relevant module(s) from the contrib package.

What's the purpose of this requirement?

An alternative would be for the person who proposes to maintain it to simply decide where they want it; e.g. they might put it with their other git repos.

mattpolzin commented 1 year ago

> Isn't the code in contrib already copyrighted and available under the license in ./LICENSE?

I think you’re right, that was an oversight. We should either require the same copyright or ask Edwin for permission to re-license the redistributed code. I’ll modify the above text to say it must use the same license (or include its text prior to a statement that new or heavily modified portions are under a different license, because that is allowed).

> First of all, I propose it be required that the relevant module(s) is/are ported to a new repository hosted under the GitHub organization https://github.com/idris-community.

> What's the purpose of this requirement?

> This allows the community to pick new maintainers for these projects over time even if the original author moves on without time or foresight to find a new maintainer themselves.

I’ve been privy to numerous projects where the original maintainer moves on; that should be ok, but I don’t want it to make it hard for the community to continue using and maintaining this work. Also, offering one’s time to help extract a portion of the contrib package should not imply that person is offering their time beyond extraction.

buzden commented 1 year ago

> We should either require the same copyright or ask Edwin for permission to re-license the redistributed code.

IANAL, but the BSD license allows arbitrary re-licensing as long as the original copyright notice is preserved (with no restrictions on adding new notices). So these options are not mutually exclusive: we can (and, actually, must) preserve the existing copyright, and we can add new lines to the notice mentioning the re-licenser, covering the years after the move from the original repository.

Surely, Edwin could give up his copyright and allow only the new copyright holder (say, the Idris community) to be mentioned, but why bother when new lines can simply be added, since the current license allows it?

By the way, the existing copyright notice is very old, I think it should be extended to the present year.

falsifian commented 1 year ago

> We should either require the same copyright or ask Edwin for permission to re-license the redistributed code.

> IANAL, but the BSD license allows arbitrary re-licensing as long as the original copyright notice is preserved (with no restrictions on adding new notices). So these options are not mutually exclusive: we can (and, actually, must) preserve the existing copyright, and we can add new lines to the notice mentioning the re-licenser, covering the years after the move from the original repository.

That sounds right, except I'm pretty sure you need to keep around not just the copyright line but the whole original license text:

1. Redistributions of source code must retain the above copyright
   notice, this list of conditions and the following disclaimer.

I don't think that's an obstacle to switching licenses, so long as the old license text is kept around to satisfy that condition. I don't know if the new license would apply to the old code or just the code added after the license changed.

I think the copyright holder(s) can agree to a new license if that helps. I don't know if it's just Edwin who holds the copyright, though; I see many contributors in the git log. Here's an example of a project maintainer asking all contributors to agree to a license change.

mattpolzin commented 1 year ago

@buzden @falsifian Please let me know if I have captured this (more correct) understanding of the license requirements:

> New packages must retain the license text found in the compiler project LICENSE file. The third party package may be offered under a reasonably permissive license (I propose that MIT, BSD-3, and Apache-2.0 all be acceptable) by specifying the new license text in addition to the aforementioned license. Essentially, the derivative work that is the third party package can be offered under a new license but must still include the upstream project license text as required by the Idris 2 LICENSE file.

I am pretty sure that although some or most of the code in the new repository may be the same as the code that originated in the Idris 2 project, the new repository itself is a derivative work (a new package, after all) and therefore there is no distinction needed between old code & new code within the new license as long as the Idris 2 license text is included.

falsifian commented 1 year ago

I have a small quibble, but at this point I'm far enough out of my depth that you shouldn't let it block you, and I have seen contradictory claims by others (presumably also not lawyers) on the Internet; see e.g. this post and follow-up comments which disagree. My quibble's only about how the new repo ought to describe the licensing situation, not about whether or not things can be distributed.

(BTW, my vague sense is that it's a bit rude when forking a project to change licences, but I don't really know if that's true, and in any case this isn't forking. Still, I would default to not changing licences, for simplicity.)

My understanding is that if I fork someone else's (BSD-licensed, for example) project under a different licence (GPL, for example), my own licence only applies to stuff I wrote after the fork. Say the original project had just one file, frobnicate.py, and after the fork I added a function to frobnicate.py and wrote a new file, defrobnicate.py, from scratch. I do not own the copyright to the old code in frobnicate.py, so I have no right to license it under the GPL. However, that's not a problem in practice: I can distribute the new source code, and I think the proper thing to do is explain that part of frobnicate.py is provided under the BSD licence, and the rest is provided under the GPL. As I understand it, it would be incorrect to say that the whole thing is available under the terms of the GPL.

Here is some possible replacement text:

> If the maintainer wishes to switch to a different license, the new license must be reasonably permissive; I propose that MIT, BSD-3, and Apache-2.0 all be acceptable. Note that the original compiler project [LICENSE](https://github.com/idris-lang/Idris2/blob/main/LICENSE) file must be included, since those terms apply to code written before the change of license.