Center-for-Research-Libraries / crl-serials-validator

Validate bibliographic and holdings data for shared print.
GNU General Public License v3.0

Verify license compatibility with dependencies #17

Closed ryan-jacobs closed 2 years ago

ryan-jacobs commented 2 years ago

This relates to #5, but given that that issue is considering larger licensing best practices for CRL, it's best to address some project-specific considerations in a separate issue.

A strict interpretation of the GPL indicates that if we use any PyPI (pip) based dependencies that are GPL (but not LGPL), we need to be compatible with the GPL. The legal considerations around this are not clear, and it may be safe to argue that because we are not distributing these dependencies (they are just references in requirements.txt, not included source code) we don't have to worry about this. However, the spirit of the GPL, and the Free Software Foundation, seem to say otherwise.

At least one dependency, Unidecode (https://pypi.org/project/Unidecode/), is GPLv2. We should check if there are others or any other kinds of incompatibilities.
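One way to run that check is to read the license metadata of whatever is installed in the project's environment. The following is only a rough sketch: the function names `is_strong_gpl` and `gpl_distributions` are made up for illustration, the string matching is a heuristic, and package metadata is self-reported, so anything it flags (or misses) would still need a manual look.

```python
from importlib.metadata import distributions


def is_strong_gpl(text):
    """Heuristic: True if a license string names the GPL but not the LGPL."""
    lowered = text.lower()
    # LGPL is explicitly out of scope for this check.
    if "lgpl" in lowered or "lesser" in lowered:
        return False
    return "gpl" in lowered or "general public license" in lowered


def gpl_distributions():
    """Yield (name, matching strings) for installed packages that look GPL."""
    for dist in distributions():
        meta = dist.metadata
        if meta is None:  # broken install without metadata; skip it
            continue
        # Licenses may be declared in a free-text field or in classifiers.
        candidates = [
            meta.get("License") or "",
            meta.get("License-Expression") or "",
        ]
        candidates += meta.get_all("Classifier") or []
        hits = [c for c in candidates if is_strong_gpl(c)]
        if hits:
            yield meta.get("Name") or "(unknown)", hits
```

Running `for name, hits in gpl_distributions(): print(name, hits)` inside the virtual environment built from requirements.txt would list the candidates to review by hand.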

This may also mean that this project must be GPL independent of any outcomes of the larger CRL licensing strategies. We need to verify this.

nflorin commented 2 years ago

I hadn't considered this possibility. I did a fair amount of reading around on it, which meant looking at a lot of generally unsatisfactory posts on Stack Exchange.

The FSF believes that dynamically linking to GPL code creates a "derived work", meaning that the code that does the linking is also subject to the GPL.

However, "dynamic linking" comes from C and isn't a term used in the Python world. In our specific example, we are not distributing unidecode and are interacting with it only through its API. If our code has to be GPL because unidecode is, then wouldn't our code also have to be GPL if we interacted over the web with a remote API with a GPL license?

The Python community seems to be split on this, with somewhat more people coming down on the side of the GPL applying to Python libraries. I should also note that there apparently haven't been any legal cases on the subject, so who knows what the actual truth of the matter is.

nflorin commented 2 years ago

One other library we import (fuzzywuzzy) is GPLv2. The others are a mix of 2-clause BSD, Apache, and MIT.

Some of these dependencies themselves have dependencies. fuzzywuzzy depends on python-Levenshtein, which is GPLv2. requests has something like 8 dependencies that are outside of the standard library, and a couple of these themselves have dependencies. Unless I missed one, they're all on permissive licenses of some sort. I vote that we never worry about dependencies of dependencies, and I refuse to even consider dependencies of dependencies of dependencies.
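If we ever did want to see the full chain without tracing it by hand, the standard library can walk it for installed packages. This is a sketch under some assumptions: `requirement_name` and `transitive_dependencies` are hypothetical helper names, the requirement parsing is deliberately crude, optional "extra" requirements are skipped, and packages that are not installed are silently ignored.

```python
import re
from importlib.metadata import PackageNotFoundError, requires


def requirement_name(requirement):
    """Extract a normalized package name from a requirement string."""
    match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", requirement.strip())
    if not match:
        return None
    # Treat runs of "-", "_" and "." as equivalent, and ignore case.
    return re.sub(r"[-_.]+", "-", match.group(0)).lower()


def transitive_dependencies(package):
    """Return the names of all installed packages reachable from `package`."""
    root = requirement_name(package)
    if root is None:
        return set()
    seen = set()
    queue = [root]
    while queue:
        name = queue.pop()
        try:
            declared = requires(name) or []
        except PackageNotFoundError:
            continue  # not installed, so nothing more to follow
        for requirement in declared:
            if re.search(r"extra\s*==", requirement):
                continue  # only needed for optional extras
            dep = requirement_name(requirement)
            if dep and dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return seen
```

Calling `transitive_dependencies("requests")` in the project's environment would return the set of names to check, however many levels deep they sit.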

nflorin commented 2 years ago

Last note is that we could simply drop the two dependencies if we are worried about them and don't want to go GPL. I can fake out unidecode's functions with string functions from the standard library, and fuzzywuzzy can be replaced with a rewrite of the Levenshtein distance algorithm.
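To give a sense of how small those replacements could be, here is a rough sketch using only the standard library. The names `ascii_fold`, `levenshtein`, and `similarity` are made up for illustration. Two caveats: `ascii_fold` only strips accents and drops anything it cannot decompose, whereas unidecode transliterates non-Latin scripts, and `similarity` is a plain normalized edit distance, not fuzzywuzzy's scoring.

```python
import unicodedata


def ascii_fold(text):
    """Strip accents by decomposing characters and dropping non-ASCII."""
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")


def levenshtein(a, b):
    """Edit distance between two strings (insert, delete, substitute)."""
    if len(a) < len(b):
        a, b = b, a  # keep the row for the shorter string in memory
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def similarity(a, b):
    """Score from 0 to 100, where 100 means the strings are identical."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 100
    return round(100 * (1 - levenshtein(a, b) / longest))
```

For example, `levenshtein("kitten", "sitting")` is 3 and `ascii_fold("Café Zürich")` gives `"Cafe Zurich"`.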

nflorin commented 2 years ago

More reading, and I'm now convinced that the idea that we wouldn't need to license the code under the GPL if we call the GPL licensed libraries. Both licenses include the clause saying you can license dependent works in "either version 2 of the License, or (at your option) any later version." So we could use GPLv3 if we want.

ryan-jacobs commented 2 years ago

Yes, I agree that there does not seem to be a truly satisfactory answer out there on this. Your comparison of including GPL'd code via a package reference (requirements.txt) with the use of a GPL'd RESTful API is an interesting one. Apparently even that can be a subject of debate!

The fact that we don't technically share or distribute the GPL'd code, but do depend on it to make the project work, seems to imply a grey area, and no one seems to agree. That said, it does seem like using a GPLv2+ license would be "safe" no matter what, and could protect dependencies (at all levels) from unintended usages in theoretical distributed commercial products that depend on our work.

Last note is that we could simply drop the two dependencies if we are worried about them and don't want to go GPL

Hummm, I'd vote no on that as it seems against the spirit of sharing and collaboration (not to mention more work). If we were building something commercial and needed to satisfy some lawyers, things might be different, but I think we can come up with a standard that's compatible with GPL'd stuff one way or another.

More reading, and I'm now convinced that the idea that we wouldn't need to license the code under the GPL if we call the GPL licensed libraries. Both licenses include the clause saying you can license dependent works in "either version 2 of the License, or (at your option) any later version." So we could use GPLv3 if we want.

Do you mean you're convinced that a permissive license is ok with GPL'd dependencies in requirements.txt, or just that GPLv3 is ok with a GPLv2 dependency?

nflorin commented 2 years ago

Do you mean you're convinced that a permissive license is ok with GPL'd dependencies in requirements.txt, or just that GPLv3 is ok with a GPLv2 dependency?

I meant I'm convinced we could use GPLv3 with these GPLv2 dependencies.

I also don't think that re-writing some of the code to remove dependencies is a good idea, just wanted to note that it's possible if we make an executive decision to avoid GPL licensing.

nflorin commented 2 years ago

I'm marking this one as closed. Everything I've read suggests that having GPL dependencies requires a GPL license, even though no court case has proven that (yet).

In addition, I'm going to go ahead and add the GPLv3 license. It's the mainstream GPL-type license, and I don't think that anyone has brought up a reason not to use it.