nextflow-io / nextflow

A DSL for data-driven computational pipelines
http://nextflow.io
Apache License 2.0

License prevents scientific use #788

Closed klmr closed 6 years ago

klmr commented 6 years ago

The GPLv3 license creates some nontrivial problems for use of this code in scientific software. This might be reason enough to reconsider whether GPL is the best fit for this project, and whether its restrictions are intended by the authors, or merely accidental.

For reference, C. Titus Brown has argued forcefully and influentially that copyleft licenses are non-open, and therefore impede Open Science.

In fact, Lior Pachter, one of the major proponents of non-free software in bioinformatics, has subsequently conceded that he was wrong about the non-free licensing of the kallisto software.

(This is related to, but distinct from #478.)

karinlag commented 6 years ago

Just wanted to :+1: this one.

pditommaso commented 6 years ago

Thanks for reporting this issue. While I personally think this problem is overemphasised, we are aware this could represent an issue in some contexts.

The good news is that we are working to relax the Nextflow license to a more permissive one.

kblin commented 6 years ago

By that logic I could write a GPLed python interpreter and then force Python programs to be GPLed? I don't think this is how this works. Again, I'm not a lawyer, but my reading of https://www.gnu.org/licenses/gpl-faq.en.html#IfInterpreterIsGPL is that e.g. using JNI to load a GPLed library from your Java program is considered the same as linking to that GPLed library from a C program.

kblin commented 6 years ago

For the purposes of a nextflow pipeline, I'd consider the nextflow runtime a "system library", similar to the VS runtime libraries discussed in https://www.gnu.org/licenses/gpl-faq.en.html#WindowsRuntimeAndGPL

klmr commented 6 years ago

@kblin

I could write a GPLed python interpreter and then force Python programs to be GPLed?

No, unless those programs are linked to your library (licensing applies to concrete implementations, not to specifications). But if there’s only a single implementation, and that implementation is GPL’d, then the point is moot.

using JNI to load a GPLed library from your Java program is considered the same as linking to that GPLed library from a C program

Precisely. And both require the client code to be GPL licensed. This is the whole reason why glibc (the GNU C library implementation) and libstdc++ (the C++ standard library implementation that ships with GCC) are not purely GPL licensed, and notably come with a runtime exception (despite their author, GNU, being the foremost proponent of the GPL).

I'd consider the nextflow runtime a "system library"

I wish that were how the GPL works, but the FSF explicitly disagrees. Anyway, the system library exception only applies to compiled, non-open libraries. This is the opposite situation.

kblin commented 6 years ago

But before the script is run through the nextflow pipeline, it's just data. I really don't see how it's possible to claim that it's a derived work at that point.

kblin commented 6 years ago

Hm, ok, assuming you do call nextflow functions, yes, it's a derived work. It's just super unclear from the docs what parts are plain Groovy (Apache 2 licensed), and what parts are the runtime calls.

pditommaso commented 6 years ago

Java has the classpath exception exactly for that, i.e. to allow third-party applications to use runtime libraries without enforcing the GPL clause. https://softwareengineering.stackexchange.com/questions/119436/what-does-gpl-with-classpath-exception-mean-in-practice

delagoya commented 6 years ago

I agree that there is a lot of FUD around GPL, but in this case, you have argued well that the pipelines are indeed supposed to be GPL.

This rules out Nextflow usage at any biotech company. Often how they process their data is a core piece of their value proposition.

fstrozzi commented 6 years ago

I work in the private sector and, honestly, the GPL was not an issue when we decided to use Nextflow and to create and run our workflows with it. A move to a more permissive license would in any case be very welcome, and I believe it will be extremely beneficial for the project in terms of adoption and contributions from a wider community.

kblin commented 6 years ago

As long as you don't distribute the workflows, the license doesn't matter anyway. :smile: https://www.gnu.org/licenses/gpl-faq.en.html#GPLRequireSourcePostedPublic

delagoya commented 6 years ago

Fair point. People do make decisions on perceived, rather than actual, risk though. Many companies have a more stringent review process for anything that involves GPL, and that by itself will dampen interest in NF.

pditommaso commented 6 years ago

As long as you don't distribute the workflows, the license doesn't matter anyway

Exactly.

People do make decisions on perceived, rather than actual, risk though

True. The point is that the GPL is perceived as problematic, and companies/institutions try to avoid it.

denis-yuen commented 6 years ago

Wanted to :+1: this one as well.

We've spoken before, but just in case others are interested, our use case is we'd like to re-use some nextflow parsing code (as opposed to the actual workflows) for the Nextflow equivalent of a wdl4s. But oddly, we're constrained due to

Apache 2 software can therefore be included in GPLv3 projects, because the GPLv3 license accepts our software into GPLv3 works. However, GPLv3 software cannot be included in Apache projects. The licenses are incompatible in one direction only ... We avoid GPLv3 software because merely linking to it is considered by the GPLv3 authors to create a derivative work. We want to honor their license. Unless GPLv3 licensors relax this interpretation of their own license regarding linking, our licensing philosophies are fundamentally incompatible

https://www.apache.org/licenses/GPL-compatibility.html

stain commented 6 years ago

Perhaps as a minimum the exposed Nextflow functions (or their API stubs) should be under a more permissive license like Apache License 2.0, so that the workflow definitions themselves are not accidentally GPL-"infected" by derivation; just the "linked" version, which would only exist in memory during execution.

awz commented 6 years ago

If the goal is to permit use of Nextflow with proprietary / non-commercial / GPL-incompatible tools then a few things can happen.

1) The copyright holders of Nextflow can either release the entire project under a permissive license (Apache V2) or

2) A license exception can be offered that explicitly permits linking with non-GPL components or

3) GPLed portions of Nextflow can be isolated from any user workflows/tools via a layer of Apache software so that no direct linking with GPLed software occurs.

We use option 3) in the Arvados project. @stain has made the same suggestion above.

pditommaso commented 6 years ago

As already mentioned, we were already planning to change the license to a more permissive one.

One question I want to ask you: is there any good argument regarding Apache 2.0 vs BSD-3 vs MIT? Pros and cons of each of these licenses?

cjfields commented 6 years ago

@pditommaso I would look to the Titus Brown post that @klmr mentioned (along with the linked comments) as well as the follow up one from Lior Pachter about relicensing kallisto. I'm a big fan of BSD-3 personally.

pditommaso commented 6 years ago

I'm a big fan of BSD-3 personally.

The question is: why?

Update: here there's an interesting comparison.

stain commented 6 years ago

A nice feature of Apache License (sorry, I'm biased) is that it has patent-troll protection, just like GPL 3 - that means that consumers are safe from upstream coming after them with a patent suit even if they thought it was open source.

As contributions upstream are assumed to be under Apache License, the maintainer is also somewhat safe from reverse patent-troll attacks via pull requests, and is free to redistribute received code.

(Of course there could still be third-parties that own software patents you or the contributor unknowingly infringe, so totally safe you can never be..)

While AL2 is combinable with GPL 3 (the result is then fully GPL3), note that this patent license is the main reason why Apache License is not downward compatible with GPL2 software if the authors removed the "or (at your option) any later version" phrase - as then they can't be upgraded to GPL3. (You will usually have bigger problems with not being able to combine such programs with GPL3 code).

You may or may not like the NOTICE propagation of Apache License as it still applies to binary distros, this is to ensure your copyright attribution is propagated - but as an author you are free to not make such a NOTICE file in which case only copyright statements inside source files need to stay.

Here's how we use NOTICE in Apache Taverna - academics like to keep credit..

On the other hand, MIT or BSD-licensed code can basically feed into any other open source project, but comes with softer protections - it's mainly about indemnity and retaining copyright. These licenses are popular because they are short and don't have many restrictions. Strangely, projects using BSD or MIT licenses are not good at updating their (c) when they receive contributions, so it can become troublesome if your code is to be taken up by a commercial company, as they can't easily verify the intellectual property rights of third-party contributions. The aforementioned contribution bit of Apache License makes that easy.

awz commented 6 years ago

@stain offered many excellent points!

Bottom line for me is that the patent grant offered by Apache is strongly preferable to no patent grant in BSD or MIT.

pditommaso commented 6 years ago

While further investigating the GPL issue and possible alternative licenses, I came across the Mozilla Public License 2.0, which I had nearly overlooked.

In a nutshell, it is a copyleft licence (like GPL) with the major difference that it applies ONLY at the file level, not to the whole program. This makes it much simpler and less ambiguous than the GPL. Modified source files are still required to be distributed as open source, but entirely new files may bear a licence of the adaptor’s choice.

In essence, the MPL was designed as a “middle-ground” license sitting between the extremes of restrictive licenses like the GNU (A)GPL and liberal licenses such as the Apache / BSD / MIT. While the MPL license itself allows derivative works to be released under any license, including a proprietary one, it still requires a form of copyleft at the file level. This consequently puts the MPL in the same bucket as the Eclipse Public License and the GNU LGPL.

In practice, you may take some MPL code and use it as part of your proprietary software. However, you must give access in source form to the MPL-covered parts, and any modification of a piece of MPL-covered code from your side must also be published as MPL-covered code. This is just simple copyleft, really.

I think this is an interesting compromise between GPL and permissive licenses such as MIT and BSD.

I was wondering if any of you have a more specific idea or any feedback about it?

Some links: https://julien.ponge.org/blog/mozilla-public-license-v2-a-good-middleground/ http://veldstra.org/2016/12/09/you-should-choose-mpl2-for-your-opensource-project.html http://oss-watch.ac.uk/resources/mpl2

karinlag commented 6 years ago

Not a licensing expert here, but I don't really care what license Nextflow is under, as long as a. I can use it for my day job in a public health institution, and b. that the license that Nextflow is under doesn't dictate what licenses I can use for the code that I write.

Thus, from what I can see, if your legal dept is happy with this, I see no reason not to go that way.

apeltzer commented 6 years ago

I think having a licence that permits usage for non-academic companies is first of all a good idea. Nextflow will benefit significantly, if there are possibilities for companies to back up the work being done here, especially when it comes down to "just" usage of their own written pipelines. Contributing something to core-nextflow is something different and should still be possible when following your suggestion of MPL here.

Not a licence-lawyer either, but I think this could be a good way to accommodate academic users while still allowing contributions from e.g. companies.

What I like most about Nextflow here is that you are actively thinking about solutions for everyone and not just pretending the "problem" isn't there at all! Kudos for that!

karinlag commented 6 years ago

+1 to @apeltzer 's comment here. You are taking our concerns seriously, and I am very grateful for that.

pditommaso commented 6 years ago

Karin, Alex, thanks for your feedback. I completely agree with you: the license should not dictate what license you use for your own code, and it should not prevent or restrict commercial usage.

MPL 2.0 may be a good choice; however, it is a relatively new license (10 years or so) and it's not very well known. For this reason I was wondering if any of you have direct experience with it or have any idea how it's perceived in for-profit organisations.

awz commented 6 years ago

You could also try LGPL v3 or stick with the existing license and add a license exception. Or go more permissive with something like Apache 2.

MPL has the disadvantage that it is not even one way compatible with GPL AFAIK.

Since your software is currently GPL there is a non zero possibility that some users would fork the project at the last GPL version. (Although I'd expect people would contact you before doing that.)

Have you considered talking to folks at the Software Freedom Conservancy? https://sfconservancy.org

pditommaso commented 6 years ago

The main issue with the (L)GPL is that it uses ambiguous language that opens up a range of byzantine problems and incompatibilities. This is the main reason why companies try to avoid it, and for the same reason I think it would make sense to switch NF to a different license. It should also be noted that LGPL is incompatible with Apache 2.0 even in binary form.

Apache 2.0, on the other hand, is certainly a candidate that we are seriously considering.

MPL has the disadvantage that it is not even one way compatible with GPL AFAIK.

I think this was true with MPL 1.1, but MPL 2.0 is compatible with Apache 2.0. The FAQ says it's possible to combine MPL code with BSD and Apache code in the same executable program.

But including MPL 2.0 code in an Apache 2.0 licensed program should still not be possible as stated here, tho I don't clearly understand what in the Apache license prevents this.

Since your software is currently GPL there is a non zero possibility that some users would fork the project at the last GPL version.

Yes, this is a possibility, but it would be interpreted as a hostile fork.

Have you considered talking to folks at the Software Freedom Conservancy?

Yes, but this is not a priority in the short run.

karinlag commented 6 years ago

@pditommaso , I think most of us would want you to choose the most open license that your legal dept will let you employ, to put it like that :stuck_out_tongue:

pditommaso commented 6 years ago

Happy to announce that Nextflow has moved to Apache 2.0 as of version 18.10.

cjfields commented 6 years ago

Awesome, thanks for the update @pditommaso !

pditommaso commented 6 years ago

It was overdue! Thanks to all of you for supporting this!