polyactis / Accucopy

Accucopy is a computational method that infers Allele-Specific Copy Number alterations from low-coverage low-purity tumor sequencing data.
https://www.yfish.org/software/Accucopy
GNU General Public License v3.0

Licensing issues #10

Open mbargull opened 3 years ago

mbargull commented 3 years ago

Hi @polyactis,

we considered packaging Accucopy for Bioconda (https://bioconda.github.io/) to make it easily obtainable for end users. (Whether and when this happens depends on demand, required effort, and chiefly on the outcome of this issue.)

I noticed a few licensing issues which would prevent a distribution, though:

Unrelated to our interest in redistributing Accucopy via Bioconda, I'd like to ask you to review the license conformance of your software with its dependencies. (Considering the nature of the GPL, this would probably mean you would have to relicense Accucopy under the GPL 3.0 and as such have to drop the academic/non-commercial and other restrictions.) I'm not a lawyer and as such cannot offer further help with this issue, but just wanted to make you aware of it.

Cheers, Marcel

(cc @dlaehnemann)

polyactis commented 3 years ago

Dear Marcel,

Are there people using Accucopy? And what kind? Did you guys receive any request to package it?

It is our institute license. They may not be aware of the conflicting issues. If we remove GADA or do not package strelka in our binary/container, will that be fine?

If BioConda is serving academia, I do not think our institute will have any problem with you packaging it into BioConda.

Thanks, Yu

-- Prof. Yu S. Huang Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, CAS http://www.yfish.org/ https://sites.google.com/site/polyactis/


dlaehnemann commented 3 years ago

Hi Yu,

many thanks for checking back!

I saw the Accucopy paper and would have liked to try it. But for the tool to be useful to me in research and hospital analysis work, I need to be able to install it automatically and use it easily in a pipeline. For me, this means packaging the software on Bioconda (https://bioconda.github.io/contributor/guidelines.html) and creating a Snakemake wrapper (https://snakemake-wrappers.readthedocs.io/en/stable/) so that it can be deployed reproducibly in a Snakemake (https://snakemake.readthedocs.io/en/latest/) workflow.

Bioconda has no "only for academic use" designation, and such a restriction would not be enforceable in any way. So instead, it only packages software that freely allows redistribution. As mentioned by Marcel above, https://github.com/polyactis/Accucopy/blob/v1.1/LICENSE#L26 is pretty clear that any redistribution needs written permission from SIMM. I would not want to go through that for packaging something on Bioconda myself, nor could it be enforced for every user who downloads the software via Bioconda.

For illustration, GATK 3.x was a similar case, although its licensing was a bit clearer in directly providing a free-for-academic-use designation. But the respective conda recipe had to come without executables: users had to acquire them some other way and install them manually, confirming that they conformed to the licensing agreement. This basically breaks any attempt at automating scientific analyses and was only done because of the eminent role of the software in the research community. It was such a big hassle, and so many people complained about it so consistently, that GATK eventually open-sourced the code completely, after years of discussions, when they released a new major version (4.0): https://gatk.broadinstitute.org/hc/en-us/articles/360045763652-Where-can-I-get-the-GATK-source-code-Is-it-open-source-

But this is not only a Bioconda-related problem. There are more general considerations regarding licenses that are important for academic software. A good starting point for reading up on this is the two blog posts mentioned at the start of this lengthy licensing discussion for a big academic software project (https://github.com/nextflow-io/nextflow/issues/788#issue-339767823). The first one (http://ivory.idyll.org/blog/2015-on-licensing-in-bioinformatics.html) provides a detailed argument for permissive open source licenses in academia. The second one is a good showcase of the issues that arise with special licensing aimed at commercial use cases. The general conclusion is that restrictive or merely ambiguous licensing is always an impediment to the adoption of software in academia. As more widely used software usually means prestige for research institutions, maybe there is an angle to convince your institute to adopt a permissive open source license (https://choosealicense.com/) as a default?

Regarding your other questions: I think removing GADA would not be enough, as Marcel also mentions GSL as being under a GPL license. So I guess even with GADA removed, your institute's license would still be in violation of the licenses of included code. But I am no legal expert either, and my main point is rather that there are overwhelmingly many reasons to adopt permissive open source licenses in academia, to avoid having to become a licensing expert or having to involve expensive lawyers... ;)

Best regards, David

polyactis commented 3 years ago

Hi David,

Let me talk to our IP department and get back to you.

Yu


polyactis commented 3 years ago

Is there a way that I can compile and package Accucopy into BioConda? A link to the instructions would help ...

dlaehnemann commented 3 years ago

Hi Yu,

the full instructions for how to contribute to Bioconda are here: https://bioconda.github.io/contributor/index.html

And the most important details, depending on programming language used, can be found in this sub-category: https://bioconda.github.io/contributor/guidelines.html

Any news on the license? Because even if you as the author do the packaging, I'm not sure if the BioConda project can legally distribute the resulting package with the current license. I simply don't know how this works, legally.

Best, David

mbargull commented 3 years ago

Hey Yu,

the links David provided should help you get set up :). If you open a pull request to add Accucopy to Bioconda's recipes, please feel free to ping either of us! We also have a couple of recipes for Rust packages that use CMake and such during their build; if you want to "compare notes", you could take a look at https://github.com/bioconda/bioconda-recipes/tree/master/recipes/fastqc-rs or https://github.com/bioconda/bioconda-recipes/tree/master/recipes/alevin-fry for example.

Then again, as noted by David, we'd need you to publish Accucopy licensable under the GPL 3.0 (as required by the GSL and GADA dependencies) since otherwise we (or anyone, including your institute, for that matter) are not able to legally distribute your software.

Cheers, Marcel

polyactis commented 2 years ago

The license has been changed to GPL v3.

dlaehnemann commented 2 years ago

Very nice! Thanks a lot for following up on this and making Accucopy available under GPL v3! We'll try it out soon and might come back with questions.