moodymudskipper / burglr

Copy Functions from Other Packages Without Adding Them As Dependencies

Licencing #3

Open jimr1603 opened 3 years ago

jimr1603 commented 3 years ago

Off the top of my head, I think we can throw a warning for the common licences as a prompt for what attribution is needed.

Then we can extend that to "here's what should be adequate attribution, do you want to include it?".

I'll try to get a PR for the former this long weekend.
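As a rough illustration of the first step (a warning keyed on common licences), here is a minimal sketch. It assumes the licence string comes from the package's DESCRIPTION via `utils::packageDescription()`; the function name `license_prompt()` and the wording of the hints are hypothetical, not burglr's API, and the hints are reminders rather than legal guidance.

```r
# Hypothetical helper: not part of burglr's actual API.
# Maps a DESCRIPTION "License" string to a short reminder and raises it as a warning.
# The default for `license` assumes `pkg` is installed; pass it explicitly otherwise.
license_prompt <- function(pkg, license = utils::packageDescription(pkg)$License) {
  hint <- if (grepl("GPL", license, fixed = TRUE)) {
    "copyleft: check that your own package's license is compatible"
  } else if (grepl("MIT", license, fixed = TRUE)) {
    "permissive: keep the copyright and license notice with the copied code"
  } else {
    "not a license this sketch knows about: read its terms before copying"
  }
  msg <- sprintf("{%s} is licensed under '%s' (%s).", pkg, license, hint)
  warning(msg, call. = FALSE)
  invisible(msg)
}
```

Calling `license_prompt("Hmisc", "GPL (>= 2)")` raises the copyleft reminder; with only a package name, the licence is looked up from the installed package.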

moodymudskipper commented 3 years ago

Sounds good!

brodieG commented 3 years ago

Agree this is important, particularly if anything retrieved from this makes it into CRAN as per the policies:

The ownership of copyright and intellectual property rights of all components of the package must be clear and unambiguous (including from the authors specification in the DESCRIPTION file). Where code is copied (or derived) from the work of others (including from R itself), care must be taken that any copyright/license statements are preserved and authorship is not misrepresented.

moodymudskipper commented 3 years ago

Thanks @jimr1603, this is a good start. Here is the new version:

image

For the moment it only describes the license of the target package, even though in this example code from the package {scales} is used too.

I've been thinking of a few things. We could have one of these annoying UI prompts at the end:

 tcltk::tkmessageBox(message = "functions from the following packages have been copied :\n\n...pkg info, authors, license...\n\nThis info has been copied to LICENSES.md.\n\nIllegal use might put you or your company at risk. Use responsibly!\n\nSee `vignette('licensing')` for good practice.\n\n Do you know what you're doing?", type = "yesno")

image

We'd make it unskippable.

We might include a summary of the compatibility between the license of the user's package and the licenses of the packages copied from.

If the user picks "no", "burgled.R" is restored from its backup.
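The backup-and-restore idea could look roughly like this. It is only a sketch: `with_rollback()` and its arguments are hypothetical names, and `answer` stands for whatever the prompt returns once converted to the string "yes" or "no".

```r
# Hypothetical sketch: run `write_fun` against `path`, then keep or roll back
# the result depending on the user's answer ("yes" or "no").
with_rollback <- function(path, write_fun, answer) {
  backup <- tempfile(fileext = ".R")
  had_file <- file.exists(path)
  # Keep a copy of the previous content, if there was any.
  if (had_file) file.copy(path, backup, overwrite = TRUE)
  write_fun(path)
  if (identical(answer, "no")) {
    if (had_file) {
      file.copy(backup, path, overwrite = TRUE)  # restore the old file
    } else {
      unlink(path)                               # nothing existed before
    }
  }
  invisible(identical(answer, "yes"))
}
```

The point of taking the backup before writing is that declining the prompt leaves the project exactly as it was.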

As said in the message above, there'd be a vignette pointing to relevant sources, maybe summarizing the important points if possible.

A second prompt might offer to automatically add contributors to the DESCRIPTION file. I like the format that Colin Fay proposed: https://twitter.com/_ColinFay/status/1387683979105538050 https://twitter.com/_ColinFay/status/1387733196746502148

I'd like to understand this too : https://twitter.com/hadleywickham/status/1388126198593695749 . I'm not sure how properly credited copied code would be illegal reproduction when a fork wouldn't.

Should dig into https://r-pkgs.org/license.html#code-you-bundle too.

After all this, I believe we would arguably have a package that facilitates correct attribution more than it facilitates code theft: the quick and dirty methods people might have been using before would be replaced by a quicker and cleaner one, and we'd raise awareness of the licensing issues.

brodieG commented 3 years ago

In re fork: the fork keeps all the copyright and licenses, etc. It is possible that a fork of some repos could be illegal if they have licenses that don't allow it. I believe most FOSS licenses allow forking so long as you don't modify and redistribute without the sources.

wch commented 3 years ago

FWIW, you cannot take code that's released under one license (for example, GPL-2), and then put it in a separate package and release it under another license (like MIT). (Unless you are the copyright holder, of course.) Attribution is not sufficient.

moodymudskipper commented 3 years ago

Thanks @brodieG and @wch for your input.

@wch does it mean that the simple example above, which takes code from {Hmisc} (GPL (>= 2)) and {scales} (MIT), would only be legal for private use (not published on GitHub/CRAN, not used by a company, etc.), because there are 2 licenses and the package itself would have to pick one?

Another idea on top of the rest would be to add a startup message in .onAttach(), so that library(newpackage) would mention the packages that code was copied from:

library(newpackage)
#> {newpackage} uses code from {Hmisc} (GPL (>= 2)) and {scales} (MIT). The author(s) agree(s) that they took care of potential license compatibility and attribution issues.

It wouldn't be optional. Of course the package author might go out of their way to remove those messages, but that might be held against them if they do so wrongly, and we'd have made certain that they had been made aware of the potential problems.
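A startup notice of this kind could be sketched as follows. `.onAttach()` and `packageStartupMessage()` are standard R; the `copied_from` vector and the `attribution_notice()` helper are hypothetical names for data that a tool like burglr would have to write into the new package.

```r
# Hypothetical data that a tool like burglr could write into the new package.
copied_from <- c(Hmisc = "GPL (>= 2)", scales = "MIT")

# Build the one-line notice from a named character vector of licenses.
attribution_notice <- function(pkgname, licenses) {
  sources <- sprintf("{%s} (%s)", names(licenses), licenses)
  sprintf("{%s} uses code from %s.", pkgname, paste(sources, collapse = " and "))
}

# Standard R hook, run when the package is attached with library().
.onAttach <- function(libname, pkgname) {
  packageStartupMessage(attribution_notice(pkgname, copied_from))
}
```

Using `packageStartupMessage()` rather than `message()` means users can still silence the notice with `suppressPackageStartupMessages()`, so it cannot be made strictly unskippable this way.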

JohnCoene commented 3 years ago

I do apologise @moodymudskipper, I love your packages but I fail to understand the need for this: the downsides seem to far outweigh any possible upside or convenience.

Why {burglr} when one can namespace::function or @importFrom namespace function? If it's just install time/overhead I don't see it (probably not, this'll generally be caused by compilation which this does not cover I believe).

Attribution and licensing are already terribly difficult to handle as it is: we all get it wrong. I guess the above, at least, forces developers to think about it. GPL-2 is often mentioned because it is sadly often not respected (sharing modifications with main branch).

I'm not sure what correct attribution and licensing entirely entails but it seems (see package name) that this will not do so properly.

wch commented 3 years ago

@moodymudskipper In your example, that would only be allowable if the resulting package was released under the GPL>=2.

I was a bit unclear in my previous comment. You can't take GPL-licensed code and release it under the MIT license, but you can do the reverse: take MIT-licensed code and release it under the GPL (provided that the MIT license notice is included with the derivative work). That's my understanding, at least. The details will differ for different combinations of licenses.

See: https://en.wikipedia.org/wiki/GNU_General_Public_License#Compatibility_and_multi-licensing https://opensource.stackexchange.com/a/5548
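The one-way rule described in this comment can be written down as a small lookup table. This is an illustrative sketch only: it treats "MIT" and "GPL" as two single families, ignores GPL version differences and the requirement to keep the MIT notice, and `can_relicense` and `may_copy()` are hypothetical names.

```r
# Illustrative only: two license families, following the rule of thumb above.
# TRUE means "code under `from` may be redistributed in a package under `to`".
can_relicense <- matrix(
  c(TRUE,  TRUE,    # MIT code -> MIT package, MIT code -> GPL package
    FALSE, TRUE),   # GPL code -> MIT package, GPL code -> GPL package
  nrow = 2, byrow = TRUE,
  dimnames = list(from = c("MIT", "GPL"), to = c("MIT", "GPL"))
)

# Look up a single (from, to) pair.
may_copy <- function(from, to) can_relicense[from, to]
```

Here `may_copy("MIT", "GPL")` is `TRUE` and `may_copy("GPL", "MIT")` is `FALSE`; a real table would need a row and column for every license and version in play.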

Licensing of free software is a nontrivial topic to understand, and I strongly suggest spending a lot of time learning about it before encouraging users to simply take code from other projects.

I don't think that printing out the messages on startup is necessary. It's much more important that the licensing is done correctly; if it is done correctly, then there's no need to print anything on startup.

brodieG commented 3 years ago

I can't imagine that this package will be able to automatically ensure that copied code is always licensed/attributed/etc correctly. And if you do attempt and fail, who is responsible?

As others have pointed out, this is a very tricky topic. One important factor is to try to make sure that people using this package to copy understand that there are potentially complex licensing issues that THEY are responsible to figure out before they copy the code. Hence my suggestion for a run time warning. This is not legal advice (I am not a lawyer, etc.), and I take no responsibility for anything that goes wrong as a result of anyone acting on it.

I do agree with @wch that you probably would benefit from studying licensing issues if you are going to keep this package live. But the most important thing is that you don't somehow end up caught in the crossfire if someone uses your package, the result is badly licensed, and the rights holders are angry about it. I have no idea if my suggestion of a runtime warning would help you in this case.

JohnCoene commented 3 years ago

It's a catch-22-22-22: the warning is needed because the attribution is not right, but because the attribution is not right the code should not be run (and the warning not displayed).

You would not buy a car that came with a "may be stolen" warning, you would not run code that warned "you may be in breach of license or copyright law."

The warning is a symptom not a solution.

As @wch said, get the attribution right and the warning can go.

brodieG commented 3 years ago

I personally don't see how I could get comfortable enough writing software that will get attribution and licensing right automatically when copying code from one source to another. That just seems like a minefield to me.

moodymudskipper commented 3 years ago

I cannot guarantee that licensing will be done properly.

My idea is that the package, even without any licensing consideration, doesn't do anything illegal, and responsibility is on the user. Instead of doing nothing I'd rather warn and guide the user, for their own good and out of respect for the developers.

I'm sure a lot of code is stolen at the moment, and I thought maybe this package can be a net positive. People might steal code using it (their responsibility), but others might be made aware of license issues and not steal (either back off or make it work the right way).

I hear Brodie say I might actually not be safe as the author of this package, and I hear pretty much everyone here think the package has no way to be a net positive. I also hear elsewhere some imply that I have bad intentions.

I also don't plan on becoming an expert on licensing, I find the topic extremely boring indeed.

I'm a bit confused. There was enthusiasm on Twitter and I don't think they were all thieves, but I also don't want to piss off the community. I respect each of you and your opinions individually too, and I'm thankful to you for taking the time to chime in, so maybe the bottom line is I should just archive it?

brodieG commented 3 years ago

@moodymudskipper just to be clear, at least as far as I stand, you are not pissing me off at all. I just wanted to make sure that you are aware there might be risk for you here (again, not a lawyer).

I do think the ability to easily excise a small set of no-dependency code from an otherwise dependency heavy package could be useful. IIRC I've seen some pretty long dependency chains brought in for a single function. I can't speak for others.

You might want to look at some history on enterprises that built useful tools that were then misused. Napster has some conceptual overlap with this (very tangential, granted).

wch commented 3 years ago

I think that many R users are not very knowledgeable about the different kinds of free software licenses out there. I think this package makes it very easy for people to violate software licenses without even knowing that they're doing it. Even adding warning messages isn't enough to ensure the users really understand the issues. I'm sure many users out there are even less interested than you are about learning about software licenses -- but it is essential to understand them if you're copying code around. The topic cannot be ignored just because it's boring. (I find my taxes boring but I still have to understand them!)

Here's a scenario that is not far-fetched: Start with code from R itself, which is GPL licensed. Someone takes some of that code and puts it in an MIT-licensed package (which violates the GPL). Then a company takes that code, and, thinking it is MIT licensed, puts it in a software product that they are selling (without releasing the source code). It is OK to sell closed-source software that incorporates MIT-licensed code, but it is not OK to sell closed-source software that incorporates GPL-licensed code.

Now there's a problem. The R-core developers are upset because the code they wrote which is intentionally released under the GPL is being sold by someone, in violation of the GPL. The company is upset because they were misled into thinking the code was MIT licensed, and they are now exposed to legal liability. The package author is upset because burglr made it so easy to copy GPL code into their MIT package. (Of course, the package author is also at fault for not learning enough about software licenses.)

moodymudskipper commented 3 years ago

Thanks, I understand your point, I think the following is unlikely however : "The package author is upset because burglr made it so easy to copy GPL code into their MIT package."

If the first thing this author sees when running burglr::burgle() is a scary UI message box titled "Are you breaking the law?" with a summary of the issues at stake, emphasising the responsibility of the user, they cannot be upset that it was made too easy.

It is still easy though, and I agree that the rest of the scenario is plausible.

Maybe a good compromise is that I remove some flexibility: fail if any license is other than MIT or GPL (the others are less common and I don't want to learn about them), and fail, not warn, if there is any incompatibility between the license of the main package and those of the copied packages. I can offer a diagram of dependencies as a consolation in any case.
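A minimal sketch of that stricter behaviour, under the assumption that licenses are classified by the start of their DESCRIPTION string. `license_family()` and `check_licenses()` are hypothetical names, not burglr functions, and the only incompatibility encoded is the GPL-into-MIT case discussed in this thread.

```r
# Hypothetical sketch of the "fail, don't warn" rule; not burglr's actual code.
# Classify a DESCRIPTION license string, stopping on anything but MIT or GPL.
license_family <- function(license) {
  if (grepl("^MIT", license)) {
    "MIT"
  } else if (grepl("^GPL", license)) {
    "GPL"
  } else {
    stop("unsupported license: ", license, call. = FALSE)
  }
}

# `main` is the license of the user's package; `copied` is a named character
# vector mapping each source package to its license.
check_licenses <- function(main, copied) {
  main_family <- license_family(main)
  for (pkg in names(copied)) {
    if (license_family(copied[[pkg]]) == "GPL" && main_family == "MIT") {
      stop("cannot copy GPL code from {", pkg, "} into an MIT package", call. = FALSE)
    }
  }
  invisible(TRUE)
}
```

With `copied <- c(Hmisc = "GPL (>= 2)", scales = "MIT + file LICENSE")`, the check passes for a main package under "GPL (>= 2)" and stops with an error for one under "MIT + file LICENSE". Anchoring the patterns at the start of the string means LGPL and AGPL are rejected as unsupported rather than silently treated as GPL.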