opensvc / multipath-tools

Please license remaining GPL-2.0-only files as 2-or-later (or LGPL?) #36

Closed zeha closed 2 years ago

zeha commented 2 years ago

Hi,

it appears the strbuf code is licensed under GPL-2.0-only (libmultipath/strbuf.c, libmultipath/strbuf.h, tests/strbuf.c). From a quick grep, those files seem to be the last (relevant) GPL-2.0-only files.

Given multipathd is intended to be linked against readline (which is GPL3+), I would expect that the strbuf files are supposed to be under a compatible license. Maybe they are all authored by SUSE LLC and @mwilck can quickly improve this situation?

Thanks, Chris

zeha commented 2 years ago

Debian bug ref https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=979095

mwilck commented 2 years ago

Yes, I can change the strbuf license. But as pointed out in the Debian bug, these are not the only GPL-2.0 files. There are some more:

  1. libmultipath/prioritizers/ontap.c
  2. libmultipath/prioritizers/datacore.c
  3. libmultipath/uevent.c
  4. libmultipath/sysfs.c
  5. libmultipath/list.h

1, 2 are only needed for legacy hardware. 3.-5. are important, though. 3. and 4. were once copied from the udev sources. 5. was copied from the Linux kernel.

We can't easily relicense any of these files. If this means we can't use libreadline, then we must probably just drop it.

mwilck commented 2 years ago

There are also GPL-2.0 licensed files in kpartx. But that doesn't link to libreadline.

mwilck commented 2 years ago

I've posted a patch "replace libreadline by libedit" to dm-devel.

JanZerebecki commented 2 years ago

Perhaps the people who contributed to those files are willing to grant another license.

zeha commented 2 years ago

1, 2 are only needed for legacy hardware. 3.-5. are important, though. 3. and 4. were once copied from the udev sources. 5. was copied from the Linux kernel.

Right, I grepped for SPDX-License-Identifier given this is used in (apparently just some) files. Thanks!
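
For anyone repeating this check, here is a minimal C sketch of the classification step behind such a grep. It assumes the file's text is already in memory; `classify_spdx` and the enum names are hypothetical and not part of multipath-tools. It only inspects the first identifier after the tag, so compound SPDX expressions (`OR`, `AND`, `WITH`) are not handled, and files without a tag are reported as missing rather than guessed at.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Result of inspecting one source file's text for an SPDX tag. */
enum spdx_class {
	SPDX_MISSING,   /* no SPDX-License-Identifier tag found */
	SPDX_GPL2_ONLY, /* GPL-2.0-only, or the deprecated spelling "GPL-2.0" */
	SPDX_OTHER      /* any other identifier, e.g. GPL-2.0-or-later */
};

/* Hypothetical helper: classify the first SPDX identifier found in text. */
enum spdx_class classify_spdx(const char *text)
{
	static const char tag[] = "SPDX-License-Identifier:";
	const char *p;
	size_t len;

	assert(text != NULL);
	p = strstr(text, tag);
	if (p == NULL)
		return SPDX_MISSING;

	/* Skip the tag itself and any blanks that follow it. */
	p += sizeof(tag) - 1;
	while (*p == ' ' || *p == '\t')
		p++;

	/* The identifier ends at whitespace or at a closing comment star. */
	len = strcspn(p, " \t\r\n*");
	if ((len == 12 && strncmp(p, "GPL-2.0-only", 12) == 0) ||
	    (len == 7 && strncmp(p, "GPL-2.0", 7) == 0))
		return SPDX_GPL2_ONLY;
	return SPDX_OTHER;
}
```

As the follow-up comments show, files without any SPDX tag still need a manual look at their license header, which is why the sketch keeps "missing" separate from "other".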

hreinecke commented 2 years ago

It should be possible to get in touch with Kay Sievers to change 3-5; I'll check what I can do.

hreinecke commented 2 years ago

I have feedback from Kay Sievers, and he's fine with changing the license of any code attributed to him. Exact quote (translated from German):

As far as I'm concerned, for the part that I wrote, feel free to change all of it to whichever GPL license suits your purposes better.

mwilck commented 2 years ago

@zeha, I'd like to understand the concern better. IIUC, it is questionable whether the act of linking alone makes multipathd a "derived work" of libreadline, which would be the legal argument to request releasing multipathd under a GPL-3.0 compatible license.

The readline/history functionality only matters when multipathd is used as interactive command line client. This is a small and rather unimportant part of multipath-tools' functionality. In fact, I believe it wouldn't be hard to move the client functionality into a separate binary which wouldn't link to libmultipath any more and thus could be released under GPL 3.0, avoiding the GPL incompatibility concern.

Can you elaborate?

JanZerebecki commented 2 years ago

unimportant part

AFAIK, that is not the bar for derived work. It would need to be so unimportant that it is below the level for which copyright is granted (even the choice of case of variables is AFAIK copyrightable). And that would need to be at the time when it was potentially derived. Later rewriting on top of a replacement is not a defense for an earlier derivation.

It is difficult to argue the details of derived work. In this case so much easier to fix the license because you already have permission.

I suspect you do not like the GPL, that is OK, don't use it then. If you want to get around the intention of the GPL I will certainly not help you.

hreinecke commented 2 years ago

Please don't make assumptions. The sheer fact that we're discussing these issues shows that we care. We try to be GPL compliant, but we cannot change history. As to changing the license: whom are we required to contact? Is it sufficient to contact the copyright owner as specified in the source code? Or does each and every contributor to that file need to be contacted? Do you know?

mwilck commented 2 years ago

AFAIK, that is not the bar for derived work. It would need to be so unimportant that it is below the level for which copyright is granted (even the choice of case of variables is AFAIK copyrightable).

That's the PoV of the FSF, right? I don't think it's universally accepted.

And that would need to be at the time when it was potentially derived. Later rewriting on top of a replacement is not a defense for an earlier derivation.

At the time multipath-tools started using libreadline, the latter was under GPL v2.0 or later, and there was no license conflict. The conflict arose only because the FSF chose to create a successor license that was incompatible with its predecessor, and because readline chose to switch to this new license. Our mistake — as multipath-tools maintainers — was not to realize this.

It is difficult to argue the details of derived work.

We agree on this point. It's also known to have different meanings in different jurisdictions. I am just saying that your very broad interpretation isn't everyone's, even if it might actually have been the intention of the GPL's author. I tend to prefer a more common-sense interpretation. multipath-tools has ~75000 lines of C code, of which 10 (including comments and #include statements) reference libreadline. It just makes no sense to me to speak about a "derived work" here.

In this case so much easier to fix the license because you already have permission.

This is not true. We have permission from Kay thanks to Hannes' inquiry, and I'm assuming that the current active contributors would also grant permission. But we'd need permissions from quite a few more people to be clean.

I suspect you do not like the GPL, that is OK, don't use it then. If you want to get around the intention of the GPL I will certainly not help you.

If you look at the original issue, you can see that the problem was that I chose GPL-2.0-only as license for some of my contributions. Which was a mistake, I should have used GPL-2.0-or-later; but I was unaware of the libreadline problem. Also, you certainly realize that even if we wanted (we don't), we can't stop using the GPL. Almost the entire code base of multipath-tools is under some variant of the (L)GPL.

As Hannes already said, we are actively working on cleaning up this situation.

mwilck commented 2 years ago

whom are we required to contact? Is it sufficient to contact the copyright owner as specified in the source code? Or does each and every contributor to that file need to be contacted? Do you know?

My €0.02: It's certainly not sufficient to contact just the people mentioned in the copyright note on top of the files. The "intellectual property" arises from writing the code, not from being mentioned in the header. I have analyzed the git history of the files in question and created lists of contributors. While not perfect, I believe that this would have a realistic chance to be regarded as "due diligence" — if all these contributors did express their consent.

Unfortunately, the list contains at least one person that I don't know how to contact, and kernel people (including Linus) who have publicly stated that they won't change the license of their code to GPL 3.0.

mwilck commented 2 years ago

FTR, patches posted so far:

As soon as these are reviewed, I'll create a PR here including them.

I didn't create the 2nd patch because I "do not like the GPL", but because it's the only short-term way to avoid the licensing conflict for the current upstream code. Obtaining permissions from all contributors, if possible at all, will take significantly longer.

JanZerebecki commented 2 years ago

Please don't make assumptions. The sheer fact that we're discussing these issues shows that we care.

Sorry for my wording. I assume you all, including Martin, do care. I was advising Martin elsewhere how to fix this, because he asked. And I only wanted to make it clear that I won't be helping him make a not-a-derived-work claim, as I feel doing that is not in the spirit of the GPL, and doing it repeatedly could weaken the GPL: the words of licenses stay the same while their meanings change with how they are applied in practice.

AFAIK, that is not the bar for derived work. It would need to be so unimportant that it is below the level for which copyright is granted (even the choice of case of variables is AFAIK copyrightable).

That's the PoV of the FSF, right? I don't think it's universally accepted.

No, I think this is usually how USA courts apply copyright. The "creative" or "original" act required is quite laughably small. AFAIK the FSF never was copyright maximalist, the GPL was their response to copyright expansion. (Though I don't like the FSF for other reasons.)

It just makes no sense to me to speak about a "derived work" here.

These are not normal words, but ones with a specific definition. The term "derivative work" (edited: derived is the wrong word) has a specific definition in USA copyright law (see 17 U.S.C. § 101) and although the term is not explicitly mentioned a similar meaning is included in the Berne Convention.

We try to be GPL compliant, but we cannot change history.

I think it can be fixed without changing the past, even without changing old releases. You can get a license without that being said in the source code. It is possible for someone to retroactively grant a license to someone else for uses that already happened and those in the future just by saying that. The license does not need to be shipped in the source tar to be valid.

I have analyzed the git history of the files in question and created lists of contributors.

I think that is a good way. Things to look out for are: Git move and copy detection, and following the history across those moves and copies. Notes about files copied from somewhere, and doing the same analysis at the source. Notes about additional authors, e.g. the git trailers Co-Authored-By and Signed-off-by (if not the same as Author).
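
As a rough illustration of the trailer part, here is a small C sketch that pulls the value out of one commit-message line. It assumes the lines come from something like `git log --follow --format=%B -- <file>`; `trailer_value` is a hypothetical helper and the name in the usage note is made up. Keys are matched case-insensitively because spelling varies in practice.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <strings.h>

/*
 * Hypothetical helper: if line has the form "<key>: value", return a
 * pointer to the start of value inside line; otherwise return NULL.
 * The key comparison ignores case ("Co-authored-by" vs "Co-Authored-By").
 */
const char *trailer_value(const char *line, const char *key)
{
	size_t klen;

	assert(line != NULL && key != NULL);
	klen = strlen(key);

	/* The key must match and be followed directly by a colon. */
	if (strncasecmp(line, key, klen) != 0 || line[klen] != ':')
		return NULL;

	/* Skip the colon and any blanks before the value. */
	line += klen + 1;
	while (*line == ' ' || *line == '\t')
		line++;
	return line;
}
```

Called on `"Co-authored-by: Jane Doe <jane@example.org>"` with the key `"Co-Authored-By"`, it returns a pointer to `"Jane Doe <jane@example.org>"`. People found this way would be added to the list built from the Author field; it does nothing for code that was copied in from another project.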

There are some big code bases that changed licenses that published tales about how they did it, but I can't find any links to what I remember right now.

Unfortunately, the list contains at least one person that I don't know how to contact

If you don't find a way, there are still some things one can do, like check with next of kin, etc, but for some things I'd like to check with a lawyer. Ping me privately if you need to go there.

kernel people (including Linus) who have publicly stated that they won't change the license of their code to GPL 3.0.

Linus only said this about the kernel as a whole. Many parts of the kernel come with compatible or additional licenses. Granting an additional license to some kernel file was done many times. And if they just dislike licensing under GPL-3.0, granting a compatible one e.g. MIT also works.

xosevp commented 2 years ago

@bmarzins adding Benjamin and @cvaroqui Christophe

hreinecke commented 2 years ago

Just to throw in more confusion: I've created a patch which replaces libreadline with a simple fgets(). You do lose the history and tab completion, but as CLI usage still is a rather obscure feature it should be good enough as a drop-in replacement in constrained environments (like existing distribution packages where the distribution doesn't ship libedit).
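
To make the trade-off concrete, here is a minimal sketch of what an `fgets()`-based replacement for `readline()` can look like. It is not the patch itself; `read_command` and its parameters are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical stand-in for readline(): print a prompt, read one line
 * with fgets(), and strip the trailing newline. Returns buf, or NULL on
 * EOF or read error. There is no history and no tab completion.
 */
char *read_command(FILE *in, FILE *out, const char *prompt,
		   char *buf, size_t size)
{
	assert(in != NULL && buf != NULL && size > 1);

	/* Show the prompt, if the caller wants one. */
	if (out != NULL && prompt != NULL) {
		fputs(prompt, out);
		fflush(out);
	}
	if (fgets(buf, (int)size, in) == NULL)
		return NULL;

	/* Cut the line at the first CR or LF. */
	buf[strcspn(buf, "\r\n")] = '\0';
	return buf;
}
```

A caller would loop on `read_command(stdin, stdout, "> ", line, sizeof(line))` until it returns NULL. Unlike `readline()`, the buffer is owned by the caller, so nothing has to be freed, but lines longer than the buffer are split across calls.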

zeha commented 2 years ago

@mwilck sorry, I'm not an expert on these matters (as you can see...). My understanding is just that various groups incl. Debian think the matters create a legal problem. Maybe @bgermann who filed the Debian bug can explain this better.

bgermann commented 2 years ago

Yes, in Debian linking against a library is considered creating a derivative work of it as far as I know. You will at least not get your package into Debian because ftpmasters check on GPL-3 vs. GPL-2 really thoroughly. I do not think that the proposition that GPLv2 and GPLv3 are incompatible and cannot legally apply to different parts of one work is controversial. When the readline license change happened, many packages just switched to the new version without checking the license compatibility first. There is no regular license check on existing packages, so for packages that predate the license change this only surfaces when someone notices it by chance.

bgermann commented 2 years ago

Thanks for caring and posting the libedit patches.

mwilck commented 2 years ago

IIUC, the libedit switch is acceptable for the Debian project? Will Debian retroactively change packages in older releases? (I noted that until buster, multipath could have linked against libreadline5, avoiding the issue; I'm not sure whether this was actually done.)

bgermann commented 2 years ago

Yes, the libedit switch is acceptable for Debian. The package maintainer can contact the release team about older releases. My guess is that they will not be changed because it is not a security concern. The buster and stretch versions link to libreadline7.

mwilck commented 2 years ago

OK, thanks.

@bmarzins (or who else feels inclined), a Reviewed-by: for my dm-devel patch would be appreciated ;-)

bgermann commented 2 years ago

Please note that Debian needed one other patch that I posted a while ago, which has an include that is implicitly added by readline.

mwilck commented 2 years ago

Please note that Debian needed one other patch that I posted a while ago, which has an include that is implicitly added by readline.

It was included in my set, I just didn't mention it here explicitly. See https://listman.redhat.com/archives/dm-devel/2022-August/051893.html

mwilck commented 2 years ago

I've created an additional patch set that hopefully improves matters more, just posted to dm-devel with the headline [RFC PATCH 0/9] Split libmultipath and libmpathutil.

First, it includes a modified version of Hannes' patch from #41, and thus enables not using a readline library at all in multipath-tools (the loss of functionality is very small). Furthermore, a new command multipathc is created that takes the role of the interactive command line client. It becomes the only part of the code that uses libreadline functionality. In interactive mode, multipathd now just exec()s this new program. multipathc uses none of the GPL-2.0-only code of libmultipath (it doesn't link to libmultipath at all), and can thus link to libreadline without a license conflict.
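
The hand-off described here can be sketched in a few lines of C. This is an illustration of the general `exec()` pattern only, not code from the patch set; `exec_interactive_client` is a hypothetical name and the client path is whatever the caller passes in.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/*
 * Replace the current process image with the interactive client.
 * On success this call never returns. On failure it returns the
 * negated errno value set by execl(). Nothing here is taken from
 * the multipath-tools sources.
 */
int exec_interactive_client(const char *client_path)
{
	assert(client_path != NULL);

	/* argv[0] is the program path; the list ends with a NULL pointer. */
	execl(client_path, client_path, (char *)NULL);

	/* Only reached if execl() failed. */
	return -errno;
}
```

Because `execl()` replaces the process image, the program that links the line-editing library is a separate binary that is loaded instead of the daemon's code, not a library mapped into it. Whether that is enough from a licensing point of view is exactly what the rest of this thread argues about.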

mwilck commented 2 years ago

Something seems to be wrong with dm-devel: my patchset from Aug 19th hasn't arrived there yet, and the listman server seems to be down.

xosevp commented 2 years ago

Something seems to be wrong with dm-devel: my patchset from Aug 19th hasn't arrived there yet, and the listman server seems to be down.

RH team are working on it. @kergon

mwilck commented 2 years ago

My patch set (revised / v3) can meanwhile be inspected in the openSUSE tip branch, starting with 1cd0b20.

kergon commented 2 years ago

The mailing list is restored and functioning correctly again as far as I can tell. The webserver is waiting for a new SSL certificate. Once it's back I'll send a message to the list.

kergon commented 2 years ago

Most list recipients didn't receive last week's messages, but we need the list archive web page to be back so we can tell people where to get them.

Mailman failed to read its list config file successfully (we haven't got to the bottom of this yet), but that meant it fell through into logic designed to handle a file format upgrade coded around 2001. The old format file was still present, so it behaved as if it was the first time the new-format code was running and re-converted that file. In effect it wound back the mailing list to its state 18 years ago!

mwilck commented 2 years ago

@kergon, thanks for taking care and keeping us informed. Sounds like a nasty event indeed. The list server has been working almost flawlessly for many years, so there's no reason for us to complain.

JanZerebecki commented 2 years ago

multipathc uses none of the GPL-2.0-only code of libmultipath (it doesn't link to libmultipath at all), and can thus link to libreadline without a license conflict.

No, that doesn't work, the license is literally worded in a way to avoid allowing that.

But the accepted patch looks good, as it does not link to GNU readline by default: https://github.com/opensvc/multipath-tools/pull/42/commits/b7771447b971e4cb7f549f45663739fe6370164d Thank you.

mwilck commented 2 years ago

No, that doesn't work, the license is literally worded in a way to avoid allowing that.

Show me those words.

We aren't playing tricks to avoid the GPL. We make it formally obvious that the part that depends on libreadline is actually a separate program that has almost nothing in common with multipath or multipathd (except some generic utility code which comes under GPL-2.0-or-later). That has always been the case, it was just hidden by the fact that this code was executed in multipathd.

mwilck commented 2 years ago

One thing we might consider is to replace list.h by the code from systemd, which is written from scratch and comes under LGPL-2.1-or-later. Combined with Kay's statement and a few more OKs from other people, that might enable us to relicense libmultipath under GPL-2.0-or-later. It still wouldn't get us a retroactive license though.

Technically, systemd's list.h is different from the one derived from the kernel. Most importantly, it doesn't use the same struct for the list head and the list item.
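
The structural difference can be shown with a small C sketch written from scratch for this purpose. It copies neither the kernel's nor systemd's list.h, and all type and function names are hypothetical. The first half uses one link type for both head and member; the second half uses a typed head pointer, which is the general shape of the other approach as far as I understand it.

```c
#include <assert.h>
#include <stddef.h>

/* Style 1: one link type serves as both list head and list member. */
struct link {
	struct link *next, *prev;
};

struct path_a {            /* hypothetical payload type */
	int id;
	struct link node;  /* embedded link; the head is also a struct link */
};

void link_init(struct link *head)
{
	head->next = head->prev = head;  /* an empty list points at itself */
}

void link_add_tail(struct link *item, struct link *head)
{
	item->prev = head->prev;
	item->next = head;
	head->prev->next = item;
	head->prev = item;
}

/* Recover the enclosing struct from a pointer to its embedded link. */
#define path_a_of(ptr) \
	((struct path_a *)((char *)(ptr) - offsetof(struct path_a, node)))

/* Style 2: the head is a plain, typed pointer to the first element. */
struct path_b {            /* hypothetical payload type */
	int id;
	struct path_b *next, *prev;
};

void path_b_prepend(struct path_b **head, struct path_b *item)
{
	assert(head != NULL && item != NULL);
	item->prev = NULL;
	item->next = *head;
	if (*head != NULL)
		(*head)->prev = item;
	*head = item;
}
```

In the first style a function that takes a `struct link *` cannot tell a head from a member, and callers need the offset trick to get back to the payload. In the second style the compiler knows the element type, but every call site that assumes a circular list with a sentinel head would have to be rewritten, which is why swapping list.h is not a drop-in change.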

JanZerebecki commented 2 years ago

No, that doesn't work, the license is literally worded in a way to avoid allowing that.

Show me those words.

GPL 2 doesn't contain the word link or shared. It is worded that way to also potentially trigger the license when you make it execute another process. GPL 3 does mention it and it leads to a helpful clarification in the context of what is corresponding source despite the system library exception: "by intimate data communication or control flow between those subprograms and other parts of the work". Which makes it clear, just avoiding directly linking is not a sufficient mechanism to not need a license. However those words do not imply the opposite, as it does not say that without intimate communication it's excluded.

Linking triggers the license by among others copying and modifying the header files and object code of the library. Using the same file format from two processes could also trigger the license in the same way.

Note: I'm merely saying your reason for saying that this satisfies the licenses is wrong. I don't know if your concrete patch makes it so it now satisfies the licenses (and you only gave the wrong explanation).

I have not reviewed the code of the patch you mentioned, and don't have enough time to review if it would make linking to GNU readline ok. But we are in contact elsewhere with someone who could review it, so I link it here: https://listman.redhat.com/archives/dm-devel/2022-August/051969.html

The cover letter does not quite contain the required information to know if it does. When you said "links to libmpathutil only" maybe you also mean that there is not even indirect interaction between the now split parts and no common interface (like a file format or protocol both support). I.e. you could install one split part and use it and then delete the whole machine and reinstall with only the other part and you would not get any additional functionality when both are present instead of only one. But all of that is still only an indicator, or rather a smell test, which has false positives and negatives (though useful for downstream distributors to spot license problems) and anyway only works for newly and independently written code.

The actual test is different.

mwilck commented 2 years ago

Note: I'm merely saying your reason for saying that this satisfies the licenses is wrong. I don't know if your concrete patch makes it so it now satisfies the licenses (and you only gave the wrong explanation).

This GH issue was about linking between GPL-2.0-only and GPL-3.0 code (see OP's description). With my patch, this linking doesn't happen any more and thus the OP's concern of a licensing conflict in multipath-tools has been fixed. I didn't claim that "this satisfies the licenses". I said that none of multipathc's code, nor the code of any library that it links to, is under a license that is incompatible with GPL-3.0, and that linking this code with libreadline is therefore allowed. This is my understanding still.

IIUC, you argue that we couldn't release multipathc under GPL 3.0 (even though we wrote the code) because it "communicates" with code under GPL 2.0 (which we also wrote). Sorry, I can't help myself, I think that this is outrageous and unhelpful.

The cover letter does not quite contain the required information to know if it does. When you said "links to libmpathutil only" maybe you also mean that there is not even indirect interaction between the now split parts and no common interface (like a file format or protocol both support).

No, I meant what I was saying. There is a protocol that both support and use to communicate, the multipath daemon's command protocol. It's a very simple protocol, and it's used by every program that uses libmpathcmd or libdmmp (which has been shipped under GPL-3.0 since day one btw). According to the previous assessments we made, the implementation of this protocol is entirely in files licensed under LGPL-2.0-or-later or LGPL-2.1-or-later. I never imagined that usage of this protocol would make arbitrary code a derivative work, or otherwise imply that this other code had to be under the same license as our code, and I'd swear the same holds for the other contributors.

I.e. you could install one split part and use it and then delete the whole machine and reinstall with only the other part and you would not get any additional functionality when both are present instead of only one.

If this was the case, we could ditch one component entirely, no? And no, it's not the case. multipathc provides the "interactive shell" functionality, which multipathd doesn't have. multipathc alone is useless, like systemctl without systemd (in theory it could talk to a different daemon implementing the same protocol, but no such daemon exists).

JanZerebecki commented 2 years ago

Thank you for the explanation about the protocol. But that means the situation is complicated.

I'll be soon on vacation for more than a month. I'm sorry I can't properly answer you now, but I hope the person we discussed this with elsewhere can answer your questions and I'm certainly interested in what they answer.