caddyserver / caddy

Fast and extensible multi-platform HTTP/1-2-3 web server with automatic HTTPS
https://caddyserver.com
Apache License 2.0
58.12k stars 4.03k forks source link

Remove MLP-2.0 dependency #6613

Closed sbruens closed 3 weeks ago

sbruens commented 3 weeks ago

The embedded ACME server is pulling in a dependency that has an upstream dependency on github.com/go-sql-driver/mysql, which is licensed under MPL2.0. This makes packaging for downstream users difficult and may have other licensing incompatibilities:

$ go mod why github.com/go-sql-driver/mysql
# github.com/go-sql-driver/mysql
github.com/caddyserver/caddy/v2/modules/caddypki/acmeserver
github.com/smallstep/nosql
github.com/smallstep/nosql/mysql
github.com/go-sql-driver/mysql

I've already filed an issue to the upstream library that depends on it directly, but looking at how that library is used in Caddy, it might be easier to remove the dependency here. As far as I can tell, the only use is here in a type assertion:

https://github.com/caddyserver/caddy/blob/d7564d632fbed209e81978c5c2c529a7bf1836f7/modules/caddypki/acmeserver/acmeserver.go#L192

Could the dependency be removed and replaced with a mirrored interface, or another solution?

francislavoie commented 3 weeks ago

What's the incompatibility exactly? From my reading, there's no violation here. From the summary at https://github.com/go-sql-driver/mysql/blob/master/LICENSE, I don't see the issue.

hslatman commented 3 weeks ago

@sbruens it's not just the type assertion that is used: the call to ca.NewAuthority will instantiate a concrete instance of nosql.DB. That instance may, or may not, be a nosql/mysql instance; based on the DSN. As far as I know at this time Caddy will actually only use the bbolt implementation, so it never calls into the go-sql-driver/mysql package.

Like @francislavoie, I don't immediately see a big problem with this particular license in this case.

There is some work and discussion going on in https://github.com/caddyserver/caddy/pull/6097 to support more database types, and in https://github.com/smallstep/nosql/pull/53 to rearchitect the implementation (and dependencies) of the nosql package.

We'll discuss https://github.com/smallstep/nosql/issues/68 in our upcoming open source triage meeting.

sbruens commented 3 weeks ago

MPL-2.0 is not a permissive license like Apache-2.0, but a "weak copyleft" license. They are not necessarily incompatible, but anyone distributing software with MPL-2.0 components needs to make the source code for those parts available to end users. This creates additional obligations for users of Caddy that are more cumbersome than if all the code was under a more permissive license.

Specifically, we cannot distribute any server binaries that use Caddy without making the source code of that MPL-2.0 library available and making sure end users can find it. That is the reasoning for this request. I believe this affects any Caddy user that wants to distribute software outside their organization who might currently be at risk of being in violation due to this dependency. It could also prevent new users from adopting Caddy to avoid the hassle if they did their due diligence with a license check like https://github.com/google/go-licenses.

I don't know what the obligation is precisely for the Caddy project itself and its binaries (if any), I only know what my obligation is as a user of Caddy. I'm certainly not a licensing expert by the way, but Q7 and Q8 of https://www.mozilla.org/en-US/MPL/2.0/FAQ/ give a good high-level answer. You should consult your own lawyer though.

I hope that helps for context.

@sbruens[](https://teams.googleplex.com/sbruens@google.com) it's not just the type assertion that is used: the call to ca.NewAuthority will instantiate a concrete instance of nosql.DB. That instance may, or may not, be a nosql/mysql instance; based on the DSN. As far as I know at this time Caddy will actually only use the bbolt implementation, so it never calls into the go-sql-driver/mysql package.

@hslatman understood, I didn't look at the code in detail.

Thanks for considering.

francislavoie commented 3 weeks ago

That's not my reading of it.

@sbruens but anyone distributing software with MPL-2.0 components needs to make the source code for those parts available to end users

But it is available on GitHub. From the summary of the license on GitHub:

https://github.com/go-sql-driver/mysql/blob/master/LICENSE However, a larger work using the licensed work may be distributed under different terms and without source code for files added in the larger work.

So any changes that aren't to the library itself can use a different license and doesn't need to be code-available.

@sbruens but Q7 and Q8 of https://www.mozilla.org/en-US/MPL/2.0/FAQ/ give a good high-level answer

To me, that effectively just means "link to the GitHub repo where the source is available". That's all.

Anyway, I'm looking forwards to nosql being more modular after https://github.com/smallstep/nosql/pull/53. For now, I think there's nothing actionable for us here.

mholt commented 3 weeks ago

Specifically, we cannot distribute any server binaries that use Caddy without making the source code of that MPL-2.0 library available and making sure end users can find it.

It says:

such Covered Software must also be made available in Source Code Form,

It is available, on GitHub. I don't think there is anything that says we need to copy it and make it available, especially since we're not changing it. 🤷

I'll close this, unless there's something shown to be actionable, we can reopen it.

sbruens commented 2 weeks ago

I'm not saying that you have obligations, though if you distribute Caddy as a binary you most likely do, but that's beside my point. Again, you probably want to discuss this with your own legal advisors.

I'm saying that we have obligations as users wanting to distribute artifacts/binaries that use Caddy. I'm not here to discuss the legal interpretation of this well-known license. This is a request to make it easier for us (and by extension other Caddy users) to comply with the licensing obligation that you are introducing by depending on an MLP-2.0 dependency. Removal of the library would make it much easier to distribute Caddy-backed binaries. In the absence of that, maybe a clear call-out somewhere in the docs that there is a non-Apache license that may pose legal obligations on users would still be better than nothing.

Anyway, it's fine if this is not important to you. I just want to make sure you are actually understanding my request here as a user before dismissing it.

mholt commented 2 weeks ago

I'm saying that we have obligations as users wanting to distribute artifacts/binaries that use Caddy.

What obligations are those? You did say earlier:

making the source code of that MPL-2.0 library available and making sure end users can find it.

but as I am saying, I don't think that's your responsibility (or ours) -- since it is already available and can be found.

So I guess I still struggle to see the problem here.

Ease of distribution is important to us, I am just not sure what we are able/supposed to do about it ourselves at this point.

We'll see what we can do, but as noted above, this is a dependency of a dependency, and we can't really rip out that first/direct dependency. They will have to act on this more than we will I think, unless I'm missing something. Do you have a specific request for us?

sbruens commented 2 weeks ago

but as I am saying, I don't think that's your responsibility (or ours) -- since it is already available and can be found.

it is 100% our responsibility as per the license to make sure the source code we are using executable form is available. We also need to inform recipients of our binaries that we are using this code and ensure they have access to the source code and the license, and know where to find it. 3.2(a) is pretty clear on that:

such Covered Software must also be made available in Source Code Form, as described in Section 3.1, and You must inform recipients of the Executable Form how they can obtain a copy of such Source Code Form by reasonable means in a timely manner, at a charge no more than the cost of distribution to the recipient; and

To make it concrete, this means for our use case we now need to do 2 things as part of our release:

  1. Make sure the code and the license is available. We could bundle directly with the binary or point to where the user is guaranteed to be able to access it and get a copy. We are distributing the executable, so it is our responsibility to make sure that's available. For our organization a non-controlled GitHub repo that we cannot guarantee is available is not sufficient so we will likely either bundle or mirror in this case, but other organizations may choose differently and point at the original GitHub repo (as has been suggested in this thread).
  2. Bundle some documentation with the binary (in say a tarball or a Docker image or whatever) with clear instructions on where the end user can get a copy of the source code, and do that for every distribution or release that contains this binary.

This comment thread has mostly focused on 1., but 2. is equally required.

We'll see what we can do, but as noted above, this is a dependency of a dependency, and we can't really rip out that first/direct dependency. They will have to act on this more than we will I think, unless I'm missing something. Do you have a specific request for us?

I hear you. When I initially filed this I wasn't sure how it was used (I just did a quick scan and misread it as just an interface use, which @hslatman corrected) and hoped it would be an easy change worth doing.

Do you have a specific request for us?

Besides removal/replacement, if there's a way to not bundle the acme server by default (or however else this gets pulled in?), this can help so users can choose not to include the dependency if they have no use for it. I assume this might be hard to do to keep backwards compatible.

Alternatively, call the dependency out somewhere like https://caddyserver.com/docs/build and https://caddyserver.com/download so that users who are considering building and distributing from source or from the pre-built binaries are aware that there might be licensing obligations when they do so, depending on who they distribute to. Right now it's totally opaque unless you dig through the Go dependency tree.

sbruens commented 2 weeks ago

PS: I don't mean to come off too strong with the wall of text. I just want to make it unambiguous what the issue is, so you can make an informed decision. I love Caddy and the work you al have done, that hopefully goes without saying but I'll say it anyway!

hslatman commented 2 weeks ago

I've done a quick compilation of Caddy with and without the nomysql build tag:

go build -o caddy-mysql cmd/main.go
go version -m caddy-mysql | grep -i mysql
caddy-mysql: go1.23.2
    dep github.com/go-sql-driver/mysql  v1.7.1  h1:lUIinVbN1DY0xBg0eMOzmmtGoHwWBbvnWubQUrtU8EI=

go build -o caddy-nomysql -tags nomysql cmd/main.go
go version -m caddy-nomysql | grep -i mysql
caddy-nomysql: go1.23.2
    build   -tags=nomysql

So when built with -tags nomysql the github.com/go-sql-driver/mysql package seems to not become part of the build. This might be a way forward for certain use cases. There's a precedent with the nobadger build tag, which was required for technical reasons.

Now, this requires someone distributing the binary to explicitly build it with the -tags nomysql. It can be done automatically in the builds provided on the Caddy website, and possibly through xcaddy as well, but would need people to at least know about it somehow.

Caddy uses the bbolt implementation by default: https://github.com/kylemcc/caddy/blob/e25e118528ca1800997d6acdc1f31acefb93192d/modules/caddypki/acmeserver/acmeserver.go#L204-L228, and there's currently no way of configuring a MySQL database as the storage, so in practice the execution path never goes through the github.com/go-sql-driver/mysql code, but it probably still gets linked against to support some runtime assertion or reflection of some kind, I suppose.

That said, with https://github.com/caddyserver/caddy/pull/6097 support for MySQL will be an option, so then it might actually be used in practice. Without further changes, it could still be disabled with the nomysql build tag, though.

In our open source triage meeting we did discuss this issue, and one of the things we decided is that we will have a look into the feasibility of changing to a different Go MySQL driver. We can't give guarantees at this time whether or not it will be implemented, as it might involve quite a bit of work, and could break things for existing deployments.

francislavoie commented 2 weeks ago

We also need to inform recipients of our binaries that we are using this code and ensure they have access to the source code and the license, and know where to find it.

Well, Caddy itself doesn't hide that. The full dependency chain is visible via go.mod and also in the executable with the build-info command:

$ caddy build-info | grep mysql
dep github.com/go-sql-driver/mysql  v1.7.1 h1:lUIinVbN1DY0xBg0eMOzmmtGoHwWBbvnWubQUrtU8EI=

So anyone who cares can easily trace it to find the information they need. Tools like https://github.com/google/go-licenses can also be used to trivially report on the license information.