Closed Fuuzetsu closed 9 years ago
Don't let anyone stop you from doing that!
+1
I often wondered if there was a reason why the nixpkgs manual uses plain strings: http://nixos.org/nixpkgs/manual/#chap-meta
I'll have a shot at changing that.
…and just two subsections later it introduces licenses.…
I will have a look at doing the replacement in the near future if no one beats me to it; I wanted to gauge interest first.
Just found #739 which is relevant. I'll be doing this now.
I have now replaced a lot of the strings with a meta expression, see my branch here.
Now the problems are as follows:
- A lot of packages in `nixpkgs` are labelled in too generic a way: saying `BSD` (or `free`) is not useful because there are three popular BSD licenses (4-clause, 3-clause, 2-clause), and lib.licenses only covers the 3-clause and 2-clause ones.
- Some packages were labelled `BSD` or `free` while they were actually something completely different, such as GPL or MIT.
In summary, what needs to happen is: come up with a scheme for multi-licensing, come up with a scheme to accommodate custom or one-off licenses, and go through the remaining ‘ambiguous’ strings (such as `BSD`), look up the license and make it more specific.
My ideas:
- For custom or one-off licenses: `customFree` and `customUnfree`, which take some metadata and a file path. The downside is that it's more work than putting down ‘unfree’ or whatever, but the upside is that it's now much easier to query. We would have a directory with these custom license files. Doing it this way would allow the user in the future to do more selective license enabling: on Gentoo we can allow and disallow specific packages and it's great. Currently with nix we can only say whether we allow unfree packages or not: if you need unfree drivers on a machine, it doesn't automatically mean you don't care if Flash or whatever gets pulled in when you aren't watching.
- For packages with only a generic `BSD` license string: I'm unsure what the situation is in other distros. I started manually going through them but it would take me days to do it on my own. I think the first step is to publish a list of packages which have unclear license fields and ask maintainers/volunteers to fix them up. This should probably be done after multi-licensing and custom licensing are figured out.
I will squash my branch, remove any whitespace changes and make a PR soon.
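A minimal sketch of how the `customFree` / `customUnfree` idea could look. These helpers do not exist in nixpkgs; the function names come from the comment above, while the field names (`shortName`, `fullName`, `file`, `free`), the `allowedUnfree` list, and the file path are assumptions made up for illustration.

```nix
let
  # Hypothetical helper: build a license attrset from some metadata and a
  # path to the license text. `free` records which constructor was used.
  mkCustom = free: { shortName, fullName ? shortName, file }: {
    inherit shortName fullName file free;
  };

  customFree = mkCustom true;
  customUnfree = mkCustom false;

  # Example license for an imaginary unfree driver; the path is made up.
  vendorEula = customUnfree {
    shortName = "vendor-eula";
    fullName = "Example Vendor EULA";
    file = ./licenses/vendor-eula.txt;
  };

  # Selective enabling: opt in to individual unfree licenses instead of
  # flipping one global "allow unfree" switch.
  allowedUnfree = [ "vendor-eula" ];
  isAllowed = license:
    license.free || builtins.elem license.shortName allowedUnfree;
in
{
  meta = {
    description = "Example package";
    license = vendorEula;
  };

  # true here, because "vendor-eula" is listed in allowedUnfree
  allowed = isAllowed vendorEula;
}
```

Because each custom license is a value with a name, a query such as "which packages need this particular unfree license" becomes a comparison rather than a string search.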
FTR here are the remaining unique license strings (I now notice a couple instances I missed when manually going through but the rest still applies):
license = "";
license = "AFL-2.1";
license = "AFL-2.1 or GPL-2";
license = "apache";
license = "artistic";
license = "Artistic-2";
license = "ASF";
license = "as-is";
license = "as-is"; # gentoo is calling it this way..
license = "based on the PHP license - as is";
license = "boost";
license = "Boost 1.0";
license = "boost-license";
license = "bsd";
license = "BSD";
license = "BSD-derived (http://www.repoze.org/LICENSE.txt)";
license = "BSD"; # http://anonscm.debian.org/viewvc/muscleapps/trunk/muscleTool/COPYING?view=markup
license = "BSD License";
license = "BSD-like";
license = "BSD-like (http://repoze.org/license.html)";
license = "bsd"; # multi BSD GPL-2
license = "BSD"; # New BSD license
license = "BSD-Original";
license = "BSD"; # Parallax license, like BSD I think
license = "bsd"; # SGI-B-2.0, which seems BSD-like
license = "BSD"; # Simplified BSD License
license = "BSD-style";
license = "BSD-style, see `license.txt'";
license = "BSD"; # they don't specify which BSD variant
license = "BSL1.0"; # Boost Software License,
license = "CC-PD";
license = "CDDL"; # Common Development and Distribution License
license = "CeCILL-A";
license = "CeCILL B FREE SOFTWARE LICENSE or CeCILL FREE SOFTWARE LICENSE";
license = "CeCILL-B_V1";
license = "CeCILL FREE SOFTWARE LICENSE AGREEMENT";
license = "Click"; # MIT with extra clause, https://github.com/kohler/t1utils/blob/master/LICENSE
license = "CPL-1.0 GPL-2 LGPL-2.1"; # one of those
license = "Eclipse Public License 1.0";
license = "EPL";
license = "EPLv1.0";
license = "FastCGI see LICENSE.TERMS";
license = "free";
license = "free"; # !?
license = "free"; # Combination of LGPL/X11/GPL ?
license = "free"; # https://github.com/clvv/fasd/blob/master/LICENSE
license = "free"; # http://www.info-zip.org/license.html
license = "free"; # LaTeX Project Public License
license = "free"; # many parts under different free licenses
license = "free"; # mix of packages under different licenses
license = "free"; # more free licenses combined
license = "free non-commercial"; #Kermit http://www.columbia.edu/kermit/ckfaq.html#license
license = "free-non-copyleft";
license = "free-noncopyleft";
license = "free, non-copyleft";
license = "free-noncopyleft"; # Apache License fork, actually
license = "free-noncopyleft"; # giftware
license = "free-non-copyleft"; # http://www.libpng.org/pub/png/src/libpng-LICENSE.txt
license = "free-non-copyleft"; # some custom as-is in file headers
license = "free-non-copyleft"; #TODO W3C
license = "free"; /* OSL, see http://www.opensource.org */
license = "free"; # public domain
license = "free, see http://www.mcs.anl.gov/research/projects/mpich2/downloads/index.php?s=license";
license = "free"; # seems BSD-like
license = "Free software ?";
license = "free"; # The libs are of LGPLv2.1+, some other pieces are GPL.
license = "free"; #TODO BSD on Gentoo, looks like MIT
license = "freeware";
license = "freeware"; # as an aggregate - data files have different licenses
license = "GNU LGPL";
license = "GNU Library General Public License version 2, with the special exception on linking described in file LICENSE";
license = "gpl";
license = "GPL";
license = "gpl_3";
license = "GPL,free";
license = "GPL/LGPL";
license = "GPLv2+ and BUILD license";
license = "GPLv2 + exception";
license = "GPLv2+ + exception";
license = "GPL-v2 / LGPL-v2.1";
license = "GPLv2/ZPL";
license = "GPL (various)"; # Mix of public domain, Artistic+GPL, GPL1+, GPL2+, GPL3+, and GPL2-only... TODO
license = "GPL with exceptions or ZPL";
license = "Hewlett-Packard BSD-like license";
license = "http://www.hpl.hp.com/personal/Hans_Boehm/gc/license.txt";
license = "http://www.isc.org/sw/dhcp/dhcp-copyright.php";
license = "http://www.pythonware.com/products/pil/license.htm";
license = "http://www.teamspeak.com/?page=downloads&type=ts3_linux_client_latest";
license = "iasl"; # FIXME: is this a free software license?
license = "IBM Public License";
license = "ISC";
license = "lgpl";
license = "LGPL";
license = "LGPL-2.1 Apache-2.0";
license = "LGPL+link exception";
license = "LGPL+linking exceptions";
license = "liberal"; # a non-copyleft license, see `Copyright' file
license = "LPPL-1.2"; # LaTeX Project Public License
license = "mBSD";
license = "MIT-like";
license = "MIT / LPL";
license = "MonetDB Public License"; # very similar to Mozilla public license (MPL) Version see 1.1 http://monetdb.cwi.nl/Legal/MonetDBLicense-1.1.html
license = "Most Ocamlnet modules are released under the zlib/png license. The HTTP server module Nethttpd is, however, under the GPL.";
license = "MPL";
license = "MPL1.1";
license = "New BSD";
license = "ngrep"; # Some custom BSD-style, see LICENSE.txt
license = "non-commercial";
license = "non-free";
license = "nonfree";
license = "non-free"; # Basically "not for commercial profit"
license = "nonfree"; #MicroChip-PK2
license = "null";
license = "OFL";
license = "OpenSceneGraph Public License - free LGPL-based license";
license = "Open Software License v1.1";
license = "open_source";
license = "open source, see included files";
license = "PayPal SDK License";
license = "permissive";
license = "PHP+";
license = "PHP-3";
license = "PSF License";
license = "PSF or ZPL";
license = "public domain";
license = "publicDomain";
license = "Public Domain";
license = "public domain, Python, 2-Clause BSD, GPL 3 (see COPYING.txt)";
license = "Python 2.1.1";
license = "Python+LLNL";
license = "QPL";
license = "QPL, LGPL2 (library part)";
license = "Qwt License, Version 1.0";
license = "revised BSD";
license = "revised-BSD";
license = "Ruby";
license = "samsung"; # Binary-only
license = "SciLab";
license = "SIL";
license = "SOME OPEN SOURCE LICENSE"; # TODO which exactly is this?
license = "SSLeay";
license = "Standard PIL License";
license = "?"; # the .py file is GPLv2
license = "TrueCrypt License Version 2.6";
license = "ttf2pt1";
license = "unfree-redistributable";
license = "unfree-redistributable"; #Amazon
license = "unfree-redistributable"; # Amazon http://aws.amazon.com/asl/
license = "unfree-redistributable"; # Amazon || (Ruby GPL-2)
license = "unfree-redistributable"; #TODO freedist, libs under BSD-3
license = "Unicode Fonts for Ancient Scripts";
#license = "unknown";
license = "unknown";
license = "UNKNOWN";
license = "unrestricted";
license = "unspecified"; # !
license = "verbatim-redistribution";
license = "Vovida 1.0"; # See any header file.
license = "VXL License";
license = "w3c"; # http://www.w3.org/Consortium/Legal/
license = "WTFPL"; # http://sam.zoy.org/wtfpl/
license = "zlib/libpng";
license = "ZLIB/LIBPNG"; # see README.
license = "ZPL";
… `isFree = false;`. People could provide more info through the usual attributes, notably the link. (If not defined, `isFree = true;` would be assumed, as that is the case for most packages.) Also, Hydra might care about `isRedistributable = false;` rather than about freeness, as important HW-related stuff is often unfree but redistributable.
… `meta.isEncumbered`.
The problem with using an external link is that it's often not possible to do easily: sometimes licenses are only in source file headers (do we link to some random file?) or inside tarballs (do we link to the tarball? Do we update with each version? We don't really want to download the whole thing to read the license). It's also not great for tools &c. It's not a big problem though, it just seems less convenient for the user to have to chase up the license themselves.
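The `isFree` / `isRedistributable` defaults suggested above could be sketched roughly as follows. `withDefaults` is a hypothetical helper, not an existing nixpkgs function; the rule that redistributability defaults to the value of `isFree` is an assumption, and the URLs are placeholders.

```nix
let
  # Hypothetical helper: fill in the defaults described in the thread.
  # A license is assumed free unless it says otherwise; by assumption,
  # isRedistributable falls back to the value of isFree.
  withDefaults = license:
    let
      isFree = license.isFree or true;
    in
    license // {
      inherit isFree;
      isRedistributable = license.isRedistributable or isFree;
    };
in
{
  # One-off unfree license described only by a link (placeholder URL):
  # unfree, but explicitly marked as redistributable.
  firmware = withDefaults {
    url = "http://example.org/firmware-license";
    isFree = false;
    isRedistributable = true;
  };

  # Nothing set: treated as free, and therefore redistributable.
  plain = withDefaults {
    url = "http://example.org/license";
  };
}
```

With this shape, a build farm could filter on `isRedistributable` alone without caring whether a package is free.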
I wonder if http://www.monkey.org/~scottij/oss-license-extract.html (or something similar) can help us clean up the licenses.
@Fuuzetsu: what about linking the debian copyright file? Example: http://metadata.ftp-master.debian.org/changelogs/main/z/zlib/zlib_1.2.8.dfsg-1_copyright
(Only in those cases where upstream provides no good license link. Debian seems to take licensing very seriously.)
@cillianderoiste Interesting although I don't know how well it works in practice, I've never heard about it before.
@vcunat Ah, it does seem like they have a lot of licensing information. I'm unsure about linking to it (they might move it or they might not have the version we do) but it should definitely be useful if we want to look up what license something has (perhaps automatically). The problem of classification remains (free? redistributable?) but that can be done by a human if need be.
In most cases the classification is trivial (no need to set anything extra, as it's free, which implies redistributable). If we link to a specific version, the link will disappear when they update it. I'd link to the generic version, like [zlib]. That might go wrong if they update to a different version than we do and the project relicenses in between, but that's such an improbable thing to happen...
[zlib] http://metadata.ftp-master.debian.org/changelogs/main/z/zlib/testing_copyright
Also worth having a look at the devscripts package from Debian:
- licensecheck: attempt to determine the license of source files
A `licensecheck -r <dir>` should spit out all the licenses of the source files. So it's similar to oss-license-extract mentioned by @cillianderoiste, but I haven't tried/reviewed the latter yet.
Interesting. I wonder how tricky it would be to write our own script to check for licenses and spit out the correct attribute from lib/licenses.nix. IMO checking the headers of each file to audit the license is overkill for the sake of setting the metadata attribute, but perhaps we could just check for standard license files and go with that.
Well, it's important to get the license right whether or not it seems like overkill. Many projects only use headers to indicate copyright, so it'd at least have to be a fallback.
silver_hook mentioned SPDX on IRC, which looks exhaustive: http://spdx.org/licenses/. Perhaps we should adopt these identifiers? They are also used in the appstream tag for metadata_license: http://www.freedesktop.org/software/appstream/docs/chap-Metadata.html
+1 for adopting those identifiers.
Looks reasonable but I notice it separates X11 and MIT whereas we don't so that's something to look out for
Given that there is a standard for short identifiers (http://spdx.org/licenses/), I got an idea.
I think the point of using lib.licenses.* attributes (over free-form text) was 1) to prevent typos and 2) to attach metadata to licenses. With a standard identifier set, there is no need for attaching metadata in nixpkgs; that metadata can be managed externally. The remaining issue is preventing license typos. This can be done by adding `validLicenses = [ <all_spdx.org_short_identifiers> ]` to lib/licenses.nix and checking that meta.license appears in validLicenses.
Thoughts?
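A rough sketch of that check, for illustration only. `validLicenses` here is a short hand-picked sample rather than the full SPDX list, and `checkLicense` is a hypothetical helper name, not something that exists in lib/licenses.nix.

```nix
let
  # Sample only; the proposal is to list every SPDX short identifier here.
  validLicenses = [ "MIT" "ISC" "Zlib" "BSD-2-Clause" "BSD-3-Clause" ];

  # Hypothetical helper: return the string unchanged when it is a known
  # identifier, otherwise abort evaluation with an error message.
  checkLicense = license:
    if builtins.elem license validLicenses
    then license
    else throw "unknown license identifier: ${license}";
in
{
  meta.license = checkLicense "BSD-3-Clause";

  # A generic or mistyped string would fail at evaluation time:
  # meta.license = checkLicense "BSD";
}
```

This catches typos, but a plain string cannot carry extra fields, so one-off licenses would still need some other mechanism.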
Well, I don't see any advantage of dropping the current style in favor of the plain strings (again). Also, allowing attrsets is more flexible w.r.t. non-standard licenses.
Plain strings have the disadvantage of everyone writing them down in their own way which is why we now have about 15 ways in which BSD is specified. This makes it very hard to ask for what packages are under BSD.
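To make the querying point concrete, here is a small sketch with made-up packages. The `licenses` set is a local stand-in for lib.licenses, and its attribute and field names are assumptions for this example.

```nix
let
  # Local stand-ins for lib.licenses entries.
  licenses = {
    bsd2 = { shortName = "BSD-2-Clause"; };
    bsd3 = { shortName = "BSD-3-Clause"; };
    mit = { shortName = "MIT"; };
  };

  # Made-up packages that reference licenses by attribute, not by string.
  pkgs = {
    foo = { meta.license = licenses.bsd3; };
    bar = { meta.license = licenses.mit; };
    baz = { meta.license = licenses.bsd2; };
  };

  # One definition of "is under BSD", instead of a dozen string spellings.
  isBsd = license: builtins.elem license [ licenses.bsd2 licenses.bsd3 ];
in
# Names of all packages under a BSD license; evaluates to [ "baz" "foo" ]
builtins.filter
  (name: isBsd pkgs.${name}.meta.license)
  (builtins.attrNames pkgs)
```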
I’d very much suggest sticking to SPDX identifiers (if we can’t supply the full RDF), since these are basically the de facto standard.
I don’t know how useful it is to this specific issue, but here’s a list of FS license detection tools:
A spdx naming PR: #3408.
I think this has been resolved.
Well, there is still more work to do, but the ‘mass-replace’ part has been finished (for the most part).
If we currently grep through the package tree, many of the license fields are using strings. Perhaps it'd be worthwhile to run sed over the tree to substitute the obvious fields with `licenses.…` equivalents.
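As an illustration of what counts as an "obvious" field, the substitution can be modelled as a lookup table. This is only a sketch of the mapping, not the sed invocation itself; the `licenses` set is a local stand-in for lib.licenses, and `obvious` and `convert` are hypothetical names.

```nix
let
  # Local stand-ins for lib.licenses entries; names are assumptions.
  licenses = {
    boost = { shortName = "BSL-1.0"; };
    isc = { shortName = "ISC"; };
    wtfpl = { shortName = "WTFPL"; };
  };

  # Only unambiguous spellings are listed. Strings such as "BSD" or "free"
  # are deliberately absent because they need a human to look them up.
  obvious = {
    "boost" = licenses.boost;
    "Boost 1.0" = licenses.boost;
    "boost-license" = licenses.boost;
    "ISC" = licenses.isc;
    "WTFPL" = licenses.wtfpl;
  };

  # Replace a string license with its attribute when the mapping knows it;
  # leave every other meta attrset unchanged.
  convert = meta:
    if builtins.isString (meta.license or null)
       && builtins.hasAttr meta.license obvious
    then meta // { license = builtins.getAttr meta.license obvious; }
    else meta;
in
{
  converted = convert { license = "Boost 1.0"; }; # license becomes licenses.boost
  untouched = convert { license = "BSD"; };       # stays the string "BSD"
}
```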