purcell / package-lint

A linting library for elisp package metadata
GNU General Public License v3.0

Lint License and SPDX-License-Identifier headers #209

Open lassik opened 3 years ago

lassik commented 3 years ago

Implements issue #83

Here's a first cut of the linter. Rules very much subject to debate:

This patch doesn't detect the presence or absence of the GPL boilerplate in the headers. How important is that?

lassik commented 3 years ago

cc @tarsius who is very experienced with elisp licensing

tarsius commented 3 years ago

> This patch doesn't detect the presence or absence of the GPL boilerplate in the headers. How important is that?

Extremely important.

  1. The FSF instructs users of the license to use these permission statements to make their choice known.
  2. The SPDX folks recommend that the SPDX ID be added in addition to the permission statement:

> Standard license headers
>
> When a license defines a recommended notice to attach to files under that license (sometimes called a “standard header”), the SPDX project recommends that the standard header be included in the files, in addition to an SPDX ID.
>
> Additionally, when a file already contains a standard header or other license notice, the SPDX project recommends that those existing notices should not be removed. The SPDX ID is recommended to be used to supplement, not replace, existing notices in files.
>
> Like copyright notices, existing license texts and notices should be retained, not replaced ‐ especially a third party’s license notices.
>
> The SPDX project’s statement regarding standard license headers and SPDX short-form identifiers can be found at Appendix V of the SPDX specification, version 2.1.

https://spdx.dev/ids

tarsius commented 3 years ago

I've also worked on this a bit over the weekend. The Emacsmirror/elx.el now uses SPDX IDs if possible.

I copied the recommended license list from the issue, but it's probably not quite broad enough.

https://emacsmirror.net/stats/licenses.html#orgaf6e078 is another list (it also includes non-SPDX identifiers, and I just noticed that ZLIB has to be normalized to Zlib; there may be other issues like that).
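To illustrate the ZLIB vs. Zlib issue, here is a rough Python sketch of case normalization against a list of known identifiers. The `KNOWN_SPDX_IDS` list and the `normalize_spdx_id` name are made up for this example; neither elx nor this PR is known to work this way.

```python
# Hypothetical, deliberately tiny subset of SPDX identifiers.
KNOWN_SPDX_IDS = [
    "GPL-3.0-or-later",
    "GPL-3.0-only",
    "LGPL-3.0-or-later",
    "MIT",
    "Zlib",
    "ISC",
]

# Map the lowercased spelling to the canonical spelling.
_CANONICAL = {spdx_id.lower(): spdx_id for spdx_id in KNOWN_SPDX_IDS}


def normalize_spdx_id(raw):
    """Return the canonical spelling of RAW, or None if it is not in the list."""
    return _CANONICAL.get(raw.strip().lower())
```

With this, `normalize_spdx_id("ZLIB")` gives `"Zlib"`, and anything outside the list comes back as `None` so it can be reported separately.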

purcell commented 3 years ago

Nice, @lassik! Agree with @tarsius that we should ideally recognise the recommended GPL boilerplate typically inserted by auto-insert: if it's there, but there are no formal licence headers, then we should still accept this. (This is largely for backwards compatibility.)

Beyond that,

> Do we have the authority to require a SPDX-License-Identifier: header? I'd say it's okay because it's just a warning.

Sure, either that or Licen[cs]e:. Again I'd ideally accept the latter for general backwards compatibility in common cases but suggest the SPDX- form when necessary. And parsing any more than a handful of non-SPDX strings from License: wouldn't be required. Just thinking that License: GPLv3 is probably good enough, for example.
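A minimal sketch of that lookup, in Python for brevity rather than elisp: prefer `SPDX-License-Identifier:` and fall back to `Licen[cs]e:`. The regexp and the `find_license_header` name are assumptions for illustration, not what the PR actually does.

```python
import re

# Matches ";; License: X", ";; Licence: X" and ";; SPDX-License-Identifier: X"
# on a single comment line (assumed header layout).
HEADER_RE = re.compile(
    r"^;+[ \t]*(SPDX-License-Identifier|Licen[cs]e):[ \t]*(\S.*?)[ \t]*$",
    re.MULTILINE,
)


def find_license_header(source):
    """Return ("spdx", value) or ("legacy", value) for SOURCE, or None.

    An SPDX header wins over a License:/Licence: header, wherever it appears.
    """
    legacy = None
    for match in HEADER_RE.finditer(source):
        name, value = match.group(1), match.group(2)
        if name == "SPDX-License-Identifier":
            return ("spdx", value)
        if legacy is None:
            legacy = value
    if legacy is not None:
        return ("legacy", legacy)
    return None
```

A caller could accept a `"legacy"` result as-is for a handful of strings such as `GPLv3`, and only suggest the SPDX form when the value is not recognised.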

lassik commented 3 years ago

OK, GPL and LGPL boilerplate detection is now added. There are probably subtle bugs in it, but it seems to work.
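For readers who want the rough idea, boilerplate detection can be sketched like this (Python, not the PR's elisp): join the comment lines, collapse whitespace, and look for the key phrase from the standard permission notice. The function names are hypothetical, and the real code may match the notice quite differently.

```python
import re


def comment_text(source):
    """Join all ';' comment lines into one lowercase, single-spaced string."""
    parts = []
    for line in source.splitlines():
        stripped = line.lstrip()
        if stripped.startswith(";"):
            parts.append(stripped.lstrip(";"))
    return re.sub(r"\s+", " ", " ".join(parts)).strip().lower()


def detect_boilerplate(source):
    """Return "LGPL", "GPL", or None depending on which notice is found."""
    text = comment_text(source)
    # Check LGPL first: its phrase differs from the GPL one by "lesser".
    if "under the terms of the gnu lesser general public license" in text:
        return "LGPL"
    if "under the terms of the gnu general public license" in text:
        return "GPL"
    return None
```

Joining the lines first matters because the notice is usually wrapped across several `;;` lines, so a per-line match would miss it.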

lassik commented 3 years ago

The current code needs a SPDX-License-Identifier: to figure out which version of GPL or LGPL is being used.

Things would be easier for us on all fronts if we could require that header to always be present. Is there any harm in adding it, apart from the fact that GNU doesn't require it? The number of people using weird licenses not in SPDX is probably tiny.

lassik commented 3 years ago

The current code in this PR doesn't check for the absence of "at your option any later version" when using GPL-x.x-only.

But the -only licenses shouldn't be recommended, as they are incompatible with Emacs versions released under a later GPL.
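The missing check could look roughly like the following sketch (Python, with a hypothetical `later_version_mismatch` helper): compare the `-only` / `-or-later` suffix of the SPDX ID with whether the notice contains "any later version". This assumes the notice text has already been located.

```python
import re


def later_version_mismatch(spdx_id, notice_text):
    """Return a warning string if the SPDX suffix and the notice disagree.

    Returns None when they agree or when the ID has neither suffix.
    """
    # Collapse whitespace and ';' comment prefixes so a wrapped
    # "any later\n;; version" still matches.
    normalized = re.sub(r"[;\s]+", " ", notice_text).lower()
    allows_later = "any later version" in normalized
    if spdx_id.endswith("-only") and allows_later:
        return "SPDX ID says -only but the notice allows any later version"
    if spdx_id.endswith("-or-later") and not allows_later:
        return "SPDX ID says -or-later but the notice does not allow later versions"
    return None
```

Either mismatch would presumably be reported as a warning, in line with the rest of the linter.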

lassik commented 3 years ago

@tarsius Thanks for your work on https://emacsmirror.net/stats/licenses.html.

Isn't GPL-3.0-only going to be problematic if future GPL versions are made? There are surprisingly many packages using it: 183. How do you detect GPL-x.x-only vs GPL-x.x-or-later?

tarsius commented 3 years ago

> Isn't GPL-3.0-only going to be problematic if future GPL versions are made? There are surprisingly many packages using it: 183. How do you detect GPL-x.x-only vs GPL-x.x-or-later?

Most of those are probably due to elx finding no licensing information beyond the license file.

I suspect that most of the authors who only use a license file do so because adding the permission statement is "unnecessary, noisy and ugly". Many would probably not object to GPL-3.0-or-later if they were made aware of the issue. They might still have reservations about the permission statements; so telling them about the less noisy SPDX-License-Identifier: GPL-3.0-or-later might help a lot.

lassik commented 3 years ago

Could we convince GNU to bless SPDX-License-Identifier as the official way to specify the license for elisp packages?

tarsius commented 3 years ago

I don't know about GNU and the FSF, but maybe emacs-devel. The former two surely are already aware of SPDX and so far they have not embraced it. But maybe the stars would be aligned if you brought it up now. Such things usually take a few attempts. (I would focus on just emacs-devel though.)