pilosus / pip-license-checker

Check license types for third-party dependencies: permissive, copyleft, proprietary, etc.
https://blog.pilosus.org/posts/2021/09/07/pip-license-checker/

Handling of libraries that have multiple licenses #141

Open stealthrabbi opened 1 year ago

stealthrabbi commented 1 year ago

Some libraries declare multiple licenses; pyzmq, for example, lists more than one. I am not entirely sure what this means, but perhaps it means that as a user of the library I can choose whichever license I want to apply? At any rate, pip-license-checker seems to look at all of them and report the least permissive one. Is it possible to add a control flag to "choose permissive" for libraries that have multiple licenses, so that in this case it would choose the BSD license and classify the package as permissive?

pyzmq:25.0.0 BSD License, GNU Library or Lesser General Public License (LGPL) WeakCopyleft

pilosus commented 1 year ago

Hey @stealthrabbi

Mandatory "I am not a lawyer" disclaimer first: licensing, and multi-licensing in particular, is complex. When you see multiple licenses for a package, it can mean several different things.

You cannot really tell those cases apart from the license information in the Python package metadata. There is a really nice initiative, PEP 639, about the proper use of so-called SPDX expressions (unique license ids plus logical operators to combine them) in Python package metadata, which will help eliminate the ambiguity of (multi-)licensing for a package. It is a great read, and I encourage you to go through it if you are interested in the topic.
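To illustrate why SPDX expressions remove the ambiguity, here is a minimal Python sketch that splits a simple expression into its alternatives. In SPDX, `OR` means the licensee may choose one of the options, while `AND` means all of the listed licenses apply. The function name `parse_flat_spdx` is hypothetical, the example expressions are made up for illustration, and the sketch deliberately ignores parentheses; it is not a full SPDX parser.

```python
def parse_flat_spdx(expression):
    """Split a parenthesis-free SPDX expression into OR-alternatives.

    Returns a list of alternatives; each alternative is the list of terms
    that must all be satisfied together (joined by AND in the expression).
    AND binds tighter than OR, so splitting on OR first is safe here.
    """
    if "(" in expression or ")" in expression:
        raise ValueError("nested expressions are out of scope for this sketch")
    normalized = " ".join(expression.split())  # collapse repeated whitespace
    return [
        [term.strip() for term in alternative.split(" AND ")]
        for alternative in normalized.split(" OR ")
    ]


# Two alternatives: the user may pick either license.
choice = parse_flat_spdx("BSD-3-Clause OR LGPL-3.0-only")

# One alternative with two terms: both licenses apply at once.
combined = parse_flat_spdx("MIT AND Apache-2.0")
```

With metadata like this, a tool could tell "pick one" apart from "comply with all", which is exactly what a plain comma-separated list of license names cannot express.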

But until SPDX expressions are adopted, there's no way to understand the licensing other than investigating the package in question. Usually, you need to go to the package repo, find a file like COPYING, LICENSE, or LICENCE, and read it :-)

If a package author is merciful enough, you will see a human-readable and understandable summary, like here. If not, you may need to dig deeper, like here.

Since the checker cannot really do this for you (someone needs to submit a PR with ChatGPT to solve that haha), it tries to stay safe by picking the most copylefty license in multi-licensing cases.
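The "stay safe" rule can be sketched in a few lines of Python. Everything here is an assumption for illustration: the names `RANK`, `CATEGORY`, and `most_restrictive` are hypothetical, and the ranking and lookup table are not the checker's actual data or implementation.

```python
# Hypothetical restrictiveness ranking (higher = more restrictive).
# The real checker's categories and ordering may differ.
RANK = {
    "Permissive": 0,
    "WeakCopyleft": 1,
    "StrongCopyleft": 2,
    "NetworkCopyleft": 3,
}

# Hypothetical lookup from a license name to its category.
CATEGORY = {
    "BSD License": "Permissive",
    "MIT License": "Permissive",
    "GNU Library or Lesser General Public License (LGPL)": "WeakCopyleft",
    "GNU General Public License (GPL)": "StrongCopyleft",
}


def most_restrictive(license_names):
    """Return the most restrictive category among the listed licenses.

    Raises KeyError for a license name missing from the lookup table.
    """
    categories = [CATEGORY[name] for name in license_names]
    return max(categories, key=RANK.__getitem__)


# The two license names reported for pyzmq in the question above.
pyzmq_category = most_restrictive(
    ["BSD License", "GNU Library or Lesser General Public License (LGPL)"]
)
```

Under these assumptions the pyzmq example resolves to `WeakCopyleft`, matching the output quoted in the question. Taking the maximum is the conservative choice: without knowing whether the licenses are alternatives or all apply, it never reports a package as more permissive than it might be.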

My personal flow for cases like that:

  1. Multi-licensing is detected, with a license category I cannot use in my project
  2. I go and investigate the effective T&Cs of the project
  3. If after that I can discount the risks, I add the package to the exceptions with the option --exclude '^(package_a|package_b|...).*'
  4. I also keep a record of the exclusions from step 3 somewhere (like a spreadsheet with a timestamp for the next "reaccreditation") so I can recheck the package later, for example in case its license changes
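The exclusion and record-keeping steps above could be supported by a small helper script. This is only a sketch under stated assumptions: the helper names (`build_exclude_pattern`, `is_excluded`, `exclusion_log`) are hypothetical, the record columns are made up, and Python's `re` module is used to approximate how the pattern is matched, which may differ from what the checker does internally.

```python
import csv
import io
import re


def build_exclude_pattern(package_names):
    """Build a regex of the same shape as the --exclude value in step 3."""
    alternatives = "|".join(re.escape(name) for name in package_names)
    return f"^({alternatives}).*"


def is_excluded(pattern, package_name):
    """Approximate the exclusion check with Python's regex engine."""
    return re.match(pattern, package_name) is not None


def exclusion_log(records):
    """Render reviewed exclusions as CSV text for later re-review.

    Each record is a (package, reviewed_on, reason) tuple; the columns
    are an assumption, not a format required by any tool.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["package", "reviewed_on", "reason"])
    writer.writerows(records)
    return buffer.getvalue()


pattern = build_exclude_pattern(["pyzmq"])
log_text = exclusion_log([("pyzmq", "2024-01-15", "illustrative entry")])
```

The resulting pattern can be passed as the `--exclude` value. Note that a pattern of this shape matches by prefix, so it would also exclude any other package whose name starts with `pyzmq`; tighten it if that is not intended.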

Hope that helps