Open jayvdb opened 5 years ago
A useful primer is https://velovix.github.io/post/lgpl-gpl-license-compliance-with-pyinstaller/
A fairly significant problem is likely to exist in most projects, because chardet
is LGPL and is among the top 20 most-downloaded Python packages.
https://github.com/chardet/chardet/issues/162 is an open issue about which LGPL is in play there. https://github.com/chardet/chardet/issues/36 and https://github.com/chardet/chardet/issues/27 also discuss it.
https://github.com/psf/requests/issues/3389 has many people explaining that requests (Apache-2.0) should not be bundling chardet inside it; it was resolved because requests dropped its bundling. The comments about using it with PyInstaller are still relevant and unresolved. https://github.com/iterative/dvc/issues/1115 is another discussion about chardet, but again the assumption that packages are dynamically linked means LGPL's problems are not treated as a priority.
The PyInstaller maintainers also seem to be unaware of this issue: https://github.com/pyinstaller/pyinstaller/issues/664
Being able to have LGPL packages loaded dynamically is probably going to be necessary.
In my site-packages:
> pip-licenses | grep LGPL
Brlapi 0.7.0 LGPL 2.1
CairoSVG 2.4.0 LGPLv3+
Glances 3.1.0 LGPLv3
GridDataFormats 0.4.0 LGPLv3
PyGObject 3.32.1 GNU LGPL
PyWebDAV3 0.9.12 LGPL v2
amqplib 1.0.2 LGPL
ansi2html 1.5.2 LGPLv3+
argh 0.26.2 GNU Lesser General Public License (LGPL), Version 3
astroid 2.2.5 LGPL
chardet 3.0.4 LGPL
cluster 1.4.1 LGPL
colormap 1.0.1 LGPL
css-parser 1.0.4 LGPL 2.1 or later
cssutils 1.0.2 LGPL 2.1 or later, see also http://cthedot.de/cssutils/
datrie 0.7.1 LGPLv2+
demjson 2.2.4 GNU LGPL 3.0
discid 1.2.0 LGPLv3+
dominate 2.3.5 LGPLv3
ephem 3.7.6.1 LGPL
fann2 1.1.2 GNU LESSER GENERAL PUBLIC LICENSE (LGPL)
fedmsg 1.1.1 LGPLv2+
flake8-import-order 0.18.1 LGPLv3
gprof2dot 2017.9.19 LGPL
guessit 3.0.4 LGPLv3
img2pdf 0.3.3 LGPL
junitxml 0.7 LGPL-3
jwcrypto 0.6.0 LGPLv3+
keepalive 0.5 GNU LGPL
kitchen 1.2.5 LGPLv2+
lazr.uri 1.0.3 LGPL v3
ldap3 2.6 LGPL v3
liblarch 3.0.0 LGPLv3
llfuse 1.3.6 LGPL
logilab-astng 0.24.3 LGPL
logilab-common 1.4.1 LGPL
moretools 0.1.9 LGPLv3
nose 1.3.7 GNU LGPL
nose-cover3 0.1.0 GNU LGPL
nose-exclude 0.5.0 GNU LGPL
num2words 0.5.10 LGPL
paramiko 2.4.2 LGPL
piston-mini-client 0.7.5 LGPLv3
psycopg2 2.7.7 LGPL with exceptions or ZPL
pycha 0.7.0 LGPL 3
pycountry 18.12.8 LGPL 2.1
pycurl 7.43.0.2 LGPL/MIT
pydbus 0.6.0 LGPLv2+
pyenchant 2.0.0 LGPL
pygal 2.4.0 GNU LGPL v3+
pygame 1.9.4 LGPL
pygpgme 0.3 LGPL
pylama 7.7.1 GNU LGPL
pylzma 0.0.0.dev0 LGPL
pymssql 2.1.4 LGPL
pyo 0.9.1 LGPLv3+
pysndfile 1.3.2 LGPL
python-dbusmock 0.18.1 LGPL 3+
python-digitalocean 1.13.2 LGPL v3
python-gitlab 1.8.0 LGPLv3
python-mpd2 1.0.0 GNU Lesser General Public License v3 (LGPLv3)
python-nss 1.0.1 MPLv2.0 or GPLv2+ or LGPLv2+
python-stdnum 1.11 LGPL
python-vlc 3.0.6109 LGPL-2.1+
python-xlib 0.25 LGPLv2+
pytlv 0.71 LGPLv2
pyudev 0.21.0 LGPL 2.1+
pyzmq 18.0.1 LGPL+BSD
scp 0.13.2 LGPL
ssdeep 3.3 LGPLv3+
stem 1.7.1 LGPLv3
systemd-python 234 LGPLv2+
tld 0.9.3 MPL 1.1/GPL 2.0/LGPL 2.1
urlgrabber 4.0.0 LGPLv2+
urwid 2.0.1 LGPL
validate-email 1.3 LGPL
virtkey 0.63.0 LGPL
wadllib 1.3.3 LGPL v3
websockify 0.8.0 LGPLv3
xdot 1.0 LGPL
zeroconf 0.23.0 LGPL
zetup 0.2.48 LGPLv3
> pip-licenses | grep GPL | grep -v LGPL
AnyQt 0.0.10 GPLv3
GooCalendar 0.5 GPL-2
IMDbPY 6.6 GPL
PyBluez 0.22 GPL
PyInstaller 3.5 GPL license with a special exception which allows to use PyInstaller to build and distribute non-free programs (including commercial ones)
PyPrint 0.2.7.dev0 GPL v3
PyRIC 0.1.6.3 GPLv3
PyX 0.14.1 GPL
Pymacs 0.25 GPLv2
ReText 7.0.4 GPL 2+
Unidecode 1.1.0 GPL
ViTables 3.0.0 GPLv3, see the LICENSE.txt file for detailed info
apparmor 2.13.2 GPL-2
arabic-reshaper 2.0.14 GPL
audiolazy 0.6 GPLv3
bucket 0.20181030 GPL v2 or later
buku 4.2.2 GPLv3
caldav 0.6.1 GPL
ciscoconfparse 1.3.32 GPLv3
coala 0.12.0.dev99999999999999 AGPL-3.0
coala-bears 0.12.0.dev99999999999999 AGPL-3.0
commodity 0.20190214 GPLv3
cptrace 0.6.1 GNU GPL v2
cracklib 2.9.3 GPLv2+
dependency-management 0.5.0.dev0 AGPL-3.0
docutils 0.14 public domain, Python, 2-Clause BSD, GPL 3 (see COPYING.txt)
dulwich 0.19.11 Apachev2 or later or GPLv2
easypysmb 1.4.3 GPL3
esptool 2.6 GPLv2+
ethtool 0.14 GPL-2.0
exrex 0.10.5 AGPLv3+
fastimport 0.9.8 GNU GPL v2 or later
hitchbuild 0.5.1 AGPL
hitchkey 0.5.0 AGPL
hitchrun 0.3.2 AGPL
html2text 2018.1.9 GNU GPL 3
iwlib 1.6.2 GPLv2
meld 3.20.1 GPLv2+
mutagen 1.42.0 GPL-2.0-or-later
mysql-connector-python 2.1.7 GNU GPLv2 (with FOSS License Exception)
mysqlclient 1.4.2 GPL
nfoview 1.26 GPL
ntfy 2.7.0 GPLv3
odorik 0.5 GPLv3+
onionshare 2.0 GPL v3
openqa-client 1.3.0 GPLv2+
osc 0.165.4 GPL
passivetotal 1.0.30 GPLv2
pefan 0.1.2a0 GPLv3
pelican 4.0.1 AGPLv3
prettystack 0.3.0 AGPL
py3exiv2 0.6.1 GPL-3
pycdio 2.0.0 GPL
pycups 1.9.74 GPLv2+
pyfeyn 1.0.0 GPL
pygit2 0.28.1 GPLv2 with linking exception
pykeepass 3.0.3 GPL3
pylint 2.3.1 GPL
pymad 0.10 GPL
pymilter 1.0.4 GPL
pyocr 0.7 GPLv3+
pyparted 3.11.1 GPLv2+
pyprel 2018.9.14.1501 GPLv3
pyroute2 0.5.6 dual license GPLv2+ and Apache v2
pysmbc 1.0.15.8 GPLv2+
pysrt 1.1.1 GPLv3
pytaglib 1.4.5 GPLv3+
pytesseract 0.2.6 GPLv3
python-Levenshtein 0.12.0 GPL
python-axolotl 0.1.42 GPLv3 License
python-axolotl-curve25519 0.4.1.post2 GPLv3 License
python-bugzilla 2.2.0 GPLv2
python-datamatrix 0.9.14 GNU GPL Version 3
python-distutils-extra 2.38 GNU GPL
python-djvulibre 0.8.4 GNU GPL 2
python-espeak 0.5 GPL
python-linux-procfs 0.6 GPLv2
python-lzo 1.12 GNU General Public License (GPL)
python-mpv 0.3.9 AGPLv3+
python-ptrace 0.9.3 GNU GPL v2
qet-tb-generator 1.0.16 GPL
qutebrowser 1.6.2 GPL
relatorio 0.8.1 GPL License
rencode 1.0.6 GPLv3
rfc3987 1.3.8 GNU GPLv3+
rope 0.14.0 GNU GPL
rpy2 2.9.5 GPLv2+
rt 1.0.11 GNU General Public License (GPL)
scikit-sparse 0.4.4 GPL
scspell3k 2.2 GPL 2
shijian 2018.6.2.1644 GPLv3
sortinghat 0.4.3 GPLv3
subgrab 0.17 GPL
technicolor 2017.1.16.1544 GPLv3
transifex-client 0.12.4 GPLv2
txZMQ 0.8.0 GPLv2
urlscan 0.9.2 GPLv2
veusz 3.0.1 GPL
yamllint 1.15.0 GPLv3
yarb 1.0.0 GPL2.0
yip 1.2.6 GPLv3
psycopg2 and paramiko are also very prominent packages that are LGPL.
Unidecode looks to be the GPL package most likely to affect many projects.
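One caveat about the two greps above: `grep GPL | grep -v LGPL` also lists the AGPL packages (coala, pelican, ...) and drops entries that mention both GPL and LGPL (python-nss and tld only show up in the LGPL list). A minimal sketch of classifying the free-text license strings into families instead; the function name `license_families` and the regexes are my own assumptions, not part of pip-licenses:

```python
import re


def license_families(license_text):
    """Return the GPL-family names mentioned in a free-text license string.

    Hypothetical helper: it only pattern-matches the text, it does not
    interpret "or"/"and" combinations or license exceptions.
    """
    text = license_text.upper()
    families = set()
    if re.search(r"AGPL|AFFERO", text):
        families.add("AGPL")
    if re.search(r"LGPL|LESSER GENERAL PUBLIC", text):
        families.add("LGPL")
    # Plain GPL: "GPL" not preceded by A or L, or the spelled-out name
    # that is not part of "Lesser"/"Affero General Public".
    if re.search(r"(?<![AL])GPL", text) or re.search(
        r"(?<!LESSER )(?<!AFFERO )GENERAL PUBLIC", text
    ):
        families.add("GPL")
    return families
```

With this, a multi-licensed string such as `MPLv2.0 or GPLv2+ or LGPLv2+` is reported under both GPL and LGPL, so it is not silently dropped from either list.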
Thanks for all the detailed research! I'll need to look into matters further.
I generally agree that having license linting for Python packages (to match what we have for the C extensions and libraries) would be a good feature to have.
Maybe something like the pylicense tool can be helpful to gather the licenses of dependencies in the environment.
Related: #268
`chardet` is a big issue since it's used as a dependency of `requests`, among other things. `charset_normalizer` could be an alternative if packages migrated to it.
The inclusion of all licenses for the Python runtime components is very nice.
However, the licenses of the Python packages built into the binary are just as important, possibly even more so: the licenses were often chosen assuming the package would not be linked/embedded into a larger work, and there is less appreciation of those aspects of license choice in the Python world because the source is normally the executable/redistributable.
Where PKG-INFO or EGG-INFO exists, which should be most of the time, there is a free-text License field which looks like `License: MIT`. Sometimes it contains SPDX-compatible names; other times it is ambiguous, like `License: BSD`. The Trove classifiers are also ambiguous, e.g. `License :: OSI Approved :: BSD License`. And I have seen quite a lot of cases where a package has discrepancies between the Trove classifiers, `License:` and `LICENSE.txt`.
https://pypi.org/project/pip-licenses/ looks like it is quite useful to sort out that mess.
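To see how the License field and the Trove classifiers compare for whatever is installed, something along these lines could work. It is only a sketch: `AMBIGUOUS`, `is_ambiguous`, `license_classifiers` and `report` are names I made up, and the set of "ambiguous" values is an assumption rather than an established list. Only `importlib.metadata` from the standard library (Python 3.8+) is used.

```python
from importlib import metadata

# Assumed examples of License values that do not identify one exact license.
AMBIGUOUS = {"", "UNKNOWN", "BSD", "GPL", "LGPL", "APACHE"}


def is_ambiguous(license_field):
    """True if the free-text License field does not name an exact license."""
    return license_field.strip().upper() in AMBIGUOUS


def license_classifiers(classifiers):
    """Return the last segment of every 'License ::' Trove classifier."""
    return [
        c.split(" :: ")[-1] for c in classifiers if c.startswith("License ::")
    ]


def report():
    """Print name, License field and license classifiers for each installed
    distribution, marking ambiguous License fields with '?'."""
    for dist in metadata.distributions():
        meta = dist.metadata
        field = meta.get("License") or ""
        trove = license_classifiers(meta.get_all("Classifier") or [])
        flag = "?" if is_ambiguous(field) else " "
        print(flag, meta.get("Name"), repr(field), trove)
```

Calling `report()` in the target environment gives a quick list of packages whose metadata needs a manual look; it does not read `LICENSE.txt` itself.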
Wheels now usually contain a LICENSE file. It can be declared explicitly in setup.cfg; however, modern setuptools now aggressively finds and includes one if it exists when building a wheel (and possibly also when building an sdist).
IMO a good solution would be to try to get the license text file out of the wheel or sdist, and error/warn bitterly if one was not locatable, rather than playing games with the PKG-INFO/EGG-INFO text field, which is still insufficient if the license is one that requires redistribution of the license text including custom notices, such as Apache-2.0.
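A rough sketch of the "get the license text out of the wheel, or complain" idea, assuming license files sit somewhere under the wheel's `*.dist-info/` directory and have a conventional name (LICENSE, LICENCE, COPYING, NOTICE). The function names and the filename pattern are my assumptions; sdists would need a tarfile equivalent.

```python
import re
import zipfile

# Assumed naming convention for license-like files.
LICENSE_NAME = re.compile(r"^(LICEN[CS]E|COPYING|NOTICE)", re.IGNORECASE)


def license_files_in_wheel(wheel):
    """Return archive paths of license-like files under *.dist-info/.

    `wheel` may be a path or a binary file object, as accepted by ZipFile.
    """
    with zipfile.ZipFile(wheel) as archive:
        return [
            name
            for name in archive.namelist()
            if ".dist-info/" in name
            and LICENSE_NAME.match(name.rsplit("/", 1)[-1])
        ]


def require_license_text(wheel):
    """Return the license file paths, or raise if the wheel ships none."""
    found = license_files_in_wheel(wheel)
    if not found:
        raise LookupError("no license text file found in wheel")
    return found
```

Whether a missing file should be a hard error or a loud warning is a policy choice; the sketch raises so that the caller has to decide.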
IMO it is easy to get Python projects to add a LICENSE file, even when the project is otherwise moribund. Getting a new package release with the LICENSE might be more difficult, but often in that case a `git+https://` or `master.zip` requirement solves the problem. (I assume those can be used with PyOxidizer.)
The filtering on licenses should also apply to Python packages, and probably more errors/bitter warnings are needed there. It would also be wise to have a default filter in place for GPL, so that PyOxidizer errors unless users explicitly allow GPL packages in the project toml file.
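To make the "deny GPL by default" behaviour concrete, here is a minimal sketch. Everything in it is hypothetical: `check_licenses`, `allow_gpl` and `allowed_packages` are illustrative names, not existing PyOxidizer options, and the match is a deliberately conservative text search that also flags AGPL and multi-licensed strings mentioning GPL.

```python
import re

# Matches GPL and AGPL, but not LGPL (assumed policy for this sketch).
GPL_PATTERN = re.compile(r"(?<!L)GPL", re.IGNORECASE)


class DisallowedLicense(Exception):
    """Raised when a GPL-licensed package is present but not allowed."""


def check_licenses(package_licenses, allow_gpl=False, allowed_packages=()):
    """Return the sorted names of GPL-licensed packages not individually allowed.

    `package_licenses` maps package name to its free-text license string.
    Raises DisallowedLicense if any such package exists and `allow_gpl`
    is False.
    """
    offenders = sorted(
        name
        for name, license_text in package_licenses.items()
        if GPL_PATTERN.search(license_text) and name not in allowed_packages
    )
    if offenders and not allow_gpl:
        raise DisallowedLicense(
            "GPL-licensed packages not explicitly allowed: "
            + ", ".join(offenders)
        )
    return offenders
```

For example, `check_licenses({"Unidecode": "GPL", "chardet": "LGPL"})` raises, while passing `allowed_packages={"Unidecode"}` lets the build continue.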