MatthiasValvekens / pyHanko

pyHanko: sign and stamp PDF files
MIT License
483 stars 71 forks source link

LICENSE.PyPDF2 missing from wheel distributions #335

Closed stefan6419846 closed 9 months ago

stefan6419846 commented 10 months ago

Describe the bug

The wheel distributions do not contain the https://github.com/MatthiasValvekens/pyHanko/blob/master/pyhanko/pdf_utils/LICENSE.PyPDF2 file.

To Reproduce

Download the wheel file from https://pypi.org/project/pyHanko/#files

Expected behavior

The corresponding license file is part of the wheel distribution due to employing a stricter license than pyHanko itself (MIT vs. BSD-3-Clause).

Additional context

By the way, PyPDF2 development has been discontinued in favor of pypdf which received lots of fixes: https://pypi.org/project/PyPDF2/ I am not sure how pyHanko might benefit from this, but I just wanted to point this out.

MatthiasValvekens commented 10 months ago

Hi,

This wasn't an intentional omission, but I'd like to avoid polluting the "business" section of the wheel with data files that have no runtime relevance. Putting them in the metadata section (i.e. under pyHanko*.dist-info) would be OK as far as I'm concerned, though, as that's where all the metadata files (including pyHanko's own licence file) live. Having said that, I'm not sure if there's an easy way to hook into that using setuptools config, and whether you're even supposed to put extra stuff in there. I'll look into it; perhaps just reproducing the upstream licence in the readme is the easiest solution.

Regarding the fate of PyPDF(2): I'm well aware, and I've been in contact with the PyPDF maintainer about this before. See in particular #127 ;). Long story short, from a purely technical perspective there's nothing to update in pyHanko as such: actually very little "original" code from PyPDF2 survives in any recognisable form in pyHanko at this point.

stefan6419846 commented 10 months ago

Thanks for your fast reply. I am completely fine with adding the second license file to the distribution info metadata directory. Some other projects like opencv-python are already doing it in a similar fashion. (I am not sure if there is a non-hacky way to include the license file usually within the package into the metadata directory, but it should work if moved to the top-level directory and added to the MANIFEST.in file.)

I would argue against putting the license into the README only though, as this tends to make license compliance for library users harder. With dedicated license files, it is easy to just use them verbatim, while within the README some further post-processing might be necessary.

MatthiasValvekens commented 9 months ago

I looked into this further. For now, I dealt with the issue by providing an explicit license-files directive in pyproject.toml. This way of dealing with it is not 100% standard--the method is specific to setuptools--but the result matches what opencv-python does (thanks for that pointer, by the way!).

Once PEP 639 materialises, there'll probably be a more robust way to solve this particular problem.

Thanks again for flagging this!