py-pdf / pypdf

A pure-python PDF library capable of splitting, merging, cropping, and transforming the pages of PDF files
https://pypdf.readthedocs.io/en/latest/
Other
8.22k stars 1.4k forks source link

Please include tests in pypi distfile #2864

Closed 0-wiz-0 closed 3 weeks ago

0-wiz-0 commented 4 weeks ago

Explanation

I'm packaging pypdf for pkgsrc. We have standardized on using the pypi source tar.gz files. When updating packages, I like to run the test suite to make sure the software works fine. The tests are currently not included, please include them in the pypi source file to allow this.

Thank you!

stefan6419846 commented 4 weeks ago

Thanks for the report - I honestly have not been aware that we are not even distributing the tests itself in our sdists. Nevertheless, I am not sure what would be the best approach here either for the PDF files itself - these currently originate from the repository itself (11.3 MB), the sample-files repository (14.5 MB) and downloaded from the internet (290 MB).

0-wiz-0 commented 3 weeks ago

Hm, perhaps include the tests/ directory (including the scripts) in the tarball, and provide the sample-files and downloads as one or two separate distfiles?

stefan6419846 commented 3 weeks ago

PyPI does not allow distributing (pure) data files. In my opinion, bundling the resources and tests directory inside the sdist should be doable. For the sample-files repository, I am not sure, but there are not much changes and I consider it easy enough to download the archive from the git commit from GitHub. The remaining 290 MB will most likely never be distributed by us, as they can be downloaded with a single command or skipped entirely - depending on how your build machines are handling network requests.

MartinThoma commented 3 weeks ago

The reason why I excluded tests is to make pypdf small so that downloads in CI are fast.

I don't want to include tests in wheels. For source distribution, I don't have a strong opinion.

stefan6419846 commented 3 weeks ago

It is rather uncommon to bundle tests in the binary wheels, but for the sdists its rather usual. The request is about the "source tar.gz files", thus the sdists which should match your opinion.

pubpub-zz commented 3 weeks ago

@0-wiz-0 For ecological reason (preventing multiple copy of the data) I would have prefered that you get the source directly from github : just the /tests will not be sufficient to run the tests and the amount of data is quite important as @stefan6419846 said.

If you maintain your request, can you please propose a PR that will add tests to .tar.gz but keep .whl unchanged

stefan6419846 commented 3 weeks ago

We explicitly ignore the test directory at the moment: https://github.com/py-pdf/pypdf/blob/dcd15aaf50faa16a0b361d834e10ebe7c56dfc97/pyproject.toml#L65 Removing it from there might already fix the issue (we do not have to care about the JSON file in my opinion, as make_release.py is never distributed and thus the corresponding tests are always skipped). To bundle the resources directory, we would have to remove it from the same list, but adapt MANIFEST.in as well to include these files.

pubpub-zz commented 3 weeks ago

Removing it from there might already fix the issue (we do not have to care about the JSON file in my opinion, as make_release.py is never distributed and thus the corresponding tests are always skipped). To bundle the resources directory, we would have to remove it from the same list, but adapt MANIFEST.in as well to include these files.

Maybe I'm wrong but if we juste remove it, wouldn't the tests folder be added to the whl ?

0-wiz-0 commented 3 weeks ago

Line 65 is in the 'sdist' section, so I'd assume it only affects the source distribution.

pubpub-zz commented 3 weeks ago

I can not see any sectIon about wheel. From what I've read flit seems to output both tar.gz and whl by default at the same time.

pubpub-zz commented 3 weeks ago

Update: On me!🤭🤭 I did a check using flit build and the whl is not modified.

stefan6419846 commented 3 weeks ago

Maybe I'm wrong but if we juste remove it, wouldn't the tests folder be added to the whl ?

No, you would have to mark the tests as a module as well to bundle them inside the binary wheel.