dandi / dandi-schema

Schemata for DANDI archive project
Apache License 2.0
7 stars 10 forks source link

Exclude tests from wheel #249

Closed penguinpee closed 1 month ago

penguinpee commented 1 month ago

It's generally not desired to install tests together with the module. This change does just that and keeps everything else unchanged. Tests are still included in the sdist.

codecov[bot] commented 1 month ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 91.89%. Comparing base (6fc16b8) to head (37f7505).

:exclamation: There is a different number of reports uploaded between BASE (6fc16b8) and HEAD (37f7505). Click for more details.

HEAD has 121 uploads less than BASE | Flag | BASE (6fc16b8) | HEAD (37f7505) | |------|------|------| |unittests|151|30|
Additional details and impacted files ```diff @@ Coverage Diff @@ ## master #249 +/- ## ========================================== - Coverage 97.75% 91.89% -5.87% ========================================== Files 16 16 Lines 1739 1739 ========================================== - Hits 1700 1598 -102 - Misses 39 141 +102 ``` | [Flag](https://app.codecov.io/gh/dandi/dandi-schema/pull/249/flags?src=pr&el=flags&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=dandi) | Coverage Δ | | |---|---|---| | [unittests](https://app.codecov.io/gh/dandi/dandi-schema/pull/249/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=dandi) | `91.89% <ø> (-5.87%)` | :arrow_down: | Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=dandi#carryforward-flags-in-the-pull-request-comment) to find out more.

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

yarikoptic commented 1 month ago

It's generally not desired to install tests together with the module.

why? Does it solve any problem?

it kinda introduces some though:

And so far I personally did not spot any problem being solved, or convenience being introduced, so not sure if a worthwhile effort.

penguinpee commented 1 month ago

Personally, I think that, if we're going to exclude tests from the wheel (which I support), we should go all the way and move tests/ outside of dandischema/ — but @yarikoptic doesn't like either of these practices.

Well, I looked at that option as well. But that's complicated by the fact that there are tests in dandischema/digests/tests as well.

why? Does it solve any problem?

  1. For starters it makes the wheel smaller. It might not be significant in this case, but I've packaged projects with hundreds of tests + test data.
  2. They are not required as part of the installed package. I ask myself: Does it make sense to do from dandischema import tests? Moreover, if you were using automatic discovery tests would be excluded by default. Users don't need them.
  3. In Fedora, we can optionally run an import test on the installed package. That's helpful in discovering if any dependencies have been missed (e.g. not declared). For dandischema, in it's current configuration, that throws an error (or false positive) on pytest[^1].
  • becomes difficult to test software as installed: need to dig out tests somewhere, match version etc.
  • edit: if moved outside of sources, not so easy to test a component of software I just changed unless tests hierarchy replicates original code hierarchy

I think both points very much depend on how you run tests and how your dev environment is set up.

In the end, it's a suggestion (based on common practice). But the decision is entirely up to you. I am / we are happy to carry it as a downstream only patch.

[^1]: Sure, I can exclude importing *tests*, but, again, those aren't required for using dandischema.

yarikoptic commented 1 month ago

3. In Fedora, we can optionally run an import test on the installed package. That's helpful in discovering if any dependencies have been missed (e.g. not declared). For dandischema, in it's current configuration, that throws an error (or false positive) on pytest1.

We arrived at the motivation/rationale, which IMHO boils down to: an installed component (dandischema.tests and alike) to import requiring a dependency which is typically not installed/not needed for a user installation. ok, let me sit on this for a bit (autopkgtest for Debian below might add weight).

FWIW, for Debian packages (I am yet to package this one though), I usually enable not just import test but running all discovered tests during package build time. Then pytest should be installed etc for that purpose. And I do then install the tests but do not add pytest into installation dependencies. Nevertheless a person could use autopkgtest (if package has needed config) (see https://people.debian.org/~eriberto/README.package-tests.html for more) to run those tests for that version of package and thus testing it against its dependencies (not necessarily the versions etc present during package build time).

yarikoptic commented 1 month ago

FWIW, even though I now see rationale, @penguinpee, if you could share some pointers to support the common practice and generally "claims", I would appreciate that!

penguinpee commented 1 month ago

I think the most compelling argument for not including tests can be made by looking at what setuptools does when using automatic discovery. I already linked to the documentation, but here it is again, explicitly:

https://setuptools.pypa.io/en/latest/userguide/package_discovery.html#flat-layout

You have FlatLayoutPackageFinder.DEFAULT_EXCLUDE and FlatLayoutModuleFinder.DEFAULT_EXCLUDE for excluding packages and modules respectively. Both list test and spelling variations thereof. Above the lists it states:

To avoid confusion, file and folder names that are used by popular tools (or that correspond to well-known conventions, such as distributing documentation alongside the project code) are automatically filtered out in the case of flat-layout

(emphasis mine)

Note that automatic discovery only works when packages or py_modules are not provided. In case of using packages = find: or find_namespace: and a flat layout with tests/, consider what would happen if you were not to filter out tests. It would get installed as a top level module. That's a packaging error, but not uncommon. It's very likely (one of) the rational(s) for filtering it out. I think it's unfortunate that find_packages() and find_namespace_packages() do not use the same filters.

jwodder commented 1 month ago

@penguinpee setuptools' automatic discovery only filters out top-level tests/ trees, while tests/ inside a package (like dandischema has) is still included.

penguinpee commented 1 month ago

A less compelling argument might be PyPA's sampleproject which uses automatic discovery with a src-layout. By placing tests/ outside src/ it will be automatically excluded from the wheel.

penguinpee commented 1 month ago

@penguinpee setuptools' automatic discovery only filters out top-level tests/ trees, while tests/ inside a package (like dandischema has) is still included.

You are correct. I misunderstood that part. It indeed only filters on the top-level names.

yarikoptic commented 1 month ago

some pointers to support the common practice and generally

@jwodder , IIRC you had/know of some kind of index of pip packages, am I right? I wonder if there is a listing of contents of source packages/distribution to get some stats on how common is to ship tests within sources, within package_source, within "binary" builds. I can only look at examples of some most popular/core packages, e.g.

numpy - within `numpy` package, and shipped in source distribution, (edit) and shipped in .whl ```shell ❯ find -iname tests ./numpy/_core/tests ./numpy/_pyinstaller/tests ./numpy/compat/tests ./numpy/distutils/tests ./numpy/f2py/tests ./numpy/fft/tests ./numpy/lib/tests ./numpy/linalg/tests ./numpy/ma/tests ./numpy/matrixlib/tests ./numpy/polynomial/tests ./numpy/random/tests ./numpy/testing/tests ./numpy/tests ./numpy/typing/tests ``` ```shell ❯ tar -tzvf <(curl --silent https://files.pythonhosted.org/packages/54/a4/f8188c4f3e07f7737683588210c073478abcb542048cf4ab6fedad0b458a/numpy-2.1.0.tar.gz) | grep 'tests/test_.*\.py' | head -rw-r--r-- 0/0 5032 2024-07-08 08:34 numpy-2.1.0/numpy/_core/src/common/pythoncapi-compat/tests/test_pythoncapi_compat.py -rw-r--r-- 0/0 27472 2024-07-08 08:34 numpy-2.1.0/numpy/_core/src/common/pythoncapi-compat/tests/test_upgrade_pythoncapi.py -rw-r--r-- 0/0 2881 2024-08-18 13:20 numpy-2.1.0/numpy/_core/tests/test__exceptions.py -rw-r--r-- 0/0 2221 2024-08-18 13:20 numpy-2.1.0/numpy/_core/tests/test_abc.py -rw-r--r-- 0/0 22932 2024-08-18 13:20 numpy-2.1.0/numpy/_core/tests/test_api.py -rw-r--r-- 0/0 2824 2024-08-18 13:20 numpy-2.1.0/numpy/_core/tests/test_argparse.py -rw-r--r-- 0/0 3062 2024-08-18 13:20 numpy-2.1.0/numpy/_core/tests/test_array_api_info.py -rw-r--r-- 0/0 34509 2024-08-18 13:20 numpy-2.1.0/numpy/_core/tests/test_array_coercion.py -rw-r--r-- 0/0 7767 2024-08-18 13:20 numpy-2.1.0/numpy/_core/tests/test_array_interface.py -rw-r--r-- 0/0 3264 2024-08-18 13:20 numpy-2.1.0/numpy/_core/tests/test_arraymethod.py ``` ```shell ❯ unzip -l numpy-2.1.0-pp310-pypy310_pp73-win_amd64.whl | grep 'tests/test_.*\.py' | head 2853 2024-08-18 17:32 numpy/distutils/tests/test_build_ext.py 29586 2024-08-18 17:32 numpy/distutils/tests/test_ccompiler_opt.py 6523 2024-08-18 17:32 numpy/distutils/tests/test_ccompiler_opt_conf.py 7612 2024-08-18 17:32 numpy/distutils/tests/test_exec_command.py 1320 2024-08-18 17:32 numpy/distutils/tests/test_fcompiler.py 2191 2024-08-18 17:32 numpy/distutils/tests/test_fcompiler_gnu.py 1088 2024-08-18 17:32 numpy/distutils/tests/test_fcompiler_intel.py 1124 2024-08-18 17:32 numpy/distutils/tests/test_fcompiler_nagfor.py 1147 2024-08-18 17:32 numpy/distutils/tests/test_from_template.py 902 2024-08-18 17:32 numpy/distutils/tests/test_log.py ```
jwodder commented 1 month ago

you had/know of some kind of index of pip packages, am I right?

Not one that covers sdists.

I wonder if there is a listing of contents of source packages/distribution to get some stats on how common is to ship tests within sources, within package_source, within "binary" builds.

If you really want to investigate this, I suggest asking on https://discuss.python.org/c/packaging/14.

penguinpee commented 1 month ago

I didn't mean to send you on a wild goose chase. If you prefer including the tests in the wheel, by all means do. As I stated before, we can carry this PR as a downstream only patch. Matter of fact we already do.

Quite often when I see tests shipped in the wheel it turns out to be an oversight. Sometimes it is a genuine mistake, e.g. when they are shipped as a separate module.

I have also come across projects that ship them neither in the wheel nor in the sdist. In such cases we usually fall back to acquiring the sources from the upstream forge, because we do want to run those tests as part of the build if possible within reasonable effort.

yarikoptic commented 1 month ago

Thank you @penguinpee ! For now I would just prefer to keep current practice among our projects, unless it would be prohibitive (e.g. large test data requirements etc) which would warrant adding separation of tests from the main code sources. Who knows -- I might be convinced in the opposite later on, so I will keep in mind this case of "import tests". Cheers!