pypa / twine

Utilities for interacting with PyPI
https://twine.readthedocs.io/
Apache License 2.0
1.61k stars 308 forks source link

Fix/mitigate flaky integration tests #684

Closed bhrutledge closed 3 months ago

bhrutledge commented 4 years ago

As I write this, the checks on master are currently failing on Travis and GitHub actions due to test_integration.py.

https://travis-ci.org/github/pypa/twine/jobs/718433234 https://github.com/pypa/twine/runs/991209553

I've noticed this intermittently since the tests were added, but it seems to be more of a problem lately.

TODO:

bhrutledge commented 4 years ago

Following up on questions from @sigmavirus24 in https://github.com/pypa/twine/pull/703#issuecomment-703137485:

  • Why were we emulating a CLI call for an integration test?

  • What were we testing the integration of? As I understood it, these should be testing against some version of PyPI (TestPyPI, DevPI, etc.) but this is testing argparse as well ... Thus these are almost end-to-end tests (almost because they're not testing that setuptools builds the binary with an entrypoint correctly by shelling out to test things) and thus acceptance tests, not integration tests.

Wait, do we not all share the same definitions of "acceptance, "integration", and "unit" tests? Shocking! 😉

That said, these do feel like "acceptance"/"end-to-end"/"smoke" tests to me, and seem valuable in that capacity, as a way to simulate the user's experience and detect upstream issues.

  • Is testing exactly this way important, or are you okay with doing integration testing differently?

I'm definitely open ideas, and am not married to these tests. They were originally added by @jaraco, and my effort has been focused on keeping their value while reducing their costs, which I think manifest primarily as PR noise due to flakiness of the test servers. At best, that requires a contributor or maintainer to dive into the test results to discover what failed. At worst, it blocks a merge or release.

I'm keen to reduce the PR noise as soon as possible. Assuming we want some amount of testing against real servers, I'd rather not entirely disable these tests. If we do end up rewriting them, I think we'll still want them in place until that's done, which means they'll still be noisy. With that in mind, what do folks think about the "TODO" that I added to the issue description:

Use a separate GH Action job (and maybe tox testenv) for integration tests to clearly distinguish them from the "fast" tests

I think that'd be running pytest -k 'not integration' and pytest -k 'integration' separately, perhaps on different conditions (e.g., always run "fast" tests, only run "integration" tests on merge and/or schedule).

sigmavirus24 commented 4 years ago

I think that'd be running pytest -k 'not integration' and pytest -k 'integration' separately, perhaps on different conditions (e.g., always run "fast" tests, only run "integration" tests on merge and/or schedule).

:+1: Let's do that. I like that option as an interim step while we think a bit harder about this

bhrutledge commented 1 year ago

The tests against devpi and pypiserver are failing. I though it was related to tox>=4, but they're still failing with the same error with tox<4. Poking around the source of jaraco.envs, it looks like the tox environment setup isn't working as expected, because commands like devpi-init aren't being found.

I can reproduce this locally by running tox -e integration. I'm tempted to remove the dependency on jaraco.envs and coupling to tox, and try to isolate the venv setup as a fixture in test_integration.py.

jaraco commented 1 year ago

You’re welcome to proceed that way or assign to me and I can investigate the issue. Let me know.

bhrutledge commented 1 year ago

@jaraco It might be worth verifying that jaraco.envs.ToxEnv is working as expected, in isolation. In my initial local testing, an assertion that .tox/devpi exists after the install() was failing.

That said, it looks like it might be relatively straightforward to use venv.create() in conjunction with pytest's temporary directory fixture, which sounds like a fun exercise, that I think would simplify/clarify the tests.

As of https://github.com/pypa/twine/pull/956, the devpi and pypiserver tests are skipped until this is resolved.

jaraco commented 1 year ago

It's working fine for me:

 twine main $ pip-run jaraco.envs
Python 3.11.0 (main, Oct 26 2022, 19:06:18) [Clang 14.0.0 (clang-1400.0.29.202)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from jaraco.envs import ToxEnv
>>> class DevPi(ToxEnv):
...   name = 'devpi'
... 
>>> env = DevPi().create()
devpi: install_deps> python -I -m pip install devpi devpi-server
.pkg: install_requires> python -I -m pip install 'setuptools>=45' 'setuptools_scm[toml]>=6.0' wheel
.pkg: _optional_hooks> python /var/folders/sx/n5gkrgfx6zd91ymxr2sr9wvw00n8zm/T/pip-run-goxxlf8x/pyproject_api/_backend.py True setuptools.build_meta
.pkg: get_requires_for_build_editable> python /var/folders/sx/n5gkrgfx6zd91ymxr2sr9wvw00n8zm/T/pip-run-goxxlf8x/pyproject_api/_backend.py True setuptools.build_meta
.pkg: install_requires_for_build_editable> python -I -m pip install wheel
.pkg: get_requires_for_build_sdist> python /var/folders/sx/n5gkrgfx6zd91ymxr2sr9wvw00n8zm/T/pip-run-goxxlf8x/pyproject_api/_backend.py True setuptools.build_meta
.pkg: build_wheel> python /var/folders/sx/n5gkrgfx6zd91ymxr2sr9wvw00n8zm/T/pip-run-goxxlf8x/pyproject_api/_backend.py True setuptools.build_meta
.pkg: build_sdist> python /var/folders/sx/n5gkrgfx6zd91ymxr2sr9wvw00n8zm/T/pip-run-goxxlf8x/pyproject_api/_backend.py True setuptools.build_meta
devpi: install_package_deps> python -I -m pip install 'importlib-metadata>=3.6' 'keyring>=15.1' 'pkginfo>=1.8.1' 'readme-renderer>=35.0' 'requests-toolbelt!=0.9.0,>=0.8.0' 'requests>=2.20' 'rfc3986>=1.4.0' 'rich>=12.0.0' 'urllib3>=1.26.0'
devpi: install_package> python -I -m pip install --force-reinstall --no-deps /Users/jaraco/code/pypa/twine/.tox/.tmp/package/1/twine-4.0.3.dev12+g4aba39c.tar.gz
.pkg: _exit> python /var/folders/sx/n5gkrgfx6zd91ymxr2sr9wvw00n8zm/T/pip-run-goxxlf8x/pyproject_api/_backend.py True setuptools.build_meta
  devpi: OK (16.43 seconds)
  congratulations :) (16.50 seconds)
>>> env.exe('devpi').isfile()
True
>>> import subprocess
>>> subprocess.run([env.exe('devpi')])
usage: devpi [-h] [--version] [--debug] [-y] [-v] [--clientdir DIR]
             {use,getjson,patchjson,list,remove,user,passwd,login,logoff,logout,index,upload,test,push,install,refresh} ...

The devpi commands (installed via devpi-client) wrap common Python packaging, uploading, installation and testing activities, using a remote devpi-server managed
index. For more information see http://doc.devpi.net

positional arguments:
  {use,getjson,patchjson,list,remove,user,passwd,login,logoff,logout,index,upload,test,push,install,refresh}
    use                 show/configure current index and target venv for install activities.
    getjson             show remote server and index configuration.
    patchjson           send a PATCH request with the specified json content to the specified path.
    list                list project versions and files for the current index.
    remove              removes project info/files from current index.
    user                add, remove, modify, list user configuration.
    passwd              change password of specified user or current user if not specified.
    login               login to devpi-server with the specified user.
    logoff              log out of the current devpi-server.
    logout              log out of the current devpi-server.
    index               create, delete and manage indexes.
    upload              (build and) upload packages to the current devpi-server index.
    test                download and test a package against tox environments.
    push                push a release and releasefiles to an internal or external index.
    install             install packages through current devpi index.
    refresh             invalidates the mirror caches for the specified package(s).

options:
  -h, --help            show this help message and exit
  --version             show program's version number and exit

generic options:
  --debug               show debug messages including more info on server requests
  -y                    assume 'yes' on confirmation questions
  -v, --verbose         increase verbosity
  --clientdir DIR       directory for storing login and other state
CompletedProcess(args=[Path('.tox/devpi/bin/devpi')], returncode=0)
>>> 
>>> subprocess.run([env.exe('devpi-init')])
INFO  NOCTX Loading node info from /Users/jaraco/.devpi/server/.nodeinfo
INFO  NOCTX generated uuid: c505e7279f634f29970a2da02b2396b5
INFO  NOCTX wrote nodeinfo to: /Users/jaraco/.devpi/server/.nodeinfo
INFO  NOCTX DB: Creating schema
INFO  [Wtx-1] setting password for user 'root'
INFO  [Wtx-1] created user 'root'
INFO  [Wtx-1] created root user
INFO  [Wtx-1] created root/pypi index
INFO  [Wtx-1] fswriter0: committed at 0
CompletedProcess(args=[Path('.tox/devpi/bin/devpi-init')], returncode=0)

Moreover, I ran tox -e integration and the devpi tests passed. The pypiserver tests failed though. Can you confirm the devpi tests fail for you locally the way they are failing in CI?

That said, it looks like it might be relatively straightforward to use venv.create() in conjunction with pytest's temporary directory fixture, which sounds like a fun exercise, that I think would simplify/clarify the tests.

My motivation for using tox was because it provides the following functionality:

I'm not super-happy with the interfaces exposed by jaraco.envs, so I'm not surprised it's not straightforward to test/exercise the functionality.

bhrutledge commented 1 year ago

Can you confirm the devpi tests fail for you locally the way they are failing in CI?

Yep; tox -e integration fails with Tox 3 and Tox 4.

I have a branch that's passing after replacing jaraco.envs with fixture logic based on the venv module. I've got some refactoring to do, but will hopefully have a PR up this week.

jaraco commented 1 year ago

The issue seems to be that under some circumstances, the env used to install pypiserver is .tox/integration and not .tox/pypiserver as expected.

(Pdb) env.create()
.pkg: _optional_hooks> python /Users/jaraco/code/pypa/twine/.tox/integration/lib/python3.11/site-packages/pyproject_api/_backend.py True setuptools.build_meta
.pkg: get_requires_for_build_editable> python /Users/jaraco/code/pypa/twine/.tox/integration/lib/python3.11/site-packages/pyproject_api/_backend.py True setuptools.build_meta
.pkg: get_requires_for_build_sdist> python /Users/jaraco/code/pypa/twine/.tox/integration/lib/python3.11/site-packages/pyproject_api/_backend.py True setuptools.build_meta
.pkg: build_wheel> python /Users/jaraco/code/pypa/twine/.tox/integration/lib/python3.11/site-packages/pyproject_api/_backend.py True setuptools.build_meta
.pkg: build_sdist> python /Users/jaraco/code/pypa/twine/.tox/integration/lib/python3.11/site-packages/pyproject_api/_backend.py True setuptools.build_meta
pypiserver: install_package> python -I -m pip install --force-reinstall --no-deps /Users/jaraco/code/pypa/twine/.tox/.tox/.tmp/package/9/twine-4.0.3.dev11+g815853f.d20221214.tar.gz
.pkg: _exit> python /Users/jaraco/code/pypa/twine/.tox/integration/lib/python3.11/site-packages/pyproject_api/_backend.py True setuptools.build_meta
  pypiserver: OK (3.73 seconds)
  congratulations :) (3.80 seconds)
<tests.test_integration.PypiserverEnv object at 0x104d7e790>
jaraco commented 1 year ago

I'm able to replicate the issue without jaraco.envs. The issue is either with tox or more likely with running tox from a virtualenv:

``` twine main $ tox -e integration --notest integration: install_deps> python -I -m pip install build coverage jaraco.envs munch portend pretend pytest pytest-rerunfailures pytest-services pytest-socket .pkg: install_requires> python -I -m pip install 'setuptools>=45' 'setuptools_scm[toml]>=6.0' wheel .pkg: _optional_hooks> python /Users/jaraco/.local/pipx/venvs/tox/lib/python3.11/site-packages/pyproject_api/_backend.py True setuptools.build_meta .pkg: get_requires_for_build_editable> python /Users/jaraco/.local/pipx/venvs/tox/lib/python3.11/site-packages/pyproject_api/_backend.py True setuptools.build_meta .pkg: install_requires_for_build_editable> python -I -m pip install wheel .pkg: get_requires_for_build_sdist> python /Users/jaraco/.local/pipx/venvs/tox/lib/python3.11/site-packages/pyproject_api/_backend.py True setuptools.build_meta .pkg: build_wheel> python /Users/jaraco/.local/pipx/venvs/tox/lib/python3.11/site-packages/pyproject_api/_backend.py True setuptools.build_meta .pkg: build_sdist> python /Users/jaraco/.local/pipx/venvs/tox/lib/python3.11/site-packages/pyproject_api/_backend.py True setuptools.build_meta integration: install_package_deps> python -I -m pip install 'importlib-metadata>=3.6' 'keyring>=15.1' 'pkginfo>=1.8.1' 'readme-renderer>=35.0' 'requests-toolbelt!=0.9.0,>=0.8.0' 'requests>=2.20' 'rfc3986>=1.4.0' 'rich>=12.0.0' 'urllib3>=1.26.0' integration: install_package> python -I -m pip install --force-reinstall --no-deps /Users/jaraco/code/pypa/twine/.tox/.tmp/package/1/twine-4.0.3.dev12+g4aba39c.tar.gz .pkg: _exit> python /Users/jaraco/.local/pipx/venvs/tox/lib/python3.11/site-packages/pyproject_api/_backend.py True setuptools.build_meta integration: OK (10.81 seconds) congratulations :) (10.87 seconds) twine main $ .tox/integration/bin/python -m tox -e pypiserver --notest pypiserver: install_deps> python -I -m pip install pypiserver .pkg: _optional_hooks> python /Users/jaraco/code/pypa/twine/.tox/integration/lib/python3.11/site-packages/pyproject_api/_backend.py True setuptools.build_meta .pkg: get_requires_for_build_editable> python /Users/jaraco/code/pypa/twine/.tox/integration/lib/python3.11/site-packages/pyproject_api/_backend.py True setuptools.build_meta .pkg: get_requires_for_build_sdist> python /Users/jaraco/code/pypa/twine/.tox/integration/lib/python3.11/site-packages/pyproject_api/_backend.py True setuptools.build_meta .pkg: build_wheel> python /Users/jaraco/code/pypa/twine/.tox/integration/lib/python3.11/site-packages/pyproject_api/_backend.py True setuptools.build_meta .pkg: build_sdist> python /Users/jaraco/code/pypa/twine/.tox/integration/lib/python3.11/site-packages/pyproject_api/_backend.py True setuptools.build_meta pypiserver: install_package_deps> python -I -m pip install 'importlib-metadata>=3.6' 'keyring>=15.1' 'pkginfo>=1.8.1' 'readme-renderer>=35.0' 'requests-toolbelt!=0.9.0,>=0.8.0' 'requests>=2.20' 'rfc3986>=1.4.0' 'rich>=12.0.0' 'urllib3>=1.26.0' pypiserver: install_package> python -I -m pip install --force-reinstall --no-deps /Users/jaraco/code/pypa/twine/.tox/.tmp/package/2/twine-4.0.3.dev12+g4aba39c.tar.gz .pkg: _exit> python /Users/jaraco/code/pypa/twine/.tox/integration/lib/python3.11/site-packages/pyproject_api/_backend.py True setuptools.build_meta pypiserver: OK (6.43 seconds) congratulations :) (6.48 seconds) ```

Edit: I'm not sure that shows what I think it shows. The fact that .tox/integration appears in the second execution isn't necessarily indicative of an issue. What I'm finding surprising, however, is that .tox/pypiserver doesn't exist after that's run.

jaraco commented 1 year ago

I upgraded my copy of virtualenv as found in my $PATH and cleaned the twine checkout and now I'm able to replicate the failures in both devpi and pypiserver.

Then, I ran tox -e devpi,pypiserver --notest, followed by tox -e integration and the tests pass.

jaraco commented 1 year ago

I am able to work around the issue by pinning to tox<4:

 twine main $ git diff
diff --git a/tox.ini b/tox.ini
index e526324..d41a594 100644
--- a/tox.ini
+++ b/tox.ini
@@ -26,6 +26,7 @@ deps =
     pytest-rerunfailures
     pytest-services
     build
+    tox<4
 passenv =
     PYTEST_ADDOPTS
 commands =

You have to pin in the integration environment, as that's the version of tox that gets used by jaraco.envs.ToxEnv.

jaraco commented 4 months ago

Since my last comment, I root-caused and Gabor fixed the issue in tox that was leading to failures with pypiserver. Moreover, while investigating recent failures as part of #1120, I noticed that the failures tend to be 503 Service Unavailable from test.pypi.org (https://github.com/pypa/twine/actions/runs/9294853403). So it seems as if the flaky tests are due primarily to a remote service that's under-provisioned for our use-case.

I'm reluctant to switch to production pypi. I don't want to be polluting that repository with lots of artifacts from tests. I also don't want to expose a token for a project in pypi to the world.

Maybe we should just drop the integration test for PyPI and rely on twine's own self-upload during any release to catch any regressions. It's the other servers that are less likely to have visibility into regressions.

Maybe the best course of action is to simply mark these PyPI tests as xfail until there's a better solution.