nedbat / coveragepy

The code coverage tool for Python
https://coverage.readthedocs.io
Apache License 2.0
3k stars 428 forks source link

`assert row is not None` triggered in `CoverageData._read_db()` when checking schema version #1522

Closed altendky closed 1 year ago

altendky commented 1 year ago

Describe the bug

In some of our jobs with 7.0.2, and not with 7.0.1, we commonly, but not always, get this traceback at the end. I expect it is indicative of another issue related to pytest-cov and pytest-xdist... but, none the less, it is a change in a patch release of coverage that may interest you.

https://github.com/Chia-Network/chia-blockchain/actions/runs/3825220672/jobs/6508036405#step:16:558

INTERNALERROR> Traceback (most recent call last):
INTERNALERROR>   File "/home/runner/work/chia-blockchain/chia-blockchain/venv/lib/python3.7/site-packages/_pytest/main.py", line 270, in wrap_session
INTERNALERROR>     session.exitstatus = doit(config, session) or 0
INTERNALERROR>   File "/home/runner/work/chia-blockchain/chia-blockchain/venv/lib/python3.7/site-packages/_pytest/main.py", line 324, in _main
INTERNALERROR>     config.hook.pytest_runtestloop(session=session)
INTERNALERROR>   File "/home/runner/work/chia-blockchain/chia-blockchain/venv/lib/python3.7/site-packages/pluggy/_hooks.py", line 265, in __call__
INTERNALERROR>     return self._hookexec(self.name, self.get_hookimpls(), kwargs, firstresult)
INTERNALERROR>   File "/home/runner/work/chia-blockchain/chia-blockchain/venv/lib/python3.7/site-packages/pluggy/_manager.py", line 80, in _hookexec
INTERNALERROR>     return self._inner_hookexec(hook_name, methods, kwargs, firstresult)
INTERNALERROR>   File "/home/runner/work/chia-blockchain/chia-blockchain/venv/lib/python3.7/site-packages/pluggy/_callers.py", line 55, in _multicall
INTERNALERROR>     gen.send(outcome)
INTERNALERROR>   File "/home/runner/work/chia-blockchain/chia-blockchain/venv/lib/python3.7/site-packages/pytest_cov/plugin.py", line 297, in pytest_runtestloop
INTERNALERROR>     self.cov_controller.finish()
INTERNALERROR>   File "/home/runner/work/chia-blockchain/chia-blockchain/venv/lib/python3.7/site-packages/pytest_cov/engine.py", line 44, in ensure_topdir_wrapper
INTERNALERROR>     return meth(self, *args, **kwargs)
INTERNALERROR>   File "/home/runner/work/chia-blockchain/chia-blockchain/venv/lib/python3.7/site-packages/pytest_cov/engine.py", line 338, in finish
INTERNALERROR>     self.cov.stop()
INTERNALERROR>   File "/home/runner/work/chia-blockchain/chia-blockchain/venv/lib/python3.7/site-packages/coverage/control.py", line 797, in combine
INTERNALERROR>     message=self._message,
INTERNALERROR>   File "/home/runner/work/chia-blockchain/chia-blockchain/venv/lib/python3.7/site-packages/coverage/data.py", line 158, in combine_parallel_data
INTERNALERROR>     new_data.read()
INTERNALERROR>   File "/home/runner/work/chia-blockchain/chia-blockchain/venv/lib/python3.7/site-packages/coverage/sqldata.py", line 815, in read
INTERNALERROR>     with self._connect():
INTERNALERROR>   File "/home/runner/work/chia-blockchain/chia-blockchain/venv/lib/python3.7/site-packages/coverage/sqldata.py", line 342, in _connect
INTERNALERROR>     self._open_db()
INTERNALERROR>   File "/home/runner/work/chia-blockchain/chia-blockchain/venv/lib/python3.7/site-packages/coverage/sqldata.py", line 285, in _open_db
INTERNALERROR>     self._read_db()
INTERNALERROR>   File "/home/runner/work/chia-blockchain/chia-blockchain/venv/lib/python3.7/site-packages/coverage/sqldata.py", line 302, in _read_db
INTERNALERROR>     assert row is not None
INTERNALERROR> AssertionError

To Reproduce How can we reproduce the problem? Please be specific. Don't link to a failing CI job. Answer the questions below:

  1. What version of Python are you using?
    • Python 3.7 through 3.10 on all of Linux, macOS, and Windows
  2. What version of coverage.py shows the problem? The output of coverage debug sys is helpful.
    • 7.0.2
  3. What versions of what packages do you have installed? The output of pip freeze is helpful.
    • Successfully installed Jinja2-3.1.2 MarkupSafe-2.1.1 PyYAML-6.0 Pygments-2.14.0 SecretStorage-3.3.3 aiofiles-22.1.0 aiohttp-3.8.3 aiohttp-cors-0.7.0 aiosignal-1.3.1 aiosqlite-0.17.0 altgraph-0.17.3 anyio-3.6.2 argon2-cffi-bindings-21.2.0 argon2_cffi-21.3.0 astroid-2.12.13 asttokens-2.2.1 async-timeout-4.0.2 attrs-22.2.0 backcall-0.2.0 bencode.py-4.0.0 bitstring-3.1.9 black-22.10.0 bleach-5.0.1 blspy-1.0.16 build-0.9.0 certifi-2022.12.7 cffi-1.15.1 cfgv-3.3.1 chardet-5.1.0 charset-normalizer-2.1.1 chia-blockchain-1.7.0b5.dev18 chia-rs-0.1.16 chiabip158-1.1 chiapos-1.0.11 chiavdf-1.0.8 click-8.1.3 clvm-0.9.7 clvm-tools-0.4.6 clvm-tools-rs-0.1.25 colorama-0.4.5 colorlog-6.7.0 commonmark-0.9.1 concurrent-log-handler-0.9.20 coverage-7.0.2 cryptography-38.0.3 decorator-5.1.1 diff-cover-7.3.0 dill-0.3.6 distlib-0.3.6 dnslib-0.9.23 dnspython-2.2.1 docutils-0.19 exceptiongroup-1.1.0 execnet-1.9.0 executing-1.2.0 filelock-3.8.0 flake8-6.0.0 frozenlist-1.3.3 identify-2.5.11 idna-3.4 importlib-metadata-6.0.0 iniconfig-1.1.1 ipython-8.7.0 isort-5.11.4 jaraco.classes-3.2.3 jedi-0.18.2 jeepney-0.8.0 keyring-23.9.3 keyrings.cryptfile-1.3.9 lazy-object-proxy-1.8.0 matplotlib-inline-0.1.6 mccabe-0.7.0 memory-profiler-0.61.0 more-itertools-9.0.0 multidict-6.0.4 mypy-0.991 mypy-extensions-0.4.3 nodeenv-1.7.0 packaging-21.3 parso-0.8.3 pathspec-0.10.3 pep517-0.13.0 pexpect-4.8.0 pickleshare-0.7.5 pkginfo-1.9.2 platformdirs-2.6.2 pluggy-1.0.0 portalocker-2.6.0 pre-commit-2.21.0 prompt-toolkit-3.0.36 psutil-5.9.1 ptyprocess-0.7.0 pure-eval-0.2.2 py3createtorrent-1.1.0 pycodestyle-2.10.0 pycparser-2.21 pycryptodome-3.16.0 pyflakes-3.0.1 pyinstaller-5.6.2 pyinstaller-hooks-contrib-2022.14 pylint-2.15.9 pyparsing-3.0.9 pytest-7.2.0 pytest-asyncio-0.20.3 pytest-cov-4.0.0 pytest-monitor-1.6.5 pytest-xdist-3.1.0 readme-renderer-37.3 requests-2.28.1 requests-toolbelt-0.10.1 rfc3986-2.0.0 rich-13.0.0 setproctitle-1.2.3 six-1.16.0 sniffio-1.3.0 sortedcontainers-2.4.0 stack-data-0.6.2 tomli-2.0.1 tomlkit-0.11.6 traitlets-5.8.0 twine-4.0.2 types-aiofiles-22.1.0.4 types-cryptography-3.3.23.2 types-pkg-resources-0.1.3 types-pyyaml-6.0.12.2 types-setuptools-65.6.0.2 typing-extensions-4.4.0 urllib3-1.26.13 virtualenv-20.17.1 watchdog-2.1.9 wcwidth-0.2.5 webencodings-0.5.1 wrapt-1.14.1 yarl-1.8.2 zipp-3.11.0 zstd-1.5.2.6
  4. What code shows the problem? Give us a specific commit of a specific repo that we can check out. If you've already worked around the problem, please provide a commit before that fix.
  5. What commands did you run?

Expected behavior

No tracebacks as is the case with 7.0.1. Unless... it is deemed that this is an indication of a "real" problem as I expressed concern about above.

Additional context

https://github.com/Chia-Network/chia-blockchain/pull/14255 is where I am reverting to 7.0.1 for now and includes links to a couple other PRs and CI runs where I setup our CI to run the problematic jobs 60 times over to provide a clear "statistical" comparison of the zero failures with 7.0.1 vs., well, many with 7.0.2.

I looked at the refactor of CoverageData._read_db() at https://github.com/nedbat/coveragepy/compare/7.0.1..7.0.2#diff-e5185ca855df54af4f94c983f99b52cfad9ddc17386353d09acdfa994dd72a33R287 (you have to expand the Load diff message for coverage/sqldata.py since GitHub doesn't do it for you) and noticed that while this exception would have previously been caught by the except Exception as exc: clause the unpacking has been moved such that it is no longer mis-detected as a missing table.

Side note, this gave me a good chuckle.

https://github.com/nedbat/coveragepy/blob/7.0.2/CHANGES.rst#version-702--2023-01-02

Refactor: added many type annotations, including a number of refactorings. This should not affect outward behavior, but they were a bit invasive in some places, so keep your eyes peeled for oddities.

HansBug commented 1 year ago

The exact same error occurred here. It first appeared 15 hours ago. We used coverage-7.0.1 before, and this error appeared after updating to coverage-7.0.2.

Here are the full logs:

nedbat commented 1 year ago

Hmm, this is definitely a part of code changed between 7.0.1 and 7.0.2, but reading the 7.0.1 code, I would have expected it to raise a coverage.DataError: "Data file {!r} doesn't seem to be a coverage data file: {exc}". That error message has been mentioned a few times, but doesn't seem to be a current problem. I'm trying to understand how 7.0.1 manifested this race condition...

nedbat commented 1 year ago

I see: when combining data files, if a file can't be read, we silently skip the file and leave it behind. But that was only a check for CoverageException. 7.0.2 changed this failure case from a DataError (derived from CoverageException) to an AssertionError.

nedbat commented 1 year ago

I've made a small tweak on a branch, can you try it? It should restore the old behavior, but with more information shown when a file can't be combined:

pip install git+https://github.com/nedbat/coveragepy@nedbat/fix-1522#egg=coverage==0.0
altendky commented 1 year ago

Sure thing. I referenced the branch in https://github.com/Chia-Network/chia-blockchain/pull/14261/ with the workflow run at https://github.com/Chia-Network/chia-blockchain/actions/runs/3831281612. Though, with it fixed I'm not sure that we'll see anything since this was happening via pytest-cov. All the completed runs so far have passed, so it does seem to be back to previous behavior. To get some data, should we make it fail again explicitly and dump the info you want to a file and I can capture that in a CI artifact?

FWIW, every time I deal with pytest-cov I do think about getting rid of it. Hasn't bubbled to the top of my list yet though. :|

nedbat commented 1 year ago

You have a complex action, so I'm having trouble seeing the difference. Where it had failed with an assert, we should now see a message about not being able to combine a data file. Is that happening?

altendky commented 1 year ago

It was happening inside the pytest run so I'm guessing that if your diagnostics are going to stdout that pytest is likely consuming them.

altendky commented 1 year ago

I can certainly just take it on myself to tweak your branch and get that info visible someway too, if you would like.

I'm curious about when the db is created and schema info written when using pytest-cov and pytest-xdist. Is that somehow happening at the end of the xdist worker lifetime and the main process is attempting to collect those DB files too soon?

jamie1911 commented 1 year ago

+1

I am now also getting this error on 7.0.2

/builds/company/team/project-abc/venv/lib/python3.10/site-packages/coverage/data.py:163: CoverageWarning: Couldn't use data file '/builds/company/team/project-abc/.coverage.runner-npuy5rve-project-32085024-concurrent-3.3930.994265': database disk image is malformed
  data._warn(str(exc))
INTERNALERROR> Traceback (most recent call last):
INTERNALERROR>   File "/builds/company/team/project-abc/venv/lib/python3.10/site-packages/_pytest/main.py", line 270, in wrap_session
INTERNALERROR>     session.exitstatus = doit(config, session) or 0
INTERNALERROR>   File "/builds/company/team/project-abc/venv/lib/python3.10/site-packages/_pytest/main.py", line 324, in _main
INTERNALERROR>     config.hook.pytest_runtestloop(session=session)
INTERNALERROR>   File "/builds/company/team/project-abc/venv/lib/python3.10/site-packages/pluggy/_hooks.py", line 265, in __call__
INTERNALERROR>     return self._hookexec(self.name, self.get_hookimpls(), kwargs, firstresult)
INTERNALERROR>   File "/builds/company/team/project-abc/venv/lib/python3.10/site-packages/pluggy/_manager.py", line 80, in _hookexec
INTERNALERROR>     return self._inner_hookexec(hook_name, methods, kwargs, firstresult)
INTERNALERROR>   File "/builds/company/team/project-abc/venv/lib/python3.10/site-packages/pluggy/_callers.py", line 55, in _multicall
INTERNALERROR>     gen.send(outcome)
INTERNALERROR>   File "/builds/company/team/project-abc/venv/lib/python3.10/site-packages/pytest_cov/plugin.py", line 297, in pytest_runtestloop
INTERNALERROR>     self.cov_controller.finish()
INTERNALERROR>   File "/builds/company/team/project-abc/venv/lib/python3.10/site-packages/pytest_cov/engine.py", line 44, in ensure_topdir_wrapper
INTERNALERROR>     return meth(self, *args, **kwargs)
INTERNALERROR>   File "/builds/company/team/project-abc/venv/lib/python3.10/site-packages/pytest_cov/engine.py", line 247, in finish
INTERNALERROR>     self.cov.combine()
INTERNALERROR>   File "/builds/company/team/project-abc/venv/lib/python3.10/site-packages/coverage/control.py", line 791, in combine
INTERNALERROR>     combine_parallel_data(
INTERNALERROR>   File "/builds/company/team/project-abc/venv/lib/python3.10/site-packages/coverage/data.py", line 158, in combine_parallel_data
INTERNALERROR>     new_data.read()
INTERNALERROR>   File "/builds/company/team/project-abc/venv/lib/python3.10/site-packages/coverage/sqldata.py", line 815, in read
INTERNALERROR>     with self._connect():
INTERNALERROR>   File "/builds/company/team/project-abc/venv/lib/python3.10/site-packages/coverage/sqldata.py", line 342, in _connect
INTERNALERROR>     self._open_db()
INTERNALERROR>   File "/builds/company/team/project-abc/venv/lib/python3.10/site-packages/coverage/sqldata.py", line 285, in _open_db
INTERNALERROR>     self._read_db()
INTERNALERROR>   File "/builds/company/team/project-abc/venv/lib/python3.10/site-packages/coverage/sqldata.py", line 302, in _read_db
INTERNALERROR>     assert row is not None
INTERNALERROR> AssertionError
johnthagen commented 1 year ago

We started hitting this as well with 7.0.2. Our test suite uses pytest-cov and pytest-xdist and some of the tests internally are testing code that uses multiprocessing.Pool. The code is private.

Environment:

coverage                       7.0.2
pytest                         7.2.0
pytest-cov                     4.0.0
pytest-mock                    3.10.0
pytest-randomly                3.12.0
pytest-xdist                   3.1.0

Python 3.10.9, running in the python:3.10 DockerHub image on a GitLab runner docker executor.

I don't know if this is related or not, but under 7.0.1 and earlier we would see the following warnings in the CI logs, but they never caused an issue so we were ignoring them:

.nox/test-3-10/lib/python3.10/site-packages/coverage/data.py:151: CoverageWarning: Data file '.coverage.runner-b6rhuh6f-project-1440-concurrent-2.31582.244847' doesn't seem to be a coverage data file: cannot unpack non-iterable NoneType object
  data._warn(str(exc))
.nox/test-3-10/lib/python3.10/site-packages/coverage/data.py:151: CoverageWarning: Couldn't use data file '.coverage.runner-b6rhuh6f-project-1440-concurrent-2.39471.069373': database disk image is malformed
  data._warn(str(exc))
nedbat commented 1 year ago

@johnthagen can you try the branch I mentioned above?

pip install git+https://github.com/nedbat/coveragepy@nedbat/fix-1522#egg=coverage==0.0

Is anyone seeing this problem without pytest-cov?

altendky commented 1 year ago

I wouldn't be surprised if it were even specific to pytest-cov with pytest-xdist. I'll try to ~find~ ~make~ conjure up some time to dig in that direction.

nedbat commented 1 year ago

This is fixed in commit 1f34d8b0.

nedbat commented 1 year ago

This is now released as part of coverage 7.0.3.

altendky commented 1 year ago

Thanks for the quick turnaround!

But, am I understanding correctly that the underlying event here relates to probably losing some coverage data from some of the xdist processes? Or something similar. I appreciate that the symptom is gone, but it seems like there's an issue for me to explore and address elsewhere.

nedbat commented 1 year ago

I agree that there's a possible problem here. This fix gets us back to where 7.0.1 (and before) were.

I don't know if there is data loss, since no one has reported a problem like that. It might be that new empty data files are getting created when they shouldn't be.

johnthagen commented 1 year ago

@nedbat Wanted to confirm that 7.0.3 solved our issue and only the warnings listed here are shown, but coverage information is correct and the tests complete successfully: