nedbat / coveragepy

The code coverage tool for Python
https://coverage.readthedocs.io
Apache License 2.0

Data files missing or empty when using tox --parallel auto #1642

Open avylove opened 1 year ago

avylove commented 1 year ago

Describe the bug: When running with tox --parallel auto, coverage data files are missing or empty.

To Reproduce

Environment: Python 3.11.2 on Fedora 38, Tox 4.6.1. Tox is installing the latest coverage, 7.2.7.

  1. git clone https://github.com/jquast/blessed.git
  2. cd blessed
  3. tox --parallel auto -e py37,py38,py39,py310,py311,py312 -- --quiet

Pytest configuration (in tox.ini):

[pytest]
addopts =
    --color=yes
    --cov
    --cov-append
    --cov-report=xml
    --disable-pytest-warnings
    --ignore=setup.py
    --ignore=.tox
    --junit-xml=.tox/results.{envname}.xml
filterwarnings = error
junit_family = xunit1
log_format=%(levelname)s %(relativeCreated)2.2f %(filename)s:%(lineno)d %(message)s
norecursedirs = .git .tox build

I get one of these two errors for several of the Tox environments:

INTERNALERROR> Traceback (most recent call last):
INTERNALERROR>   File "blessed/.tox/py310/lib/python3.10/site-packages/_pytest/main.py", line 269, in wrap_session
INTERNALERROR>     session.exitstatus = doit(config, session) or 0
INTERNALERROR>   File "blessed/.tox/py310/lib/python3.10/site-packages/_pytest/main.py", line 323, in _main
INTERNALERROR>     config.hook.pytest_runtestloop(session=session)
INTERNALERROR>   File "blessed/.tox/py310/lib/python3.10/site-packages/pluggy/_hooks.py", line 265, in __call__
INTERNALERROR>     return self._hookexec(self.name, self.get_hookimpls(), kwargs, firstresult)
INTERNALERROR>   File "blessed/.tox/py310/lib/python3.10/site-packages/pluggy/_manager.py", line 80, in _hookexec
INTERNALERROR>     return self._inner_hookexec(hook_name, methods, kwargs, firstresult)
INTERNALERROR>   File "blessed/.tox/py310/lib/python3.10/site-packages/pluggy/_callers.py", line 55, in _multicall
INTERNALERROR>     gen.send(outcome)
INTERNALERROR>   File "blessed/.tox/py310/lib/python3.10/site-packages/pytest_cov/plugin.py", line 298, in pytest_runtestloop
INTERNALERROR>     self.cov_controller.finish()
INTERNALERROR>   File "blessed/.tox/py310/lib/python3.10/site-packages/pytest_cov/engine.py", line 44, in ensure_topdir_wrapper
INTERNALERROR>     return meth(self, *args, **kwargs)
INTERNALERROR>   File "blessed/.tox/py310/lib/python3.10/site-packages/pytest_cov/engine.py", line 254, in finish
INTERNALERROR>     self.cov.combine()
INTERNALERROR>   File "blessed/.tox/py310/lib64/python3.10/site-packages/coverage/control.py", line 809, in combine
INTERNALERROR>     combine_parallel_data(
INTERNALERROR>   File "blessed/.tox/py310/lib64/python3.10/site-packages/coverage/data.py", line 148, in combine_parallel_data
INTERNALERROR>     with open(f, "rb") as fobj:
INTERNALERROR> FileNotFoundError: [Errno 2] No such file or directory: 'blessed/.coverage.localhost.276645.761168'

INTERNALERROR> Traceback (most recent call last):
INTERNALERROR>   File "blessed/.tox/py38/lib64/python3.8/site-packages/coverage/sqldata.py", line 1173, in _execute
INTERNALERROR>     return self.con.execute(sql, parameters)    # type: ignore[arg-type]
INTERNALERROR> sqlite3.OperationalError: no such table: file
INTERNALERROR> 
INTERNALERROR> During handling of the above exception, another exception occurred:
INTERNALERROR> 
INTERNALERROR> Traceback (most recent call last):
INTERNALERROR>   File "blessed/.tox/py38/lib64/python3.8/site-packages/coverage/sqldata.py", line 1178, in _execute
INTERNALERROR>     return self.con.execute(sql, parameters)    # type: ignore[arg-type]
INTERNALERROR> sqlite3.OperationalError: no such table: file
INTERNALERROR> 
INTERNALERROR> The above exception was the direct cause of the following exception:
INTERNALERROR> 
INTERNALERROR> Traceback (most recent call last):
INTERNALERROR>   File "blessed/.tox/py38/lib/python3.8/site-packages/_pytest/main.py", line 269, in wrap_session
INTERNALERROR>     session.exitstatus = doit(config, session) or 0
INTERNALERROR>   File "blessed/.tox/py38/lib/python3.8/site-packages/_pytest/main.py", line 323, in _main
INTERNALERROR>     config.hook.pytest_runtestloop(session=session)
INTERNALERROR>   File "blessed/.tox/py38/lib/python3.8/site-packages/pluggy/_hooks.py", line 265, in __call__
INTERNALERROR>     return self._hookexec(self.name, self.get_hookimpls(), kwargs, firstresult)
INTERNALERROR>   File "blessed/.tox/py38/lib/python3.8/site-packages/pluggy/_manager.py", line 80, in _hookexec
INTERNALERROR>     return self._inner_hookexec(hook_name, methods, kwargs, firstresult)
INTERNALERROR>   File "blessed/.tox/py38/lib/python3.8/site-packages/pluggy/_callers.py", line 55, in _multicall
INTERNALERROR>     gen.send(outcome)
INTERNALERROR>   File "blessed/.tox/py38/lib/python3.8/site-packages/pytest_cov/plugin.py", line 298, in pytest_runtestloop
INTERNALERROR>     self.cov_controller.finish()
INTERNALERROR>   File "blessed/.tox/py38/lib/python3.8/site-packages/pytest_cov/engine.py", line 44, in ensure_topdir_wrapper
INTERNALERROR>     return meth(self, *args, **kwargs)
INTERNALERROR>   File "blessed/.tox/py38/lib/python3.8/site-packages/pytest_cov/engine.py", line 249, in finish
INTERNALERROR>     self.cov.stop()
INTERNALERROR>   File "blessed/.tox/py38/lib64/python3.8/site-packages/coverage/control.py", line 809, in combine
INTERNALERROR>     combine_parallel_data(
INTERNALERROR>   File "blessed/.tox/py38/lib64/python3.8/site-packages/coverage/data.py", line 171, in combine_parallel_data
INTERNALERROR>     data.update(new_data, aliases=aliases)
INTERNALERROR>   File "blessed/.tox/py38/lib64/python3.8/site-packages/coverage/sqldata.py", line 670, in update
INTERNALERROR>     with con.execute("select path from file") as cur:
INTERNALERROR>   File "/usr/lib64/python3.8/contextlib.py", line 113, in __enter__
INTERNALERROR>     return next(self.gen)
INTERNALERROR>   File "blessed/.tox/py38/lib64/python3.8/site-packages/coverage/sqldata.py", line 1207, in execute
INTERNALERROR>     cur = self._execute(sql, parameters)
INTERNALERROR>   File "blessed/.tox/py38/lib64/python3.8/site-packages/coverage/sqldata.py", line 1195, in _execute
INTERNALERROR>     raise DataError(f"Couldn't use data file {self.filename!r}: {msg}") from exc
INTERNALERROR> coverage.exceptions.DataError: Couldn't use data file 'blessed/.coverage.localhost.277041.519776': no such table: file

Expected behavior: No errors

Additional context: These tests spawn a lot of subprocesses: hundreds, if not thousands. I never have any issues running a single environment, but as soon as there are two, this issue is likely to surface. It seems like it could be a synchronization issue.

nedbat commented 1 year ago

This sounds like a duplicate of #1514.

nedbat commented 1 year ago

I fixed this in your repo with this patch:

diff --git a/tox.ini b/tox.ini
index 5620c78..ade7c05 100644
--- a/tox.ini
+++ b/tox.ini
@@ -218,6 +218,7 @@ commands =

 [coverage:run]
 branch = True
+data_file = .coverage-${TOX_ENV_NAME}
 parallel = True
 source =
     blessed

avylove commented 1 year ago

Thanks, @nedbat!

It looks like a bunch of files get created in the .coverage-{TOX_ENV_NAME}.{HOSTNAME}.{PID}.{RANDOM} format, but they get consolidated into the .coverage-{TOX_ENV_NAME} files. So I guess the problem was these consolidated files were being consumed as pid-level files by other coverage processes in other Tox environments?
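One way such a collision could arise is sketched below in Python. It assumes every environment uses the default data_file, so all per-process files share the `.coverage.` prefix, and that a combine step reads and then deletes whatever matches. The helper names (`list_parallel_files`, `combine_and_delete`, `simulate_race`) are hypothetical, and the deletion step is a simplification, not coverage.py's actual code.

```python
import glob
import os
import tempfile


def list_parallel_files(directory, data_file=".coverage"):
    # Assumption: per-process files are the data_file name plus a dotted suffix.
    return sorted(glob.glob(os.path.join(directory, data_file + ".*")))


def combine_and_delete(paths):
    # Simplified stand-in for a combine step: read each file, then remove it.
    for path in paths:
        with open(path, "rb") as fobj:
            fobj.read()
        os.remove(path)


def simulate_race():
    with tempfile.TemporaryDirectory() as tmp:
        # Two environments write per-process files with the same prefix.
        for suffix in ("localhost.1001.111111", "localhost.2002.222222"):
            with open(os.path.join(tmp, ".coverage." + suffix), "wb") as fobj:
                fobj.write(b"data")

        seen_by_env_a = list_parallel_files(tmp)      # env A lists the files
        combine_and_delete(list_parallel_files(tmp))  # env B combines first

        try:
            combine_and_delete(seen_by_env_a)         # env A is now too late
        except FileNotFoundError as exc:
            return type(exc).__name__
        return "ok"


print(simulate_race())
```

If each environment had its own prefix, neither listing would include the other environment's files, which is what the data_file change in the patch achieves.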

Since the pid-level files seem to be treated differently than the consolidated files, I wonder if it would be better if they had a different naming convention than the default data_file name?

I'm seeing one other issue after setting the data_file. coverage erase won't remove any of the data files unless I specify --data-file. And then, even if I use a wildcard, it only removes one.

nedbat commented 1 year ago

I changed the line to:

data_file = .coverage.${TOX_ENV_NAME}

(dot instead of dash). Then I could combine all the files together with:

coverage combine --data-file=.coverage

I'm not sure how you have been combining the files up to now. Perhaps you didn't need to because they were all using the same name, but that might mean you have been losing information.
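To illustrate why the dot matters, here is a rough Python sketch. It assumes combine selects files named like the data file followed by a dotted suffix, as the coverage documentation describes. `parallel_candidates` is a hypothetical helper for illustration, not a coverage.py function, and the file names are made up in the style of the ones in this thread.

```python
from fnmatch import fnmatchcase


def parallel_candidates(data_file, filenames):
    """Return the names that look like parallel data files for data_file."""
    pattern = data_file + ".*"  # assumption: data_file plus a dotted suffix
    return sorted(name for name in filenames if fnmatchcase(name, pattern))


files = [
    ".coverage-py310.localhost.276645.761168",  # per-process file, dash naming
    ".coverage-py311.localhost.277041.519776",
    ".coverage.py310",                          # per-environment file, dot naming
    ".coverage.py311",
]

# With a dash, a final combine into .coverage finds nothing to pick up.
print(parallel_candidates(".coverage", files[:2]))

# With a dot, the per-environment files match the .coverage.* pattern.
print(parallel_candidates(".coverage", files))
```

Under the same assumption, `coverage erase` with the dash naming would only see files that start with the configured data_file name, which would explain why it appeared to skip the others.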

avylove commented 1 year ago

Ah, I see now. The data_file name is used as a globbing pattern for combine and erase. I'm sorry, it seems very obvious now.

I'm guessing you're right and we were missing coverage data locally. However, we upload coverage files separately to codecov in CI, and that doesn't seem to be affected.

Thanks for your help on this. I'm trying to think of a good way to address issues like this for others. Perhaps something in the docs for this use case and maybe a hint when the exception is raised. Something like:

This error is potentially caused by one of the following:
- The file was deleted by an external process
- Multiple instances of coverage are running in parallel using the same data_file name, see https://coverage.readthedocs.io/en/stable/parallel.html
- Aliens have abducted the file. They believe it to be ancient text describing the origin of the universe.
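A minimal sketch of how such a hint might be attached, written as standalone Python. `DataFileHintError` and `read_data_file` are hypothetical names used only for illustration; this is not how coverage.py raises its errors, and the joke bullet is left out.

```python
HINT = (
    "This error is potentially caused by one of the following:\n"
    "- The file was deleted by an external process\n"
    "- Multiple instances of coverage are running in parallel using the same "
    "data_file name, see https://coverage.readthedocs.io/en/stable/parallel.html"
)


class DataFileHintError(Exception):
    """Hypothetical error type that carries the troubleshooting hint."""


def read_data_file(path):
    # Read the file, re-raising a missing-file error with the hint appended.
    try:
        with open(path, "rb") as fobj:
            return fobj.read()
    except FileNotFoundError as exc:
        message = f"Couldn't use data file {path!r}: {exc}\n{HINT}"
        raise DataFileHintError(message) from exc


try:
    read_data_file("no-such-dir/.coverage.localhost.0.0")
except DataFileHintError as exc:
    print(exc)
```

Chaining with `from exc` keeps the original FileNotFoundError in the traceback, so the hint adds context without hiding the underlying failure.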

nedbat commented 1 year ago

See also #1514.