Undeterministic Results by coveragepy

apatanwal-eightfold commented 1 year ago

Describe the bug

We recently introduced Coverage into our codebase to measure test coverage, and we have observed that it intermittently fails to record a few lines that are executed when run against our test suite. This happens sporadically, and we could not identify the cause. We believe we are using the tool properly and with the right concurrency settings. We would appreciate any pointers on when this can happen and any clues on how to debug such scenarios.

Following is a summary of how we are using Coverage in our code base:

We run pytest in our codebase by dividing directories among workers, so each worker runs pytest and calculates coverage for the directories assigned to it. There are times when coverage fails to record lines in a module whose test was run (we can see from the logs that the test covering those lines was run). This occurs about 2% of the time.

How do we divide directories among workers? We use Pool from Python's multiprocessing module: from multiprocessing.dummy import Pool

So each thread in the pool runs pytest separately using the subprocess module.
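A minimal sketch of the setup described above, since the actual CI scripts are not shown in this report. The helper names (`build_job`, `split_round_robin`, `run_job`, `run_all`), the round-robin split, and the data file naming are all assumptions for illustration. The pytest arguments follow the command listed under "Command used", and each thread gets its own `COVERAGE_FILE` as noted under "Additional context".

```python
import os
import subprocess
from multiprocessing.dummy import Pool  # thread-based Pool, as in the report


def build_job(worker_id, cov_directory, test_dirs, base_env=None):
    """Return (command, environment) for one worker's pytest subprocess."""
    env = dict(os.environ if base_env is None else base_env)
    # Assumption: one data file per worker so workers never share a file.
    env["COVERAGE_FILE"] = f".coverage.worker{worker_id}"
    cmd = [
        "pytest",
        f"--cov={cov_directory}",
        "--cov-report", f"xml:coverage{worker_id}.xml",
        "--cov-append",
        *test_dirs,
    ]
    return cmd, env


def split_round_robin(items, n_workers):
    """Deal items out to n_workers lists (hypothetical splitting policy)."""
    return [items[i::n_workers] for i in range(n_workers)]


def run_job(job):
    """Run one pytest subprocess and return its exit code."""
    cmd, env = job
    # check=False: a failing test run should not abort the other workers.
    return subprocess.run(cmd, env=env, check=False).returncode


def run_all(cov_directory, test_dirs, n_workers=4):
    """Split the directories and run one pytest subprocess per thread."""
    chunks = split_round_robin(test_dirs, n_workers)
    jobs = [build_job(i, cov_directory, chunk)
            for i, chunk in enumerate(chunks) if chunk]
    with Pool(n_workers) as pool:
        return pool.map(run_job, jobs)
```

Keeping command construction in `build_job` separate from execution makes it possible to log exactly what each worker ran, which helps when comparing a run that lost lines against one that did not.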

To Reproduce

As noted above, the problem is not reliably reproducible; it happens intermittently.

Command used

  pytest --cov={cov_directory} --cov-report xml:coverage{worker_id}.xml  --cov-append {dir1} {dir2}..{dir3}
  where dir1 = {cov_directory}/dir1

We have the .coveragerc in {cov_directory}/

.coveragerc

  [run]

  omit = 
      *tests*
      *test_* 
      *_view.py 
      *__init__*

  concurrency = gevent

  [report]
  # Regexes for lines to exclude from consideration
  exclude_lines =
      # Don't complain if tests don't hit defensive assertion code:
      raise AssertionError
      raise NotImplementedError

      # Don't complain if non-runnable code isn't run:
      if __name__ == .__main__.:

      # We use main() function if running script
      def main..:
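The `exclude_lines` entries are regular expressions, and a source line is excluded when it contains a match for one of them. That is why `.` can stand in for either quote character in `.__main__.`. A small sketch that checks the two custom patterns with Python's `re` module directly; it is not coverage.py's own matching code, and `is_excluded` is a hypothetical helper:

```python
import re

# The two custom patterns from the [report] section above.
PATTERNS = [r"if __name__ == .__main__.:", r"def main..:"]


def is_excluded(line, patterns=PATTERNS):
    """True if any pattern is found anywhere in the line (partial match)."""
    return any(re.search(p, line) for p in patterns)


print(is_excluded('if __name__ == "__main__":'))  # True
print(is_excluded("if __name__ == '__main__':"))  # True
print(is_excluded("def main():"))                 # True
print(is_excluded("def main(argv):"))             # False: ".." is exactly two characters
```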

What version of Python are you using? -> Python 3.8.11

What version of coverage.py shows the problem?

-- sys -------------------------------------------------------
             coverage_version: 6.4.4
              coverage_module: /home/ec2-user/py3-virt/lib/python3.8/site-packages/coverage/__init__.py
                       tracer: -none-
                      CTracer: available
         plugins.file_tracers: -none-
          plugins.configurers: -none-
    plugins.context_switchers: -none-
            configs_attempted: .coveragerc
                 configs_read: /home/ec2-user/vscode/www/.coveragerc
                  config_file: /home/ec2-user/vscode/www/.coveragerc
              config_contents: b"[run]\n\nomit = \n    *tests*\n    *test_* \n    *_view.py \n    *__init__*\n\nconcurrency = gevent\n\n[report]\n# Regexes for lines to exclude from consideration\nexclude_lines =\n    # Don't complain if tests don't hit defensive assertion code:\n    raise AssertionError\n    raise NotImplementedError\n\n    # Don't complain if non-runnable code isn't run:\n    if __name__ == .__main__.:\n    \n    # We use main() function if running script\n    def main..:\n"
                    data_file: -none-
                       python: 3.8.11 (default, Oct 10 2021, 09:50:16) [GCC 7.5.0]
                     platform: Linux-5.4.0-1093-aws-x86_64-with-glibc2.27
               implementation: CPython
                   executable: /home/ec2-user/py3-virt/bin/python3.8
                 def_encoding: utf-8
                  fs_encoding: utf-8
                          pid: 23671
                          cwd: /home/ec2-user/vscode/www
                         path: /home/ec2-user/py3-virt/bin
                               /home/ec2-user/py3-virt/lib/python3.8/dist-packages
                               /home/ec2-user/py3-virt/lib/python3.8/site-packages
                               /home/ec2-user/vscode/www
                               /home/ec2-user/vscode/spark
                               /usr/local/lib/python38.zip
                               /usr/local/lib/python3.8
                               /usr/local/lib/python3.8/lib-dynload
                  environment: DB_USE_PYMYSQL = 1
                               HOME = /home/ec2-user
                               PYLINTRC = /home/ec2-user/vscode/.pylintrc
                               PYTHON3_ENV = 1
                               PYTHON3_ENV_VERSION = 3.8
                               PYTHONPATH = /home/ec2-user/py3-virt/lib/python3.8/dist-packages/:/home/ec2-user/py3-virt/lib/python3.8/site-packages/:.:/home/ec2-user/vscode/www:/home/ec2-user/vscode/spark
                               RUN_PYTESTS = 1
                 command_line: /home/ec2-user/py3-virt/bin/coverage debug sys
              sqlite3_version: 2.6.0
       sqlite3_sqlite_version: 3.22.0
           sqlite3_temp_store: 0
      sqlite3_compile_options: COMPILER=gcc-7.5.0, ENABLE_COLUMN_METADATA, ENABLE_DBSTAT_VTAB,
                               ENABLE_FTS3, ENABLE_FTS3_PARENTHESIS, ENABLE_FTS3_TOKENIZER, ENABLE_FTS4,
                               ENABLE_FTS5, ENABLE_JSON1, ENABLE_LOAD_EXTENSION, ENABLE_PREUPDATE_HOOK,
                               ENABLE_RTREE, ENABLE_SESSION, ENABLE_STMTVTAB, ENABLE_UNLOCK_NOTIFY,
                               ENABLE_UPDATE_DELETE_LIMIT, HAVE_ISNAN, LIKE_DOESNT_MATCH_BLOBS,
                               MAX_SCHEMA_RETRY=25, MAX_VARIABLE_NUMBER=250000, OMIT_LOOKASIDE,
                               SECURE_DELETE, SOUNDEX, TEMP_STORE=1, THREADSAFE=1

What versions of what packages do you have installed?

gevent==21.1.2
pytest==7.2.0
pytest-bdd==3.3.0
pytest-cov==2.9.0
pytest-expect==1.1.0
pytest-forked==1.4.0
pytest-html==1.22.1
pytest-json==0.4.0
pytest-json-report==1.4.1
pytest-metadata==2.0.4
pytest-mock==3.3.1
pytest-rerunfailures==10.2
pytest-sentry==0.1.9
pytest-timeout==2.1.0
pytest-xdist==1.31.0
coverage==6.4.4
coverage-enable-subprocess==1.0

Additional context

Each thread in the pool is given a unique COVERAGE_FILE environment variable so that the coverage data files do not collide.

We are also using --cov-append, which should aggregate the results.

nedbat commented 1 year ago

There's a lot of complexity here. There's no good answer to when coverage might skip lines. It could be that you are losing data when combining files? Other than that, I don't know what could be happening. If you can give me a way to run your test suite, I might be able to find out more.

apatanwal-eightfold commented 1 year ago

@nedbat Thanks for the reply. Please help me with these questions.

Doesn't the --append flag take care of not losing the data?
Is it possible to spew out some logs from coverage to understand what is happening behind the scenes?

I also performed a test in which I ran pytest over our codebase the way our CI framework does, and I could see that the result was different every time.

Run   Total    Covered   %
1     371730   113343    70
2     371730   113097    70
3     371730   113120    70
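To find which lines account for the difference between runs, the per-run XML reports can be compared directly. A sketch using only the standard library, assuming the Cobertura-style layout that `coverage xml` writes (`class` elements with a `filename` attribute, containing `line` elements with `number` and `hits` attributes). The function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def covered_lines(xml_text):
    """Map filename -> set of line numbers with hits > 0 in one XML report."""
    covered = {}
    root = ET.fromstring(xml_text)
    for cls in root.iter("class"):
        hit = covered.setdefault(cls.get("filename"), set())
        for line in cls.iter("line"):
            if int(line.get("hits", "0")) > 0:
                hit.add(int(line.get("number")))
    return covered


def diff_runs(run_a, run_b):
    """Return {filename: (only_in_a, only_in_b)} for files that differ."""
    diff = {}
    for name in sorted(set(run_a) | set(run_b)):
        a, b = run_a.get(name, set()), run_b.get(name, set())
        if a != b:
            diff[name] = (sorted(a - b), sorted(b - a))
    return diff
```

Reading the same worker's report from two runs, passing each file's text through `covered_lines`, and then calling `diff_runs` on the two results gives a short list of files and line numbers to investigate.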

Zeckie commented 1 year ago

Have you tried running coverage html for each run, and comparing the reports, to help drill down to the files / lines where the coverage differences are? Or have you looked into the coverage debug options?
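On the debug options: because pytest-cov starts coverage.py itself, environment variables are a convenient way to switch debugging on for a worker. coverage.py reads `COVERAGE_DEBUG` (a comma-separated list of debug options) and `COVERAGE_DEBUG_FILE` (where the output is written). A sketch, where the helper name and log path are hypothetical and the option names chosen here (`config`, `dataio`, `pid`, `sys`) should be checked against the documentation for the installed version:

```python
import os


def with_coverage_debug(env, options=("config", "dataio", "pid", "sys"),
                        log_path="coverage-debug.log"):
    """Return a copy of env with coverage.py debug output switched on."""
    new_env = dict(env)
    # Comma-separated debug options, read by coverage.py at startup.
    new_env["COVERAGE_DEBUG"] = ",".join(options)
    # Send debug output to a file instead of stderr.
    new_env["COVERAGE_DEBUG_FILE"] = log_path
    return new_env


# Hypothetical usage: one log file per worker, passed to subprocess via env=.
env = with_coverage_debug(os.environ, log_path="coverage-debug-worker0.log")
```

The `pid` option prefixes each debug line with the process id, which helps separate the output of concurrent workers if they end up writing to the same log.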

apatanwal-eightfold commented 1 year ago

@Zeckie Thanks for the insights. I compared the results, and it looks like some blocks are not getting hit, most probably because the Redis cache is not mocked in a few tests. I tested with coveragepy using parallel mode and with the cache disabled, and I was able to get consistent results for a few directories. However, using pytest-cov still gave inconsistent results.

Do we know why --append is disabled for parallel mode in coveragepy? Could it be the reason for the inconsistency in pytest-cov, given that I am using --cov-append?
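For reference, when coverage.py is used directly, parallel mode is paired with a combine step rather than with appending: each `coverage run --parallel-mode` writes its own uniquely suffixed data file, and `coverage combine` merges them afterwards. A sketch of that command sequence, assuming pytest is launched through `coverage run` instead of pytest-cov and that all workers share one working directory and the default data file name. The helper names are hypothetical.

```python
import subprocess


def parallel_commands(test_dirs_per_worker):
    """Build the commands for a parallel-mode run followed by one combine."""
    runs = [
        ["coverage", "run", "--parallel-mode", "-m", "pytest", *dirs]
        for dirs in test_dirs_per_worker
    ]
    finish = [
        ["coverage", "combine"],  # merge the per-process data files
        ["coverage", "xml", "-o", "coverage.xml"],
    ]
    return runs, finish


def run_parallel(test_dirs_per_worker):
    """Run all workers concurrently, then combine and report once."""
    runs, finish = parallel_commands(test_dirs_per_worker)
    procs = [subprocess.Popen(cmd) for cmd in runs]
    # Combining must wait until every worker has written its data file.
    for proc in procs:
        proc.wait()
    for cmd in finish:
        subprocess.run(cmd, check=True)
```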

nedbat commented 1 year ago

If the problem only happens when using pytest-cov, you should report an error to them: https://github.com/pytest-dev/pytest-cov

nedbat / coveragepy

Undeterministic Results by coveragepy #1533