nedbat / coveragepy

The code coverage tool for Python
https://coverage.readthedocs.io
Apache License 2.0

automating the subprocess measurement setup #367

Open nedbat opened 9 years ago

nedbat commented 9 years ago

Originally reported by Tibor (Bitbucket: tibor_arpas, GitHub: Unknown)


Subprocesses in test suites are a common occurrence. coverage.py has an excellent way to measure their coverage, but the set-up instructions are quite scary: http://nedbatchelder.com/code/coverage/subprocess.html

Also, in the era of tox, continuous integration, virtualization, and containerization, the manual process is quite beside the point.

What could we do to completely automate the described process?

There is an article by Ned http://nedbatchelder.com/blog/201001/running_code_at_python_startup.html

and a suggestion from @geoffbache

# sitecustomize.py
# Remove this shim from the path and the module cache, import the "real"
# sitecustomize it is shadowing, then restore the path entry.
import os, sys
currDir = os.path.dirname(__file__)
sys.path.remove(currDir)
del sys.modules["sitecustomize"]
try:
    import sitecustomize
finally:
    sys.path.insert(0, currDir)

That script fails on my machine (macOS, Python 2.7) with:


Traceback (most recent call last):
  File "/Users/tibor/.virtualenvs/tmon/bin/../lib/python2.7/site.py", line 703, in <module>
    main()
  File "/Users/tibor/.virtualenvs/tmon/bin/../lib/python2.7/site.py", line 694, in main
    execsitecustomize()
  File "/Users/tibor/.virtualenvs/tmon/bin/../lib/python2.7/site.py", line 548, in execsitecustomize
    import sitecustomize
  File "/Users/tibor/tmonworkspace/testmon/sitecustomize.py", line 8, in <module>
    sys.path.insert(0, currDir)
AttributeError: 'NoneType' object has no attribute 'path'

But I would hope that it's the best approach. (The failure is presumably because deleting the entry from sys.modules lets the original module object be garbage-collected, which on Python 2 clears its globals, so sys is None by the time the finally block runs.)


nedbat commented 9 years ago

Original comment by Buck Evan (Bitbucket: bukzor, GitHub: bukzor)


The cov-core solution doesn't uninstall cleanly at all. In fact it causes the whole virtualenv to explode (ImportError) if you uninstall it.

The coverage-pth solution is better, but doesn't handle setup.py develop (aka pip install -e).

The solution found in python-hunter handles all those cases, in addition to supporting easy-install. https://github.com/ionelmc/python-hunter/blob/master/setup.py

We've recently adapted this into a package that fills the same problem space as coverage-pth, but is more reliable.

IMO this should be a default feature of coveragepy. While it will add one if-statement to every subsequent Python process's startup time, that seems like a near-zero cost to me, and the additional use cases supported are sizeable.

nedbat commented 9 years ago

Original comment by Tibor (Bitbucket: tibor_arpas, GitHub: Unknown)


There is similar solution here: https://github.com/schlamar/pytest-cov/blob/772f2e210610354a8ffddc0187a23913ae56ca4e/cov-core/setup.py

nedbat commented 9 years ago

Original comment by Tibor (Bitbucket: tibor_arpas, GitHub: Unknown)


The problem I had is only with an --editable install of coveragepy. So yes, it's definitely a solution! Could you include it in coveragepy itself?

nedbat commented 9 years ago

Original comment by Tibor (Bitbucket: tibor_arpas, GitHub: Unknown)


Nice approach, I guess. I'm still getting ImportError: No module named 'coverage' from coverage's own setup.py (tox/pip somehow always installs the .pth file first and only then attempts to run coverage's setup.py, which fails because coverage isn't installed yet).

nedbat commented 9 years ago

Is this a solution? https://github.com/dougn/coverage_pth

glyph commented 6 years ago

I've been using https://pypi.org/project/coverage_enable_subprocess/ recently and it works very nicely.

glyph commented 6 years ago

The set-up instructions are also somewhat meandering. Should I try my hand at fixing the docs? It's really just 4 steps (a sketch follows the list):

  1. Write an explicit .coveragerc.
  2. Include parallel = true in it.
  3. Set COVERAGE_PROCESS_START to its absolute path (probably {toxinidir}/.coveragerc).
  4. Ensure it's imported at process startup (by installing the aforementioned https://pypi.org/project/coverage_enable_subprocess/).
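
A minimal sketch of those four steps, assuming a tox-style layout (the one-line .pth hook mentioned for step 4 is what coverage_enable_subprocess installs for you):

# .coveragerc  (steps 1 and 2)
[run]
parallel = true

Step 3 is then e.g. setenv = COVERAGE_PROCESS_START={toxinidir}/.coveragerc in tox.ini, and the step-4 hook boils down to a .pth file containing import coverage; coverage.process_startup().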

In my humble opinion, the simpler way to fix this would be to remove non-parallel mode from coverage entirely, and have it take care of steps 1 (possibly putting .coveragerc defaults in a temp dir), 3 (os.environ[...] = ...), and 4 (just ship the .pth file as part of coverage.py) automatically for you.

nedbat commented 6 years ago

@glyph tell me more about how "removing non-parallel mode entirely" would work? Wouldn't that mean that simple runs would always generate a numbered .coverage.xxx.yyy file?

glyph commented 6 years ago

@glyph tell me more about how "removing non-parallel mode entirely" would work? Wouldn't that mean that simple runs would always generate a numbered .coverage.xxx.yyy file?

That's what I'm suggesting, yes. Is there any particular reason not to do this?

nedbat commented 6 years ago

I don't understand what would happen to "coverage combine". Today, "coverage report" (or html, xml, etc.) only reports on data in the .coverage data file. Parallel files have to be combined first in order to be reported.

If we say everything is parallel from now on, does that mean that everyone has to run "coverage combine" before reporting? Or do the reporting commands all implicitly combine first? Running coverage twice in a row would give two distinct data files, then implicit combining would combine them together? That seems wrong.

glyph commented 6 years ago

Running coverage twice in a row would give two distinct data files, then implicit combining would combine them together? That seems wrong.

It's quite possible that my perspective is badly skewed by the fact that this is already the world I live in. I don't have any projects with only a single test environment, and once you have 2, you want to be able to speed up runs with tox -p auto, so parallel coverage is de rigueur even if your test suite itself has zero parallelism.

But let me take it as a given that coverage run thing1; coverage run thing2; coverage report reporting only on thing2 is desirable behavior. How do you get this with parallel coverage too? One strategy that occurs to me is that we add a discrete notion of a "coverage group", and give it its own identity. A coverage group would be a UUID with an optional descriptive label, and a timestamp. (I confess that "group" might be a terrible name for this concept, but, just to have something to call it...)

$ coverage group new --label "lol"
2e843587-a60c-4fab-bfaf-dab910437bb0
$ coverage run --group 2e843587-a60c-4fab-bfaf-dab910437bb0
$ coverage report --group 2e843587-a60c-4fab-bfaf-dab910437bb0
report for 'lol' ...

Without --group, coverage run implicitly creates a new group, and coverage report reports on the group with the latest creation timestamp.

You could then have coverage group --list to give you a nicely presented list of timestamps, IDs, and labels. (Groups would presumably be files with a naming convention, like .covgroup.$UUID, with the timestamp and label in a very simple format inside them.)

To run a test suite that might run some subprocesses, you could then do, say:

$ coverage spawn tox -p auto
| / - \ | etc

Like run, this implicitly makes a new group, then runs tox -p auto as a subprocess with the coverage group ID as part of its environment.

$ coverage report

This reports on the group created by spawn, since it is the group with the most recent timestamp in the current directory.

Another advantage of this approach is that you could run multiple tests and report on their combined coverage at the same time in the same project, and you wouldn't need any special locking; .coverage files could be named .coverage.$GROUP.$HOST.$PID and commands like report could combine just the relevant files.
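
A sketch of that hypothetical convention (the .covgroup file name, its contents, and these helpers are all invented here for illustration; none of this is an existing coverage.py feature):

# Hypothetical "coverage group" bookkeeping, as described above.
import glob
import time
import uuid

def new_group(label=""):
    group_id = str(uuid.uuid4())
    with open(".covgroup.%s" % group_id, "w") as f:
        f.write("%f\n%s\n" % (time.time(), label))  # timestamp, then label
    return group_id

def latest_group():
    # "coverage report" without --group would pick the newest group file.
    def created(path):
        with open(path) as f:
            return float(f.readline())
    files = glob.glob(".covgroup.*")
    if not files:
        return None
    return max(files, key=created).split(".covgroup.", 1)[1]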

meejah commented 5 years ago

I don't understand what would happen to "coverage combine". Today, "coverage report" (or html, xml, etc.) only reports on data in the .coverage data file. Parallel files have to be combined first in order to be reported.

Further to what @glyph suggests, what if "coverage combine" (and/or just "read all the .coverage.* files") was the default, so "coverage report" would always report on "all the coverage data right here"? That is, both "not-parallel mode" and "coverage combine" go away.

(I might also be biased, because the only time I don't do coverage combine is when I forget). I agree with the other comment that it would be better to have an explicit way to name "chunks of coverage" (if that's a feature people use).

omry commented 4 years ago

Any thoughts about inverting things a bit? Instead of each process writing a file, the parent process can start a local server. Each subprocess can send its coverage data to it (discovery can be based on an environment variable set by the parent process), and the parent can then dump a single unified file.

This does not address the "how do we get the subprocess to do things" problem, but it does address the difficulty introduced by having two modes that leave different files behind.

Thoughts?
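
For what it's worth, a minimal sketch of that inversion, assuming a made-up COVERAGE_COLLECTOR_ADDR variable for discovery; the parent stores whatever bytes each child sends (its coverage data file) as a parallel-style file that a later combine can merge:

# Hypothetical parent-side collector; not a coverage.py feature.
import os
import socketserver

class CoverageCollector(socketserver.StreamRequestHandler):
    counter = 0

    def handle(self):
        payload = self.rfile.read()  # child sends its data file, then closes
        CoverageCollector.counter += 1
        with open(".coverage.collected.%d" % CoverageCollector.counter, "wb") as f:
            f.write(payload)

server = socketserver.TCPServer(("127.0.0.1", 0), CoverageCollector)
# Children discover the collector through this (invented) environment variable.
os.environ["COVERAGE_COLLECTOR_ADDR"] = "127.0.0.1:%d" % server.server_address[1]
server.serve_forever()  # in practice, run in a background thread before spawning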

By the way: there is an issue to remove .pth support, you might want to chime in.

wsanchez commented 3 years ago

@glyph I think you can achieve your coverage groups by setting COVERAGE_FILE to .../coverage.groupname and using parallel mode.

What I just did in Klein's tox config is to set COVERAGE_FILE={toxworkdir}/coverage.{envname}, which (with parallel mode turned on) will generate a bunch of {toxworkdir}/coverage.{envname}.XXXX files that I combine after the test run. I can then set COVERAGE_FILE={toxworkdir}/coverage later to get a combined coverage file for all test runs.

nedbat commented 1 year ago

This is related to (a duplicate of) #378.

nedbat commented 1 year ago

Now I think the best approach is not to install a .pth file when coverage.py is installed at all, but instead to create the .pth file when coverage measurement starts, and remove it when measurement ends.

My theory is we only want to measure subprocesses when some "main" process (pytest?) is under coverage measurement itself. So when the main process begins coverage, it can configure the mechanism for subprocess measurement.

Is there some reason that wouldn't work?
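
For concreteness, a hypothetical sketch of that flow (the helper names are invented; note that site.getsitepackages is missing in environments created by old versions of virtualenv):

# Write the hook when measurement starts; remove it when it ends.
import os
import site

def install_subprocess_hook():
    candidates = site.getsitepackages() + [site.getusersitepackages()]
    for d in candidates:
        if os.path.isdir(d) and os.access(d, os.W_OK):
            path = os.path.join(d, "_coverage_subprocess.pth")
            with open(path, "w") as f:
                f.write("import coverage; coverage.process_startup()\n")
            return path
    return None  # nowhere writable: warn and continue without subprocess data

def remove_subprocess_hook(path):
    if path and os.path.exists(path):
        os.remove(path)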

glyph commented 1 year ago

Now I think the best approach is not to install a .pth file when coverage.py is installed at all, but instead to create the .pth file when coverage measurement starts, and remove it when measurement ends.

This is a really obscure and nuanced point about packaging, but in general test suites should not assume that they can write to a sitedir. We have bumped into this problem a few times with Twisted, where sometimes you're testing an installed / built artifact, and not a virtualenv, but still want to have coverage measurement.

For example: let's say you're building a mac app (my favorite weird-build-use-case, everything's always broken here). It's broken in production on users' machines but working in test. You want to ship an instrumented build and collect coverage information. The app bundle will be installed to /Applications with an administrator password and should not be writable by the user or its code-signing will break. So the .pth file cannot be written in this context.

There are equivalent situations when shipping e.g. a Debian package, or even a Docker container that installs as root but runs tests as a restricted user, but they're a lot harder to describe and there are squirrelly workarounds that might partially fix the problem, so this one works as a good example of a place where there's a hard limit on such a thing.

nedbat commented 1 year ago

in general test suites should not assume that they can write to a sitedir.

I hear you, and I understand about the variety of environments. I assume the code will have to handle the case of not being able to find a writable directory in sys.path, and will simply have to issue a warning about that. It's got to be an improvement over what we have now, which is a page explaining how to make your own .pth files, right?

glyph commented 1 year ago

I hear you, and I understand about the variety of environments. I assume the code will have to handle the case of not being able to find a writable directory in sys.path,

Note that .pth files are not processed on sys.path, but on the specific list of paths passed to site.addsitedir which includes site-packages and user-site but notably not anything on PYTHONPATH, so you need to be careful and selective about which specific paths you use. Ideally you want to detect whether user site packages are enabled, whether the specific interpreter context you're executing in will have them enabled or disabled in subprocesses, whether you're in a virtualenv or not… there's a ton of logic here just to guess what might be a good spot.
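
For illustration, the stdlib site module will tell you which directories it scans, though as noted this is only a starting point for the guessing (and environments built by old versions of virtualenv don't provide site.getsitepackages at all):

# Where would .pth files actually be processed for this interpreter?
import site

print(site.getsitepackages())      # global / virtualenv site-packages dirs
print(site.getusersitepackages())  # per-user site-packages dir
print(site.ENABLE_USER_SITE)       # whether user site-packages is enabled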

and will simply have to issue a warning about that. It's got to be an improvement over what we have now, which is a page explaining how to make your own .pth files, right?

Sure, this would definitely be an improvement, but given this disadvantage, I don't quite see the benefit of adding the complexity of dynamically writing something at runtime rather than just shipping the .pth file at install time. The install-time .pth file can still have a conditional in it so that nothing gets loaded if not necessary.

Another issue you'll have to grapple with is that overlapping coverage runs (which, previously, would have been able to coexist to some extent with -p) will now be installing .pth files into shared directories, potentially executing the relevant code more times than expected.

meejah commented 1 year ago

I don't know how to express this as a concrete proposal to move forward, but a struggle I've often had with subprocess coverage is that one ends up "having" to implement custom code in the application to facilitate coverage (whether that's a command-line option, environment variables, etc.). The struggle here is that if an application spawns subprocesses but doesn't have specialized knowledge of coverage.py, some of those subprocesses won't ever get coverage.

So if you're going to put custom "coverage" code into an application, it might as well use coverage.* APIs directly (rather than also learning the right environment or atexit dances).

That is, applications are going to end up with a --coverage option (or similar) in their subprocesses. So a simple and reliable Python API is maybe all that's required here. (There are also some issues with atexit in general that could perhaps benefit here too.)

glyph commented 1 year ago

@meejah a .pth file contains code that every Python process executes at startup, so the existing solution works fine without requiring applications to do this; I think the discussion here is mostly about how to execute that, rather than about whether to give up on it?

nedbat commented 1 year ago

The conditional in the .pth file (actually in a function it calls) is about an environment variable. There's no guarantee I can set an environment variable and have it carried over into subprocesses. So every choice has a chance of failure, and needs to try to determine if it failed. Which approach has the higher likelihood of succeeding?

Is it weird to try a combined approach? Write a .pth at install time, and at run time, update the .pth with specific details (where the rc file is, and whether to measure).

meejah commented 1 year ago

The conditional in the .pth file (actually in a function it calls) is about an environment variable.

To @glyph's point as well, the above is still a problem: now I have to arrange for an environment-variable to be passed on to the subprocess, too. If it doesn't work, did I do that wrong? Did the .pth code not run? Something else?

So I think what I'm saying is: the only places I've gotten subprocess-using applications to produce coverage properly is when I've put custom code in them. So just give me reliable "start coverage now" and "end coverage now" APIs, and I'll put those in my applications (and then hopefully I don't have to debug .pth files, environment-variables and atexit minutiae all at once).

glyph commented 1 year ago

The conditional in the .pth file (actually in a function it calls) is about an environment variable. There's no guarantee I can set an environment variable and have it carried over into subprocesses. So every choice has a chance of failure, and needs to try to determine if it failed. Which approach has the higher likelihood of succeeding?

Personally, my experience is that environment variables are more likely to succeed. In principle I guessed that this might cause issues, but in practice after using it on many projects over many years, it's mostly been set-it-and-forget-it.

Allowlist environment filtering does happen and is unpleasant to debug, but also, allowlist filtering is basically always a bug and it's easy to recommend that tools get rid of it. Like, did you remember to inherit TMPDIR, XPC_FLAGS, XPC_SERVICE_NAME, LaunchInstanceID, and __CFBundleIdentifier on macOS, just in case one of the platform libraries that libc talks to might need one of those for some reason? Whereas "making your site-packages read-only" is not a bug, and probably in fact a good operational practice.

Is it weird to try a combined approach? Write a .pth at install time, and at run time, update the .pth with specific details (where the rc file is, and whether to measure).

Given the possibility of multiple possible parallel coverage runs, I don't think there's any value to updating the .pth file when the .pth file can easily check other filesystem locations, if you want to avoid env vars as a communication channel. You could, for example, have a parent process write to a path in f"~/.python-coverage/{os.getpid()}" and then have child processes use os.getppid() to determine where to read from.
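
A sketch of that filesystem channel as described (the directory layout is invented; expanduser is needed because an f-string won't expand "~" by itself):

# Hypothetical pid-keyed handoff between parent and child processes.
import os

BASE = os.path.expanduser("~/.python-coverage")

def parent_announce(rcfile):
    # Parent: record the rc-file location under its own pid before spawning.
    os.makedirs(BASE, exist_ok=True)
    with open(os.path.join(BASE, str(os.getpid())), "w") as f:
        f.write(rcfile)

def child_discover():
    # Child (e.g. from a .pth hook): look up the parent's entry, if any.
    path = os.path.join(BASE, str(os.getppid()))
    if os.path.exists(path):
        with open(path) as f:
            return f.read()
    return None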

My estimation is that this is an unnecessary fallback, and maybe not the best one; env vars should work well enough that I don't think this would be required. But as much as I know about weird deployment environments, coverage obviously has a distribution footprint way bigger than anything I've worked on, so I could see that everything has to be triple-redundant and an API like this might be a way to accomplish that.

glyph commented 1 year ago

So I think what I'm saying is: the only places I've gotten subprocess-using applications to produce coverage properly is when I've put custom code in them. So just give me reliable "start coverage now" and "end coverage now" APIs, and I'll put those in my applications (and then hopefully I don't have to debug .pth files, environment-variables and atexit minutiae all at once).

These APIs already exist. Are you just saying that the status quo is better (less fragile, etc) than any of these solutions?

meejah commented 1 year ago

@glyph, yeah I think that is what I'm saying.

To be clear, that means the "good advice" for programs wanting subprocess coverage is to include some custom code in them (triggered however you like) that does cov = coverage.Coverage() ; cov.start() on startup and arranges for cov.stop() ; cov.save() to happen on shutdown.
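
A minimal sketch of that pattern using the public coverage API (the trigger, e.g. a --coverage flag, is the application's own choice; data_suffix=True makes coverage write a parallel-style .coverage.* file so many subprocesses can be combined later):

# Explicit in-application measurement, started however the app decides.
import atexit
import coverage

cov = coverage.Coverage(data_suffix=True)
cov.start()

def _save_coverage():
    cov.stop()
    cov.save()

atexit.register(_save_coverage)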

nedbat commented 1 year ago

I appreciate all of the advice, and the depth of (scarred) experience that informs these comments. I wish there were something to do though, when I get issues like #1341.

glyph commented 1 year ago

@meejah

yeah I think that is what I'm saying.

Respectfully, I disagree. Any one of the solutions described here will still occasionally lead to a bad outcome where you'll need to debug something weird, but like 99.9% of the time they'll work just fine. In the remaining .1% of cases presumably the explicit API will remain.