passed reruns do not cache correctly?

gotestyourself / gotestsum

'go test' runner with output optimized for humans, JUnit XML for CI integration, and a summary of the test results.

Apache License 2.0

2.03k stars 119 forks source link

passed reruns do not cache correctly? #351

Open nfi-hashicorp opened 1 year ago

nfi-hashicorp commented 1 year ago

This is kind of surprising behavior, unless you know details about how Go does test caching.

When using gotestsum --rerun-fails, if the first run fails but the package ultimately passes, that result doesn't really get cached.

This is because go test doesn't treat -run specially. It doesn't cache the result at a function level at all. I did more investigation and a repro here: https://github.com/nfi-hashicorp/gotestsum-rerun-caching

I'm not sure there's anything to be done about this?

dnephin commented 1 year ago

Ya, that is not surprising to me given the description of the test cache (https://pkg.go.dev/cmd/go/internal/test, specifically the paragraph that descries "list mode", and the following one that explains "go test caches successful package test results"). The -run flag is included in the cache key. It does sound like you could cache individual tests with -run, but then you'd have to run them all individually the next time. The cost of running them individually probably negates the benefit of caching in most cases, but it might be worth trying.

I'm assuming you're working on this for Consul. From my experience Consul would benefit significantly from splitting its two largest packages (agent/consul, and agent if I remember correctly) into smaller packages. That would allow more tests to run in parallel and also make better use of the package cache.

dnephin commented 1 year ago

This bit is interesting as well

Tests that [...] consult environment variables only match future runs in which the files and environment variables are unchanged.

In CI it's unlikely the test cache will be useful by default, because the environment variables set by the CI system will be different on each run (ex: build number). Presumably most large projects read environment variables in one place or another.

You would have to run go test with env -i (https://man7.org/linux/man-pages/man1/env.1.html) and be explicit about which environment variables to pass to go test.

nfi-hashicorp commented 1 year ago

I'm assuming you're working on this for Consul. From my experience Consul would benefit significantly from splitting its two largest packages (agent/consul, and agent if I remember correctly) into smaller packages. That would allow more tests to run in parallel and also make better use of the package cache.

I do agree it is looking to be a pragmatic idea. I'm struggling to get all the unit tests to pass in the same run at the moment. 😅

nfi-hashicorp commented 1 year ago

The cost of running them individually probably negates the benefit of caching in most cases, but it might be worth trying.

Yeah, then I'd have to run them in parallel, all the runner flags, etc. Honestly I'm almost at the point of cloning the caching logic and hacking it into gotestsum somehow. :P

nfi-hashicorp commented 1 year ago

In CI it's unlikely the test cache will be useful by default, because the environment variables set by the CI system will be different on each run (ex: build number). Presumably most large projects read environment variables in one place or another.

I could be wrong but I believe it hooks os.Getenv and only cares about the env vars (and files) that your tests access. That said, there's os.Environ() that gives you the whole shebang so :shrug:. I am definitely getting some caching happening in Github Actions atm

dnephin commented 1 year ago

Cool, maybe it is smart enough to only consider the env vars that are accessed as long as os.Environ is not called. I would expect env vars from the code under test to be significant to caching as well. I don't think any program would really be able to distinguish "test code" from "production code" when its all compiled into the same binary.

Since each package is run as a separate binary, maybe there are enough packages that don't access environment variables that it works more often then I expected.