Open firelizzard18 opened 1 year ago
I updated the description to reflect the fact that `go test`'s `-exec` flag makes this much easier than the kludge I had put together. This is what I'm using now:
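(The exact script isn't shown here, but the basic shape of combining `-exec` with rr is roughly this; a sketch, assuming `rr` is installed and on `PATH`:)

```shell
# `go test -exec` wraps the compiled test binary with the given command,
# so rr records the test binary's execution rather than `go test` itself.
go test -exec "rr record" ./pkg/foo/bar

# Replay the most recent recording interactively (gdb-style session).
rr replay
```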
Thank you for suggesting this feature! `rr` looks very useful. The gotestsum `--watch` mode (docs) has a key shortcut for rerunning the failed package and dropping into delve. That integration with `rr` seems like it would be easy because watch mode already runs a single package at a time. `d` could drop into either `rr` or `delve`, but that is missing one piece: recording the original execution before dropping into `rr replay`.
How would you generally use this? Would you run the entire test suite of a project, then use `rr` to replay any failures? Would you use it in CI and save the trace files as artifacts? I see under limitations that it emulates a single core, so I imagine if you did run it in CI you'd want to do that as a completely separate job and not as the primary CI test run. Any interesting CI failures could probably be reproduced locally by rerunning that package individually a large number of times. That might be easier than duplicating the test runs and saving many trace artifacts for the rare chance you need the debugger.
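(For concreteness, rerunning a single package many times locally is cheap; a sketch, assuming rr's default trace directory of `~/.local/share/rr`:)

```shell
# Re-run one package's tests 100 times; -count also bypasses the
# test result cache, and -failfast stops at the first failure.
go test -count=100 -failfast ./pkg/under/test

# Or record each run under rr, stopping as soon as a run fails so
# the most recent trace is the failing one.
for i in $(seq 1 100); do
  go test -count=1 -exec "rr record" ./pkg/under/test || break
done
rr replay  # replays the most recent (failing) trace
```

Traces for the passing runs pile up in rr's trace directory, so they would need pruning, which is exactly the file-management problem discussed below.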
Maybe this could be integrated with a `--record` flag. In that mode `gotestsum` would list the packages to run, run each individually with `-exec rr record ...`, and save the trace files. When used with `--watch`, that could indicate that `d` should use `rr replay` instead of `dlv`.
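(Under the hood, that mode would amount to something like the following; a sketch of the proposal, not an existing gotestsum flag:)

```shell
# For each package that actually contains test files, record an
# individual `go test` run with rr, one trace per package.
for pkg in $(go list -f '{{if .TestGoFiles}}{{.ImportPath}}{{end}}' ./...); do
  go test -exec "rr record" "$pkg"
done
```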
In both cases, saving the file under the project directory sounds great. We could include a date and time in the filename instead of deleting it each time, but that would produce a lot of files. Is there any reason to keep record files for successful runs? If we only keep the traces for packages with test failures, that might help mitigate the problem of too many files. If someone were trying to track down flaky behaviour, they could run the tests in a loop many times and end up with only a few files for the failed runs. Maybe `--record=all` could be used to keep the successful runs in the rare case they are needed, and the default value for `--record` would be `failures`.
How would that work for your usage of `rr`?
> How would you generally use this? Would you run the entire test suite of a project then use rr to replay any failures?
My motivation is flaky tests that rarely fail when run on a developer's computer but sporadically fail in CI. I've had these kinds of failures show up periodically, and they're a huge pain to debug because they're often almost impossible to reproduce reliably. When one of these tests fails, I want to download the trace and step through it to debug the failure.
> I see under limitations that it emulates a single core, so I imagine if you did run it in CI you'd want to do that as a completely separate job and not the primary CI test run.
The way that limitation is phrased is a bit misleading, I think. I also assumed at first that rr would be useless for reproducing concurrency issues. A better way of putting it is that only one instruction is executing at any given time, but the code can still be concurrent. It is true that I've had trouble reproducing concurrency issues when running in normal mode. However, rr also has a 'chaos mode', which is very helpful for reproducing races.
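(Chaos mode is a flag on `rr record`, so it composes directly with `-exec`; a sketch:)

```shell
# --chaos randomizes scheduling decisions during recording, which
# surfaces races that rarely reproduce under rr's default scheduling.
go test -exec "rr record --chaos" ./pkg/under/test
```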
> Is there any reason to keep record files for successful runs?
I only have a use for records of failed runs. I could invent a reason for keeping the record of a successful run, but I think in the end that would essentially mean a test was succeeding when it really should have been failing.
What you suggest would work perfectly for me, especially with a flag for chaos mode (or just a pass-through such as `-rrflags`). It would be particularly helpful if I could selectively enable chaos mode for specific packages (via a config file?). I'm working on distributed systems and I think my simulator has numerous concurrency bugs, so being able to selectively enable chaos mode would allow me to work on those bugs incrementally.
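(One low-tech way to get per-package chaos mode today, with no gotestsum support; `chaos-packages.txt` is a made-up file name for this sketch:)

```shell
# chaos-packages.txt lists one import path per line that should run
# under chaos mode; everything else records normally.
for pkg in $(go list -f '{{if .TestGoFiles}}{{.ImportPath}}{{end}}' ./...); do
  if grep -qxF "$pkg" chaos-packages.txt; then
    go test -exec "rr record --chaos" "$pkg"
  else
    go test -exec "rr record" "$pkg"
  fi
done
```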
It would be amazing if gotestsum supported running tests using rr. I've hacked something together to use with `--raw-command`, but it's ugly. I am imagining something like `gotestsum --engine rr ./...` (with the default being `--engine go`), which would:

- use `go list -f '{{if .TestGoFiles}}{{.ImportPath}}{{end}}' $1` to list only the packages containing test files
- add `-exec 'rr record -o $TRACE'` to the `go test` invocation, so that the trace for `pkg/foo/bar` will be written to `~/.local/share/rr/bar.test-0`
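(The `--raw-command` hack might look roughly like this; `rr-test.sh` is a made-up wrapper name, and the script must emit `go test -json` output so gotestsum can parse it:)

```shell
#!/bin/sh
# rr-test.sh: run each test package under rr while emitting test2json
# output for gotestsum. Invoked as:
#   gotestsum --raw-command -- ./rr-test.sh ./...
for pkg in $(go list -f '{{if .TestGoFiles}}{{.ImportPath}}{{end}}' "$1"); do
  go test -json -exec "rr record" "$pkg"
done
```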