Originally, a flaky test would appear in the report only by its example id, which is not helpful at all.
Then we merged https://github.com/skroutz/rspecq/commit/9f6816e4f82f6d6da6952e552b2bc0ad777f0d36 which tried to replace the flaky example id with a reproduction command. However, the command did not work because its premise was wrong: to correctly reproduce a flaky test you need more than the seed and the test file. You also need all the files that preceded the flaky test, back to the point in time when rspec and the databases were last reset. For rspecq, this point in time is the start of a worker.
Both of the above issues are now mitigated:
Flaky tests are reported by their example location. This does not help reproduce them by itself, but at least it shows the developer exactly which test is flaky.
A reproduction command is provided for each flaky test, which is the gist of this change.
The reproduction (or rerun) command consists of the seed and the files originally tested by the worker when the test failed.
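As a rough sketch of how such a command could be assembled from the seed and the worker's ordered job list: the helper name `reproduction_command`, the `bin/rspecq` path, the default build and worker ids, and the exact flag spellings are assumptions made for illustration, not necessarily what rspecq's reporter does.

```ruby
require "shellwords"

# Hypothetical helper: builds a rerun command from the seed and the ordered
# list of jobs (files or individual examples) that the worker processed.
# Flag names are assumed here; check them against your rspecq version.
def reproduction_command(seed:, jobs:, build: "1", worker: "foo")
  args = [
    "bin/rspecq",
    "--build", build,
    "--worker", worker,
    "--seed", seed.to_s,
    "--reproduction",
    *jobs # order matters: keep the jobs exactly as the worker ran them
  ]
  # Shell-escape every argument so example ids with brackets survive a paste.
  Shellwords.join(args)
end

jobs = ["./spec/a_spec.rb", "./spec/b_spec.rb", "./spec/foo_spec.rb[1:20:5:1]"]
puts reproduction_command(seed: 1234, jobs: jobs)
```

The job list is passed through untouched because, as explained above, the files that ran before the flaky test are part of what makes the failure reproducible.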
A note regarding the reproduction command
With RSpec
Passing the files and the individual examples straight to rspec does not work, for two reasons:
First, when you pass both files and individual examples, rspec ignores the files. I tried to debug this, and rspec's code is not easy to follow, believe me. Although this seems a bit buggy to me, it does not really matter because of the second reason.
Second, rspec groups individual examples coming from the same file. For example, if the command lists bar_spec:4 and bar_spec:6, rspec executes them together because they are defined in the same example group.
Together, these render this option completely unusable, because we want to run the examples in the exact same order as they ran in the rspecq worker in the CI.
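The effect of the grouping can be illustrated with a toy model. The snippet below is not rspec's code; it only mimics the behaviour described above, where examples from the same file end up running together. The spec names and line numbers are made up.

```ruby
# Toy model (not rspec internals): examples that come from the same file are
# grouped together, so the order requested on the command line is lost.
requested = ["foo_spec.rb:3", "bar_spec.rb:4", "baz_spec.rb:9", "bar_spec.rb:6"]

# Group by file name, keeping the position of each file's first appearance.
grouped = requested.group_by { |job| job.split(":").first }.values.flatten

puts "requested: #{requested.join(' ')}"
puts "executed:  #{grouped.join(' ')}"
```

In this model bar_spec.rb:6 jumps ahead of baz_spec.rb:9, which is exactly the kind of reordering that can hide an order-dependent failure.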
With RSpecQ
Passing the same arguments to rspecq does not work either, because rspecq uses rspec to group examples by file and then intentionally shuffles the files.
With RSpecQ and the new reproduction flag
We introduce an update to rspecq with which a developer can pass examples and files to rspecq and always get the same result. Internally, rspecq simply publishes the files and examples exactly as given in the command.
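A minimal sketch of the difference in publishing behaviour, assuming a hypothetical `jobs_to_publish` helper and a `reproduction` keyword that stands in for the new flag. The real queue lives in Redis; here a plain array stands in for it, and the file-collapsing regex is only an approximation of how examples map back to files.

```ruby
# Hypothetical stand-in for the worker's publishing step (not rspecq's code).
# In reproduction mode the jobs are queued exactly as given on the command
# line; otherwise examples are collapsed to their files and shuffled.
def jobs_to_publish(jobs, reproduction:, random: Random.new)
  return jobs.dup if reproduction

  # Strip a trailing ":LINE" or "[1:2:3]" example id to get the file path.
  files = jobs.map { |job| job.sub(/(:\d+|\[[\d:]+\])\z/, "") }.uniq
  files.shuffle(random: random)
end

given = ["./spec/a_spec.rb", "./spec/bar_spec.rb:4", "./spec/b_spec.rb", "./spec/bar_spec.rb:6"]

puts jobs_to_publish(given, reproduction: true).inspect
puts jobs_to_publish(given, reproduction: false).inspect
```

Only the first call preserves both the individual examples and their order, which is what makes the rerun deterministic.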
The command is included in the report under the flaky tests section.
Its form is not the prettiest, but that is not something we can avoid. Also, depending on one's needs, it may need tuning. However, most users should be able to use it seamlessly.
The report looks like:
Flaky jobs detected (count=1):
./spec/foo_spec.rb[1:20:5:1]
and will now look like:
A developer can run the command above with any build id and worker id and will reproduce the failure!