Closed sfc-gh-satherton closed 3 years ago
20210311-095708-almiller-c56e059102618f3e compressed=True data_size=166925824 fail_fast=10 max_runs=1 priority=100 remaining=not_started runtime=0:02:43 sanity=False started=1075 submitted=20210311-095708 timeout=5400 username=almiller
max_runs=1 started=1075
This is a wee bit overkill
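To put a number on the job line above, here is a minimal sketch that parses it and reports how many runs were started beyond `max_runs`. It assumes the whitespace-separated `key=value` layout shown in that line. The helper names `parse_job_line` and `extra_starts` are hypothetical and not part of joshua.

```python
def parse_job_line(line: str) -> dict[str, str]:
    """Split a job summary line into its job id and key=value fields."""
    job_id, *pairs = line.split()
    fields = {"id": job_id}
    for pair in pairs:
        key, _, value = pair.partition("=")
        fields[key] = value
    return fields


def extra_starts(fields: dict[str, str]) -> int:
    """Runs started beyond the configured limit (0 if the limit was not exceeded)."""
    return max(0, int(fields["started"]) - int(fields["max_runs"]))


# The job line quoted above, copied verbatim.
line = (
    "20210311-095708-almiller-c56e059102618f3e compressed=True "
    "data_size=166925824 fail_fast=10 max_runs=1 priority=100 "
    "remaining=not_started runtime=0:02:43 sanity=False started=1075 "
    "submitted=20210311-095708 timeout=5400 username=almiller"
)
job = parse_job_line(line)
print(job["id"], extra_starts(job))
```

For this line the difference is 1075 - 1 = 1074 extra starts. Whether the overrun percentages quoted in this issue are measured against started or completed runs is not stated, so treat this as one possible way to measure it.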
Currently, a correctness job is stopped when its observed completion count reaches `max_runs`. This logic means there are usually other tests still running at that point, launched before the completion count hit the limit. Successes after the job is in a "stopped" state are still tallied later, though I'm not sure if errors are.

Automation around joshua usually assumes that a stopped job's result stats are final. One way this happens is using `joshua tail` to detect when the bundle has completed, and then querying and reporting stats for the job id at that time, in which case the reported counts do not include the results of the tests which are still running.

Running `joshua list --stopped` later will provide more up-to-date results, which include counts of at least the successful tests that completed after the bundle was placed in a stopped state. I am not certain if errors are tallied if the job is in a stopped state when the error is detected.

Overrun is as much as 15% for a 10k run limit, or as much as 90% for a 1k run limit. Some recent examples: