pawiecz opened 4 months ago
Suggested setup: a service running bisects keeps a bare checkout for each bisect (it can use local clones from a master repo, giving only a tiny disk space overhead). When setting up, it can use `git rev-list working..broken` to get the list of commits covered by the bisect and pull any existing results from the database to narrow the bisect down further.
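A minimal sketch of that setup step might look like the following - the `lookup_result` callable and the repository layout are illustrative placeholders, only the git commands come from the description above:

```python
import subprocess
from pathlib import Path


def setup_bisect(master_repo, workdir, working, broken, lookup_result):
    """Prepare a dedicated clone for one bisect and collect the commits it covers.

    `lookup_result` is a caller-supplied callable (e.g. a results-database
    query) returning an existing result for a commit, or None.
    """
    repo = Path(workdir) / "bisect.git"
    if not repo.exists():
        # A local clone shares objects with the master repo, so the extra
        # disk space per bisect stays tiny.
        subprocess.run(["git", "clone", "--bare", master_repo, str(repo)],
                       check=True)

    # All commits covered by this bisect (newest first).
    rev_list = subprocess.run(
        ["git", "-C", str(repo), "rev-list", f"{working}..{broken}"],
        check=True, capture_output=True, text=True,
    ).stdout.split()

    # Pull any results we already have so the bisect starts from a
    # narrower range.
    known = {c: r for c in rev_list if (r := lookup_result(c)) is not None}
    return rev_list, known
```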
Then my idea for a bisect step was to check for an existing build and, if there isn't one, ask for a new one. Once binaries are available a test job can be generated - if we keep track of test jobs well enough then we can share jobs between multiple bisects, so if multiple tests in a single test job (e.g. a single kselftest suite) fail then we can bisect them in parallel and share jobs up until the point where the bisects diverge (if they do). The job sharing can be really helpful when multiple tests break.
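A rough sketch of that per-step logic, assuming a hypothetical lab API wrapper with `find_build`/`request_build`/`submit_test_job` helpers (none of these are Maestro's real interface):

```python
def bisect_step(commit, test, shared_jobs, api):
    """One bisect step: make sure binaries exist, then reuse or create the
    test job.  `api` and its methods are illustrative only."""
    build = api.find_build(commit)
    if build is None:
        # No binaries yet: request a build and retry this step once it lands.
        api.request_build(commit)
        return None

    # A stable name derived from the commit lets concurrent bisects that
    # reach the same commit share one test job instead of resubmitting.
    name = f"bisect-{test}-{commit[:12]}"
    if name not in shared_jobs:
        shared_jobs[name] = api.submit_test_job(build, test, name=name)
    return shared_jobs[name]
```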
@broonie that is interesting, but the job sharing feels a bit too elaborate for a first MVP. Another idea in that direction is that we could try to find a build that is close enough to the current bisection step, e.g. if we have a build for a commit 5 commits ahead of our step, we could use that right away instead. I hope git bisect can cope with that as well.
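For what it's worth, `git bisect good` and `git bisect bad` accept an explicit commit argument, so a result for a nearby commit that is still inside the range can be fed back directly and git will pick the next step from the narrowed range. A minimal sketch:

```python
import subprocess


def feed_result(repo, commit, passed):
    """Report a result for any commit in the bisect range, not just the one
    git suggested; git recomputes the next step from the narrowed range."""
    verdict = "good" if passed else "bad"
    subprocess.run(["git", "-C", repo, "bisect", verdict, commit], check=True)
```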
I would've thought that about the job sharing too - it came up when I was implementing my own bisection because I realised I was generating stable names for the jobs based on the commit IDs of the commits being tested so I could just shove them in a simple table and look there to see if the job already existed before resubmitting it.
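As a sketch of that "simple table" approach - the naming scheme and the `submit` callable are illustrative, not the actual implementation:

```python
import sqlite3

db = sqlite3.connect("bisect-jobs.db")
db.execute("CREATE TABLE IF NOT EXISTS jobs (name TEXT PRIMARY KEY, job_id TEXT)")


def get_or_submit(test, commit, submit):
    """Return the existing job for this (test, commit) pair, or submit a new
    one.  `submit` is whatever actually talks to the lab/scheduler."""
    name = f"{test}-{commit}"          # stable name based on the commit ID
    row = db.execute("SELECT job_id FROM jobs WHERE name = ?", (name,)).fetchone()
    if row:
        return row[0]                  # another bisect already scheduled it
    job_id = submit(name, commit)
    db.execute("INSERT INTO jobs (name, job_id) VALUES (?, ?)", (name, job_id))
    db.commit()
    return job_id
```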
Nearby commits are also interesting, yeah - if you've got something event based you can just feed in any result that gets delivered for a commit covered by the bisect. The only risk there is that it makes the bisect log look less clean; it shouldn't impact the actual result though. You could potentially do something like check if there are more than N jobs scheduled that will report results for the test and suppress generation of new ones until those come in, though some might be for other bisects going down different branches.
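The "more than N jobs scheduled" check could be as simple as the following - the value of N and the pending-job bookkeeping are assumptions:

```python
MAX_PENDING = 3  # the "N" above; tune to taste


def maybe_schedule(test, commit, pending_jobs, schedule):
    """Suppress new job generation while enough jobs that will report a
    result for this test are already in flight.  Some of those may belong
    to other bisects on different branches, so this can occasionally delay
    a bisect, but it never changes the final answer."""
    in_flight = [j for j in pending_jobs if j["test"] == test]
    if len(in_flight) >= MAX_PENDING:
        return False                   # wait for existing results first
    schedule(test, commit)
    return True
```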
@pawiecz it would be good to scope the work and prepare a little roadmap, as it seems we will tackle this in steps.
Indeed, a few steps:

Basic bisection service
- As described by @broonie in the first paragraph and the beginning of the second.

Optimization 1: reuse already available test results
- Why? Reuse results we already have instead of submitting new jobs and waiting for data.
- How?
  - Step A: having the rev-list, neighborhood thresholds could be set to decide whether relevant data is already available or new jobs need to be submitted (see the sketch after this list).
  - Step B: listen for events in Maestro and filter the relevant ones ("any result that gets delivered for a commit covered by the bisect").

Optimization 2: combine test suites/cases into a single TestJob definition
- Why? Test execution takes significantly less time than DUT setup (deployment, provisioning).
- How? The Action block of the job template already supports this; splitting jobs up again as "the bisects diverge (if they do)" will still have to be implemented.

Optimization 3: job results cache
- Why? Have a cache of job results to check first instead of doubling submissions.
- How? Implement a queuing mechanism in the bisection service, keeping in mind that "some [TestJobs] might be for other bisects going down different branches".
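As referenced in Step A above, a minimal sketch of the neighborhood check - the threshold value, the `available` mapping and the helper name are all placeholders:

```python
def nearby_result(rev_list, step_commit, available, threshold=5):
    """rev_list is the output of `git rev-list working..broken` (newest
    first), `available` maps commits to existing builds/results, and
    `threshold` is how far from the suggested step we accept reusing data
    instead of submitting a new job."""
    index = {c: i for i, c in enumerate(rev_list)}
    step = index[step_commit]
    best = None
    for commit, result in available.items():
        if commit not in index:
            continue                   # result outside the bisect range
        distance = abs(index[commit] - step)
        if distance <= threshold and (best is None or distance < best[0]):
            best = (distance, commit, result)
    return best                        # None -> no relevant data, submit a job
```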
Note: I reordered the optimizations a bit and I'm not sure whether (1B) should in fact take priority over (2) - the task order might change during development.
The main thing I was thinking about with optimisation 2 is that it's really common for one bug to cause multiple tests in the same suite to fail - that'd trigger a bisect for each failing test, but if it's one underlying bug they'll all come up with the same answer and can share all their jobs. Even if it ends up being multiple commits there's a fair to middling chance they'll be somewhere nearby in the bisection (e.g. for -next, in the same tree), so the sharing will still help for a lot of the bisection.
Oh, I see where I misunderstood your point on (2) and I get the reasons for (3) more clearly.
With the current level of granularity, the retriggered tests would be:
@broonie Do you think combining even more Actions into a single TestJob might cause interference between test cases and therefore not be worth the setup-time savings? My take on (2) could probably have lower priority (1A > 3 > 1B > 2).
With the initial draft described in https://github.com/kernelci/kernelci-core/issues/2594#issuecomment-2222422518, the custom API endpoints introduced in https://github.com/kernelci/kernelci-pipeline/pull/691, and the checkouts work in https://github.com/kernelci/kernelci-pipeline/pull/590, Maestro should be able to run bisection jobs in its pipelines.
This task focuses on integrating these features, as well as any related external efforts.