Extended Build/Test Status Support for KCIDB

nuclearcat commented 2 weeks ago

Problem Statement

KCIDB needs to track and display detailed status information for builds and tests. Developers require visibility into the build/test progression and, when possible, estimated completion times.

Current Build/Test States

Builds and tests typically progress through three states:

1. Created and queued (awaiting available resources)
2. In progress (duration may be fixed or hardware-dependent)
3. Finished (success or failure)

Technical Constraint

Since KCIDB uses a write-once database, we cannot update existing build/test status records. Instead, we must add new records for state changes.
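To illustrate the constraint, here is a minimal sketch of how a consumer might fold several append-only records for one object into a single view. The merge rule (the first non-missing value per field wins) and the helper name `current_view` are assumptions for illustration only; how KCIDB actually combines records is not specified in this thread.

```python
def current_view(records):
    """Fold append-only records for one object into a single view.

    Assumption (not confirmed by KCIDB): each state change is reported as a
    new record carrying extra fields, and the first non-missing value seen
    for a field is kept.
    """
    view = {}
    for record in records:
        for key, value in record.items():
            if value is not None:
                # Never overwrite: earlier values are treated as immutable.
                view.setdefault(key, value)
    return view


# Hypothetical records for one test, submitted at different times.
records = [
    {"id": "ci:test-1", "queued_at": "2025-01-01T12:00:00+00:00"},
    {"id": "ci:test-1", "started_at": "2025-01-01T12:05:00+00:00"},
]
merged = current_view(records)
```

With this rule, reporting a state change never requires rewriting a stored record, only adding one.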

Proposed Solution

We can add the following timestamp fields:

Option 1: Absolute Timestamps

queued_at: Time when the build/test entered the queue
started_at: Time when the build/test execution began
finished_at: Time when the build/test completed

Option 2: Duration-Based Approach

queued_at: Time when the build/test entered the queue
duration_queued: Time spent in queue (started_at - queued_at)
duration_running: Execution time (finished_at - started_at)
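The two options carry the same information once a build/test has finished, so one can be derived from the other. A rough sketch of the conversion, using the field names proposed above (the function names are hypothetical):

```python
from datetime import datetime, timedelta, timezone


def to_durations(queued_at, started_at, finished_at):
    """Option 1 -> Option 2: derive durations in seconds from timestamps."""
    return {
        "queued_at": queued_at,
        "duration_queued": (started_at - queued_at).total_seconds(),
        "duration_running": (finished_at - started_at).total_seconds(),
    }


def to_timestamps(queued_at, duration_queued, duration_running):
    """Option 2 -> Option 1: rebuild absolute timestamps from durations."""
    started_at = queued_at + timedelta(seconds=duration_queued)
    finished_at = started_at + timedelta(seconds=duration_running)
    return {
        "queued_at": queued_at,
        "started_at": started_at,
        "finished_at": finished_at,
    }


queued = datetime(2025, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
durations = to_durations(
    queued, queued + timedelta(minutes=5), queued + timedelta(minutes=35)
)
timestamps = to_timestamps(**durations)
```

Note that Option 2 cannot express "started but not yet finished" without an extra convention, since duration_queued is only known once execution begins.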

Benefits

By analyzing the duration between states 2 and 3, we can calculate approximate ETAs for future builds/tests
Queue duration metrics (state 1 to 2) can help determine if resource upgrades are needed to improve processing speed
We can show this data (and the current status) to developers
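As a rough illustration of the ETA idea, the sketch below estimates the remaining time from the median of past running durations. Using the median is an assumption made here for simplicity; the proposal does not prescribe an estimator, and `estimate_eta_seconds` is a hypothetical helper.

```python
from statistics import median


def estimate_eta_seconds(past_running_durations, elapsed_seconds=0.0):
    """Estimate remaining seconds for a running build/test.

    past_running_durations: running durations (seconds) of earlier,
    comparable builds/tests. Returns None when there is no history.
    """
    if not past_running_durations:
        return None
    expected = median(past_running_durations)
    # Clamp at zero: a run that already exceeded the estimate is "due now".
    return max(0.0, expected - elapsed_seconds)


eta = estimate_eta_seconds([100, 200, 300], elapsed_seconds=50)
```

The same approach applied to queue durations would give a rough indication of how long new submissions wait for resources.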

spbnick commented 2 weeks ago

We already have the "duration" field specifying the duration of the test (and build) execution in seconds. We also already have the "start_time" field for both builds and tests.

I would keep duration as is for now, because changing that needs a major schema version increase and additional discussion. Additionally, migrating the database and the older schema version data from duration to finished_at or similar might not be possible in all cases, as start_time could be missing, and would require dropping duration. However, we might not have that much data like that, if at all, and we can still change that later, of course.

Do we want to separate time before getting a machine, and time between getting a machine (beginning testing) and starting execution, as e.g. the attached libinput_test_states.tar.gz shows for libinput test execution?

The queued_at would better be named queue_time (typing those was not fun!), at this moment, since we already have start_time. However, I think that's quite confusing and queued_at, started_at, and finished_at would be clearer. We can change those on the next major version bump, and I have a bunch of other breaking changes queued (!) for that, like switching to status for builds, and dropping waived.

Regarding the ETA display, we can add the expected_duration field (in addition to the already-existing "duration", which specifies actual execution time) and report it (hardcoded?) from Maestro until we have a system extrapolating the previous data (which is a whole other can of worms).

So, overall, I think this could be the plan:

[ ] Add a queue_time field to builds and tests, containing a timestamp
[ ] Add an expected_duration field to builds and tests containing a (floating-point) number of seconds

[ ] Specify the following interpretation for the test status:

If the test has `status` field
  The test is finished
Else if the test has `duration` field
  The test has finished (in specified time), but its status is still being decided
Else if the test has `start_time` field
  The test is executing (started at that moment)
Else if the test has `queue_time` field
  The test is scheduled (at that moment), but not executing yet

If the test has `expected_duration` field
  This is the time we expect(ed) the test to take
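The interpretation above could be sketched in Python roughly as follows. The test is assumed to be a plain dict, a field counts as present when it exists and is not None, and the state labels and the function name `describe_test_state` are made up for illustration.

```python
def describe_test_state(test):
    """Return (state, expected_duration) for a test dict.

    Fields are checked in the order given in the proposal, from the
    latest state to the earliest.
    """
    def has(field):
        return test.get(field) is not None

    if has("status"):
        state = "finished"
    elif has("duration"):
        state = "finished, status pending"
    elif has("start_time"):
        state = "executing"
    elif has("queue_time"):
        state = "scheduled"
    else:
        # Not covered by the proposal: no timing or status information.
        state = "unknown"

    # expected_duration is independent of the state; None if absent.
    return state, test.get("expected_duration")


state, expected = describe_test_state(
    {"queue_time": "2025-01-01T12:00:00+00:00", "expected_duration": 120.0}
)
```

The order of the checks matters: a finished test will normally still carry start_time and queue_time, so the latest state has to be tested first.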

[ ] Get the above approved (not rejected) by the participating CI systems
[ ] Merge and deploy the changes
[ ] Once we start work on another major version update for I/O schema, rename (both build and test) fields as follows (up to discussion, of course):
- queue_time -> queued_at
- start_time -> started_at
- start_time + duration -> finished_at
- start_time + expected_duration -> expected_finish_at
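A minimal sketch of what this renaming could look like for a single build or test object, assuming timestamps are ISO 8601 strings with an explicit UTC offset and durations are in seconds. The function name `migrate_fields` is hypothetical, and keeping the durations when start_time is missing is one possible choice, not a decided policy.

```python
from datetime import datetime, timedelta

OLD_FIELDS = ("queue_time", "start_time", "duration", "expected_duration")


def migrate_fields(obj):
    """Map the current field names to the proposed ones for one object."""
    out = {k: v for k, v in obj.items() if k not in OLD_FIELDS}

    if "queue_time" in obj:
        out["queued_at"] = obj["queue_time"]

    start = obj.get("start_time")
    if start is None:
        # Without start_time the finish timestamps cannot be derived,
        # so this sketch keeps the durations unchanged.
        for key in ("duration", "expected_duration"):
            if key in obj:
                out[key] = obj[key]
        return out

    out["started_at"] = start
    started = datetime.fromisoformat(start)
    if "duration" in obj:
        finished = started + timedelta(seconds=obj["duration"])
        out["finished_at"] = finished.isoformat()
    if "expected_duration" in obj:
        expected = started + timedelta(seconds=obj["expected_duration"])
        out["expected_finish_at"] = expected.isoformat()
    return out


migrated = migrate_fields(
    {"id": "ci:test-1", "start_time": "2025-01-01T12:00:00+00:00", "duration": 90}
)
```

The missing start_time branch is where the concern about lossy migration raised earlier in this comment shows up in practice.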

spbnick commented 1 week ago

Let's agree on a plan here, and I'll send a proposal to the CI systems.
