Suggestion: Improve schedule.execute Time-Budget Handling

Mascinissa commented 1 week ago

Hi,

I'm suggesting a tweak to Schedule.execute(nb_exec_times, max_mins_per_schedule, execution_timeout) to better fit different timing needs. Currently, the setup is a bit confusing, and there’s a mix of options that don’t quite cover some common use cases. Here’s a quick breakdown of the different use cases that I can think of:

In general, users wants to get as many measurements as possible within a given time budget. But in case of

Baseline Timing: Users often want at least one full run for baseline timing (e.g. initial execution time), even if it exceeds the budget they've set, to be able to compare with other schedules.
Quick Candidate Testing: For candidate schedules, users want early exit if it’s clear a schedule is too slow (e.g., if it’s 2x slower than the baseline), without waiting for the full run to finish. But we still want to gather as much data as possible within the budget.
Exact Number of Runs: For data gathering, users need a specific number of measurements, regardless of the time it takes..

It seem that use case 1 can be satisfied using the max_mins_per_schedule option (after quick fix[^1]). Use case 3 can be satisfied by not setting neither max_mins_per_schedule nor execution_timeout. However I don't see how use case 2 can be satisfied.

Other Issues with Current Approach :

Partial Results on Timeout: If execution_timeout is hit after some runs, it raises an exception and drops any completed measurements, which consumes compute time for no information in exchange.
Confusing Time Options: The coexistence of max_mins_per_schedule and execution_timeout can be confusing, and it’s not clear which one takes priority. It seems that execution_timeout is appliedfor the first run of the schedule separately and then reapplied on the remaining runs as a sum.
Mixed Time Units: The time units are inconsistent (minutes for max_mins_per_schedule, seconds for execution_timeout, milliseconds for execution times) and could all just be in milliseconds.

Proposed solution

A simpler solution might be to replace the current setup with Schedule.execute(max_runs, min_runs, time_budget)

max_runs limits total runs.
min_runs (default 0) guarantees at least this many runs.
time_budget sets a time limit (optional). After min_runs, it will keep going until max_runs or until time_budget is hit.

This unified time-budget mechanism should meet all three use cases while being user-friendly. I think we could use coreutils's timeout command. It should allow to save partial measurements if a timeout is triggered. I'll be working on that.

[^1]: There seems to be a mistake in this line. I think the author meant max_runs = max(0, max_runs - 1) instead of max_runs = min(0, max_runs - 1) otherwise it will always return 0 since max_runs is always>=0

Thanks!

skourta commented 1 week ago

Your proposed solution sounds good. Can you please open a PR when you are done implementing it. I will fix the bug in the max_runs meanwhilele.

Mascinissa commented 5 days ago

Hi,

I've just submitted a pull request that addresses this issue. You can view it here.

Please let me know if there are any questions or adjustments needed.

Thanks!

Tiramisu-Compiler / TiraLib

Suggestion: Improve schedule.execute Time-Budget Handling #36

Other Issues with Current Approach :

Proposed solution