In the current implementation, each scheduler declares whether it requires profiling by returning true or false from NeedProfile().
During Engine.Init(), NeedProfile() is invoked for every scheduler, and if no scheduler requires profiling, latency_estimator_ is not initialized.
However, when a job has a target SLO, a planner retrieves an expected latency from a job and checks if the remaining time is sufficient to consume the job.
Therefore, if an engine is configured only with schedulers that do not require profiling and an SLO-specified job is requested, Engine.GetExpected will return a default value (0) and the SLO check will be incorrect.
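To illustrate the failure mode, here is a minimal sketch of an SLO check. The `Job` struct, its fields, and `CanMeetSlo` are hypothetical stand-ins, not the project's actual types or planner logic. The point is only that an expected latency of 0 makes the check pass whenever any time remains.

```cpp
#include <cstdint>

// Hypothetical, simplified job description (not the project's real type).
struct Job {
  int64_t enqueue_time_us;      // time at which the job was submitted
  int64_t slo_us;               // target SLO relative to enqueue time; 0 means no SLO
  int64_t expected_latency_us;  // value obtained from the latency estimator
};

// Returns true if the job is predicted to finish within its SLO.
// Assumed form of the check: expected latency must fit in the remaining time.
bool CanMeetSlo(const Job& job, int64_t now_us) {
  if (job.slo_us <= 0) {
    return true;  // no SLO specified, nothing to check
  }
  const int64_t remaining_us = job.enqueue_time_us + job.slo_us - now_us;
  return job.expected_latency_us <= remaining_us;
}
```

With 5 ms remaining, a job whose expected latency was never estimated (0) is accepted, while the same job with a realistic 50 ms estimate is rejected. The check is therefore only meaningful when the estimator has actually been initialized.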
I think the engine should always track the latency of every execution, since SLO-specified jobs can be requested at any time, unless checking the SLO incurs significant overhead.
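As a rough sketch of what "always track" could look like, the class below keeps a per-model latency estimate that is updated after every execution, regardless of scheduler settings. Everything here is an assumption made for illustration: the `LatencyTracker` name, the integer model ID, the use of an exponential moving average, and the -1 sentinel are not taken from the existing code.

```cpp
#include <cstdint>
#include <unordered_map>

// Hypothetical tracker that is always constructed and always updated.
class LatencyTracker {
 public:
  // smoothing in (0, 1]: weight given to the newest measurement.
  explicit LatencyTracker(double smoothing = 0.1) : smoothing_(smoothing) {}

  // Records one measured execution latency for a model.
  void Update(int model_id, int64_t measured_us) {
    auto it = average_us_.find(model_id);
    if (it == average_us_.end()) {
      // First measurement seeds the average.
      average_us_[model_id] = static_cast<double>(measured_us);
    } else {
      // Exponential moving average of the measured latency.
      it->second = smoothing_ * static_cast<double>(measured_us) +
                   (1.0 - smoothing_) * it->second;
    }
  }

  // Returns the smoothed latency, or -1 if the model was never measured,
  // so callers can tell "unknown" apart from "zero latency".
  int64_t GetExpected(int model_id) const {
    auto it = average_us_.find(model_id);
    return it == average_us_.end() ? -1 : static_cast<int64_t>(it->second);
  }

 private:
  double smoothing_;
  std::unordered_map<int, double> average_us_;
};
```

Returning a sentinel instead of 0 for unmeasured models is a design choice in this sketch: it lets the planner decide explicitly how to treat a job with no estimate rather than silently passing the SLO check.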
I agree with your solution. NeedProfile is deprecated, since it originated from the old design. We used to specify SLOs per model in the offline phase, but that assumption is no longer valid.