Update audition to handle the possibility that multiple models may exist for a given model_group_id/train_end_time combination, which may arise from variation over random seeds (closing #800). To do so, this PR introduces an agg_type parameter to the Auditioner (and the underlying DistanceFromBestTable) that aggregates metric values across those models by taking the mean, best, or worst case, with the worst case as the default.
A couple of notes:
As a result of this change, the "best distance" table (and related python classes/dataframes) will no longer keep track of the model_id. This shouldn't be a problem, since the (model_group_id, train_end_time) pair is generally the key of interest here regardless.
I haven't added unit tests explicitly for the metric aggregation functionality, but I updated the existing tests (which cover the typical case of a single model per group/date pair), and they are passing.
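
For reviewers who want a concrete picture of the three agg_type options, here is a minimal pandas sketch. It is not the code in this PR: the function name aggregate_metric, the value column, and the greater_is_better flag are hypothetical and used only for illustration. It also shows why model_id drops out once rows are collapsed to one per (model_group_id, train_end_time) pair.

```python
import pandas as pd


def aggregate_metric(evaluations, agg_type="worst", greater_is_better=True):
    """Collapse per-model metric values to one row per
    (model_group_id, train_end_time) pair.

    Hypothetical helper for illustration only; column names are assumed.
    """
    if agg_type not in ("mean", "best", "worst"):
        raise ValueError(f"unknown agg_type: {agg_type!r}")

    if agg_type == "mean":
        func = "mean"
    elif (agg_type == "best") == greater_is_better:
        # best of a higher-is-better metric, or worst of a lower-is-better one
        func = "max"
    else:
        func = "min"

    # model_id is not part of the grouping key, so it is not carried through
    return (
        evaluations.groupby(["model_group_id", "train_end_time"])["value"]
        .agg(func)
        .reset_index()
    )


# Three models for the same group/date pair, e.g. from different random seeds
evaluations = pd.DataFrame(
    {
        "model_id": [10, 11, 12],
        "model_group_id": [1, 1, 1],
        "train_end_time": ["2016-01-01"] * 3,
        "value": [0.25, 0.50, 0.75],
    }
)

worst = aggregate_metric(evaluations, "worst")
mean = aggregate_metric(evaluations, "mean")
best = aggregate_metric(evaluations, "best")
```

With a single model per group/date pair, all three options return the same value, which is why the existing tests are unaffected by the choice of agg_type.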