Handle multiple models for given model group/train end time pair (closes #800)

Update audition to handle the possibility that multiple models exist for a given model_group_id/train_end_time combination, which can arise from variation over random seeds (closing #800). To do so, this PR adds an agg_type parameter to the Auditioner (and the underlying DistanceFromBestTable) that aggregates metric values across those models by taking the mean, best, or worst case; the default is the worst case.
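To illustrate the aggregation idea, here is a minimal pure-Python sketch. It is not the code in this PR: the function name `aggregate_metric`, the `greater_is_better` flag, and the row layout are hypothetical and chosen only for the example. It collapses per-model metric values to a single value per (model_group_id, train_end_time) pair.

```python
from collections import defaultdict
from statistics import mean


def aggregate_metric(rows, agg_type="worst", greater_is_better=True):
    """Collapse per-model metric values to one value per
    (model_group_id, train_end_time) pair.

    Hypothetical helper for illustration; not part of triage's API.
    """
    if agg_type not in ("mean", "best", "worst"):
        raise ValueError(f"unknown agg_type: {agg_type!r}")

    # Collect the metric values of every model sharing the same key.
    grouped = defaultdict(list)
    for row in rows:
        key = (row["model_group_id"], row["train_end_time"])
        grouped[key].append(row["metric_value"])

    # "best" means max when higher values are better, otherwise min;
    # "worst" is the opposite.
    if agg_type == "mean":
        func = mean
    elif (agg_type == "best") == greater_is_better:
        func = max
    else:
        func = min

    # Note: model_id is dropped, since one row may now summarize several models.
    return {key: func(values) for key, values in grouped.items()}


# Two models (e.g. different random seeds) in group 1, a single model in group 2.
rows = [
    {"model_group_id": 1, "train_end_time": "2015-01-01", "model_id": 10, "metric_value": 0.5},
    {"model_group_id": 1, "train_end_time": "2015-01-01", "model_id": 11, "metric_value": 0.75},
    {"model_group_id": 2, "train_end_time": "2015-01-01", "model_id": 12, "metric_value": 0.7},
]

worst = aggregate_metric(rows)                 # default: worst case
best = aggregate_metric(rows, agg_type="best")
avg = aggregate_metric(rows, agg_type="mean")
```

When a pair has only one model, all three aggregation types return that model's value unchanged, so the typical single-model case behaves as before.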

A couple of notes:

As a result of this change, the "best distance" table (and the related Python classes/dataframes) no longer keeps track of the model_id. This shouldn't be a problem, since the (model_group_id, train_end_time) pair is generally the key of interest here anyway.
I haven't added unit tests specifically for the metric aggregation functionality, but I updated the existing tests (based on the typical case of a single model per group/date pair), and they are passing.


Handle multiple models for given model group/train end time pair (closes #800) #823