`calculate_composite_score()` is a utility function that aggregates multiple evaluation metrics into a single score by weighting each metric according to its importance. This weighted approach facilitates a balanced and comprehensive evaluation of model performance across various criteria, enhancing decision-making in model selection within the `predict_ml()` pipeline.
The function validates that the `scores` and `metric_weights` inputs are dictionaries, checks for their non-emptiness, and confirms that all metrics have corresponding weights:

```python
def calculate_composite_score(scores: dict, metric_weights: dict) -> float:
    # Validate input types and contents.
    if not isinstance(scores, dict) or not isinstance(metric_weights, dict):
        raise TypeError("Both 'scores' and 'metric_weights' must be dictionaries.")
    if not scores or not metric_weights:
        raise ValueError("'scores' and 'metric_weights' cannot be empty.")
    missing_metrics = set(scores.keys()) - set(metric_weights.keys())
    if missing_metrics:
        raise ValueError(f"Missing weights for metrics: {', '.join(missing_metrics)}")

    # Weighted average: multiply each score by its weight, then normalize
    # by the total weight.
    try:
        composite_score = sum(
            score * metric_weights.get(metric, 0)
            for metric, score in scores.items()
        ) / sum(metric_weights.values())
    except Exception as e:
        raise ValueError(f"Error in calculating composite score: {e}")
    return composite_score
```
For example:

```python
scores = {'Accuracy': 0.95, 'Precision': 0.90}
metric_weights = {'Accuracy': 5, 'Precision': 1}
composite_score = calculate_composite_score(scores, metric_weights)
print(f"Composite Score: {composite_score:.2f}")
```
Composite Score Calculation
The composite score approach is used in `model_recommendation_core()`, which is part of the `predict_ml()` pipeline for recommending the best `n` model(s). It aims to synthesize multiple scoring metrics into a single metric that can be used to compare and rank models. This is particularly useful when you have multiple criteria that you consider important for your model's performance, and these criteria might have different scales or directions (i.e., for some metrics, higher is better, while for others, lower is better).
Here’s a breakdown of the calculation (an illustrative sketch follows the list):
1. **Weight Assignment**: Each metric is assigned a weight based on its importance. A higher weight (e.g., 5) is given to prioritized metrics, while a standard weight (e.g., 1) is assigned to others. This allows for emphasis on metrics deemed more critical to the specific problem or domain.
2. **Score Adjustment and Weighted Sum**: Each metric's score is multiplied by its corresponding weight. If a metric benefits from being low (like RMSE), its score can be inverted (e.g., `1/score` or a similar transformation) before applying the weight, to align it with the "higher is better" principle. These weighted scores are then summed to produce a composite score for each model.
3. **Normalization**: The sum of the weighted scores is divided by the sum of the weights. This normalization step ensures that the composite score is not unfairly influenced by the number of metrics or their assigned weights.
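To make these steps concrete, here is a minimal sketch; the metric names, values, and the simple `1/score` inversion are illustrative assumptions, not the pipeline's exact behavior:

```python
scores = {'R2': 0.88, 'RMSE': 2.5}      # raw metric scores (illustrative)
metric_weights = {'R2': 5, 'RMSE': 1}   # step 1: weight assignment
lower_is_better = {'RMSE'}              # metrics needing direction adjustment

# Step 2: adjust direction so every metric is "higher is better",
# then apply the weights and sum.
adjusted = {m: (1 / s if m in lower_is_better else s) for m, s in scores.items()}
weighted_sum = sum(adjusted[m] * metric_weights[m] for m in adjusted)

# Step 3: normalize by the total weight.
composite = weighted_sum / sum(metric_weights.values())
print(f"Composite Score: {composite:.2f}")  # 0.80
```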
Composite Score Calculation Formula
Given a set of metrics $M$, where each metric $m \in M$ has a score $s_m$ and a weight $w_m$, the composite score $C$ for a model can be calculated as:
$$ C = \frac{\sum_{m \in M} w_m \cdot \text{adj}(s_m)}{\sum_{m \in M} w_m} $$
Where $\text{adj}(s_m)$ is the direction-adjusted score: the identity for metrics where higher is better, and an inversion or negation for metrics where lower is better.
This formula allows for a weighted synthesis of multiple performance metrics into a single, normalized score that facilitates direct comparison of models based on a balanced assessment of their performance across the prioritized criteria.
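Plugging the earlier example into the formula (with $\text{adj}$ the identity, since both metrics are higher-is-better):

$$ C = \frac{5 \cdot 0.95 + 1 \cdot 0.90}{5 + 1} = \frac{5.65}{6} \approx 0.94 $$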
Concern: Handling Metrics Where Lower is Better
The concern about metrics where a lower value indicates better performance (like RMSE) is one I have kept in mind. The composite score calculation can accommodate such metrics through inversion or negation, ensuring that all metrics effectively operate in a "higher is better" framework, so that the composite score remains meaningful and consistent.
- **Inversion**: For a metric where lower is better, one approach is to invert the score (e.g., `1 / score`). This transformation means that a lower original score (which is better) results in a higher inverted score, aligning it with the composite score logic.
- **Negation**: Another approach is to use negation, especially if the scoring function directly supports it (e.g., `neg_mean_squared_error`). The negative value ensures that optimization routines aiming to maximize the score are consistent across all metrics.

When integrating such scores into the composite score calculation, the key is to ensure all metrics are on a consistent scale and direction, so that the composite score effectively reflects the model's overall performance according to the prioritized criteria.
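As a concrete instance of negation, scikit-learn exposes negated error scorers; a short sketch (the dataset and model here are arbitrary, for illustration only):

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

# Arbitrary illustrative data and model.
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=42)
model = LinearRegression()

# 'neg_mean_squared_error' returns the negated MSE, so "higher is better"
# holds: a smaller error yields a larger (less negative) score.
neg_mse = cross_val_score(model, X, y, scoring='neg_mean_squared_error', cv=5)
print(f"Mean negated MSE: {neg_mse.mean():.2f}")
```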
This approach allows for a nuanced comparison of models, balancing the trade-offs between different performance metrics in a way that aligns with the specific objectives and preferences for the modelling task at hand.
Review of Metrics' Adherence to 'Higher is Better' Framework
Classification Metrics
Regression Metrics
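While the detailed per-metric review isn't reproduced here, a rough summary of common metric directions (standard conventions; the exact metric set used by `model_recommendation_core()` is an assumption) might look like:

```python
# Direction of common metrics, per standard conventions. This is an
# illustrative summary, not the pipeline's authoritative metric list.
metric_direction = {
    # Classification metrics: all already 'higher is better'.
    'Accuracy': 'higher', 'Precision': 'higher', 'Recall': 'higher',
    'F1': 'higher', 'ROC AUC': 'higher',
    # Regression metrics: error measures need inversion/negation; R2 does not.
    'MSE': 'lower', 'RMSE': 'lower', 'MAE': 'lower', 'R2': 'higher',
}
```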