Choosing the right metrics

adamboazbecker commented 1 month ago

What are the right metrics to choose as a function of evaluation method and product requirements?

mkaramlou commented 1 month ago

I think this depends on the type of product or problem space. Model performance metrics are fairly well defined, mainly for traditional ML problems (things like accuracy, precision, recall, etc.). The question is how these model evaluation metrics drive product outcomes, and how important they are.
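
To make the traditional metrics concrete, here is a minimal sketch of accuracy, precision, and recall for a binary classifier. The function name `classification_metrics` and the label convention (1 = positive) are assumptions for illustration only, not something taken from a specific library.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, and recall for binary labels.

    Hypothetical helper for illustration; `positive` marks the positive class.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    total = len(pairs)
    return {
        # Share of all predictions that were correct.
        "accuracy": (tp + tn) / total if total else 0.0,
        # Of everything predicted positive, how much really was positive.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # Of everything really positive, how much was found.
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


if __name__ == "__main__":
    print(classification_metrics([1, 0, 1, 1, 0, 0], [1, 0, 0, 1, 1, 0]))
```

Which of the three matters most is a product decision: a spam filter might favor precision, while a fraud detector might favor recall.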

Although there have been some developments in systematic evaluation methodologies for GenAI models (ROUGE, BERTScore, etc.), the most common evaluation process for companies still requires human input.
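
As a rough illustration of what an overlap metric such as ROUGE measures, here is a simplified ROUGE-1 sketch based on clipped unigram overlap. It assumes lowercased whitespace tokenization and skips the stemming and other preprocessing that full ROUGE implementations offer; the function name `rouge1` is hypothetical.

```python
from collections import Counter


def rouge1(reference, candidate):
    """Simplified ROUGE-1: unigram overlap between reference and candidate.

    Assumes whitespace tokenization; not a replacement for a full ROUGE package.
    """
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())

    # Counter intersection keeps the minimum count per token (clipped overlap).
    overlap = sum((ref_counts & cand_counts).values())

    recall = overlap / sum(ref_counts.values()) if ref_counts else 0.0
    precision = overlap / sum(cand_counts.values()) if cand_counts else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    print(rouge1("the cat sat on the mat", "the cat lay on the mat"))
```

A score like this only captures surface word overlap, which is one reason human review is still common for generated text.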

pushkargarg commented 4 hours ago

I think this depends on what the model is being used for. If we are using it for text generation, then evaluation can consist of metrics like the percentage of predictions with grammatical errors, the percentage with hallucinations, etc. If a certain tone is expected in the output, then measuring that and trying to optimize for it is a good start. I also tend to look at evaluation/monitoring after a model has gone live. Business metrics are the most important metrics to track, i.e. whether the model is driving user value and serving its intended purpose.
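
The percentage-style metrics above could be computed from reviewed outputs roughly like this. The sketch assumes each output has already been labeled (by a human reviewer or an automated checker) with boolean flags; the flag names `grammar_error` and `hallucination` and the function `flag_rates` are made up for illustration.

```python
def flag_rates(reviews, flags=("grammar_error", "hallucination")):
    """Return the percentage of reviewed outputs carrying each flag.

    `reviews` is a list of dicts with boolean flags (hypothetical schema).
    A missing flag is treated as False.
    """
    total = len(reviews)
    rates = {}
    for flag in flags:
        flagged = sum(1 for review in reviews if review.get(flag))
        rates[flag] = 100.0 * flagged / total if total else 0.0
    return rates


if __name__ == "__main__":
    sample = [
        {"grammar_error": False, "hallucination": True},
        {"grammar_error": True, "hallucination": False},
        {"grammar_error": False, "hallucination": True},
        {"grammar_error": False, "hallucination": False},
    ]
    print(flag_rates(sample))
```

The same shape works for a tone check: add another boolean flag per output and track its rate over time once the model is live.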

mlopscommunity / open-questions-ai-quality
