PR #728 adds retries that simply repeat the same LLM request as before. If a seed is set (say, for OpenAI's API), this retry logic should return the same mis-formatted response. However, if a seed is not set, the LLM API's responses can vary from call to call even with identical input, so simply making the exact same call again might yield a valid response.
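A minimal sketch of retry logic that accounts for this, assuming hypothetical `query_llm` and `parse_metric_response` callables and an `InvalidResponseError` exception (none of these are Valor's actual API):

```python
class InvalidResponseError(Exception):
    """Hypothetical error for a mis-formatted or wrongly typed LLM response."""

MAX_RETRIES = 3

def query_with_retries(query_llm, parse_metric_response, prompt, seed=None):
    # With a seed, the API call is (ideally) deterministic, so repeating the
    # identical request would just reproduce the same bad output; only retry
    # when no seed is set.
    attempts = 1 if seed is not None else MAX_RETRIES
    last_error = None
    for _ in range(attempts):
        response = query_llm(prompt, seed=seed)
        try:
            return parse_metric_response(response)
        except InvalidResponseError as err:
            last_error = err
    raise last_error
```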
With the initial text generation metric PR, if an LLM provides an invalid response for one of our LLM-guided metrics (mis-formatted, wrong data type, etc.), Valor raises an error and the rest of the evaluation is not completed. This seems like a poor user experience, although we should collect some user feedback on it.
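One way to avoid aborting the whole evaluation, sketched here with the same hypothetical `InvalidResponseError` as above, would be to catch the error per metric and record it so the remaining metrics still get computed:

```python
def evaluate_all(metrics, datum):
    # Hypothetical per-metric error handling: record the failure for the
    # offending metric and keep evaluating the rest.
    results = {}
    for metric in metrics:
        try:
            results[metric.name] = metric.compute(datum)
        except InvalidResponseError as err:
            results[metric.name] = {"error": str(err)}
    return results
```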
Two improvements could be made: