Open anthony2261 opened 3 weeks ago
We need a way to evaluate AI-generated results so that we can compare quality when the LLM flow, prompts, etc. change.
Desperately needed.