prescod opened 5 months ago
I think it would make sense to be able to return/print the prompt for the respective metric.
Perhaps even having a getter/setter would work (the setter for when we want to make minor changes to the built-in prompt without having to write a whole new metric).
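A minimal sketch of what that could look like. To be clear, the class and attribute names below (`GEval`, `prompt_template`, `DEFAULT_PROMPT`) are hypothetical, not part of any existing API:

```python
class GEval:
    """Hypothetical metric exposing its prompt via a getter/setter."""

    DEFAULT_PROMPT = (
        "You are an evaluator. Score the output from 1 to 10 "
        "using the following steps:\n{steps}"
    )

    def __init__(self):
        self._prompt_template = self.DEFAULT_PROMPT

    @property
    def prompt_template(self) -> str:
        # Getter: inspect the exact prompt the metric will send to the LLM.
        return self._prompt_template

    @prompt_template.setter
    def prompt_template(self, value: str) -> None:
        # Setter: tweak the built-in prompt without writing a new metric.
        self._prompt_template = value


metric = GEval()
print(metric.prompt_template)  # read the default prompt
metric.prompt_template = metric.prompt_template + "\nBe strict."  # minor tweak
```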
@lbux: I like your idea, but just to be clear, what I'm asking for is to see the literal input and output of the LLM at runtime.
I also believe that's what the unit of caching should be.
I do also like the idea of being able to read and write the prompt, however.
Or subclass.
As discussed on Discord, we need to know what prompts you are serving the evaluation LLM.
https://hamel.dev/blog/posts/prompt/
I need to see the prompt to help debug when the framework fails or even for debugging my own bugs.
Once I passed `evaluation_steps = steps.split()` when I meant `evaluation_steps = steps.split("\n")`.
The former turns every word into a step, and the latter turns every line into a step. My error was obvious as soon as I looked at the LLM prompt.
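The difference is easy to reproduce (the `steps` string below is just an illustration):

```python
steps = "Check factual accuracy.\nCheck tone.\nCheck length."

# split() with no argument splits on ANY whitespace: one "step" per word
word_steps = steps.split()
# ['Check', 'factual', 'accuracy.', 'Check', 'tone.', 'Check', 'length.']

# split("\n") splits on newlines: one step per line, as intended
line_steps = steps.split("\n")
# ['Check factual accuracy.', 'Check tone.', 'Check length.']
```

With the prompt visible, the seven single-word "steps" would have jumped out immediately.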
The "true meaning" of various built-in metrics is also easier to understand when you read the prompt.