Overview
An automated test that checks whether the LLM output is in the expected format would be useful.
The test would check that the output is a list, that each row is a dictionary, and that each dictionary has the right keys.
This may be partly redundant, since other tests will fail anyway if the format is wrong, but it would let us see quickly whether a format problem is the reason they are failing.
Requirements
A new script in the evaluation folder which defines a new assertion for promptfoo to run.
The test should fail if the output does not have the desired format and fields.
If possible, it should give the specific reason for failure as well.
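As a starting point, the requirements above could be sketched as a promptfoo Python assertion. This is only a minimal sketch under a few assumptions: the LLM output is a JSON string, the key names in `REQUIRED_KEYS` are placeholders (the real field names are not listed here), and the helper names `check_format` and `_fail` are made up for illustration. The `get_assert` entry point and the `pass` / `score` / `reason` result shape follow promptfoo's Python assertion convention.

```python
import json
from typing import Any, Dict

# Placeholder field names (assumption); replace with the keys the real output needs.
REQUIRED_KEYS = {"name", "value"}


def _fail(reason: str) -> Dict[str, Any]:
    """Build a failing grading result that carries the specific reason."""
    return {"pass": False, "score": 0.0, "reason": reason}


def check_format(output: str) -> Dict[str, Any]:
    """Check that output is a JSON list of dictionaries with the required keys."""
    try:
        data = json.loads(output)
    except (TypeError, ValueError) as exc:
        return _fail(f"Output is not valid JSON: {exc}")

    if not isinstance(data, list):
        return _fail(f"Expected a list, got {type(data).__name__}")

    for index, row in enumerate(data):
        if not isinstance(row, dict):
            return _fail(f"Row {index} is {type(row).__name__}, expected a dictionary")
        missing = REQUIRED_KEYS - set(row)
        if missing:
            return _fail(f"Row {index} is missing keys: {sorted(missing)}")

    return {"pass": True, "score": 1.0, "reason": "Output has the expected format"}


def get_assert(output: str, context: Dict[str, Any]) -> Dict[str, Any]:
    """Entry point that promptfoo calls for a Python assertion."""
    return check_format(output)
```

To run it, the promptfoo config would list an assertion with `type: python` and a `value` that points at the script through a `file://` path. Stopping at the first problem keeps the reason short; collecting every problem before returning would be an equally reasonable choice.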