Open athewsey opened 5 months ago
I was hoping to use the default composer with Claude v3 on Bedrock by using a setup along the lines of:
runner = BedrockModelRunner(content_template='{"messages": $prompt, "anthropic_version": "bedrock-2023-05-31", "max_tokens": 500}')
...with a dataset where $prompt
has been pre-fulfilled as JSON message sequences like:
{
"prompt": "[{\"role\": \"user\", \"content\": [{\"type\": \"text\", \"text\": \".............\"}]}]"
"answers": "2024-04-09"
}
...But unfortunately, the JsonContentComposer
results in the messages structure remaining stringified
As noted above, it's confusing that fmeval makes it seem like content_template
might support structured data, but in practice JsonContentComposer
forces everything to be enclosed in a string.
My current workaround for this (now open-sourced) is to override BedrockModelRunner._composer
with a StructuredContentComposer that supports providing arbitrary JSON.
Our app currently completely bypasses fmeval's prompt template fulfilment, because it isn't flexible enough to support multiple context data fields. Hopefully this changes in future!
I see from this comment this feature may be coming already, but the attached issue was closed with the interim workaround.
For our use-case we have a multi-field dataset for example:
...Where the final LLM prompt would combine both the source document and the question, in a (constant) template. I could easily see more general use-cases with other fields too.
Today, we're hacking around fmeval by doing prompt fulfilment as a separate step before the library. It would be much better if fmeval is able to directly process raw multi-field datasets like this, by taking a prompt template that can reference arbitrary fields from the source record.
My reason for revisiting this was Claude v3's messages API, which means we're going to have to do more sophisticated fulfilment on our side to achieve the same effect.