Closed · jitokim closed 1 week ago
I found that the response type is an array here:
// wrap generation inside a Vec to match api-inference
Ok((headers, Json(vec![generation])).into_response())
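Since the server wraps the generation in a single-element array, a client has to deserialize the response as a list and unwrap its first element rather than parsing a bare object. A minimal, language-agnostic sketch in Python (the payload shape here is a simplified assumption; the real response has more fields):

```python
import json

# Hypothetical response body from the api-inference-compatible route:
# the generation object is wrapped in a single-element JSON array.
body = '[{"generated_text": "Hello, world!"}]'

parsed = json.loads(body)
assert isinstance(parsed, list)   # an array, not a bare object
generation = parsed[0]            # unwrap the single generation
print(generation["generated_text"])
```

Parsing the body as an object instead of an array is exactly the kind of mismatch that caused the deserialization failure this PR fixes.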
Hi @jitokim, thanks for the fix! LGTM, merged as 3c14fa63 after updating the ClientIT's prompt message to insist on strict JSON output, so that the test assertion is more likely to pass.
@ilayaperumalg Hi. I had been wondering how to compare Markdown-formatted output with the expected value, and you solved it in a very smart way. I've learned a great approach from you. Thank you!
@jitokim Thank you for the kind words! It was the suggestion from @markpollack!
fix issue #1727 when using text-generation-inference models
I think the openapi.json needs to be fixed as well; you can see that the response type is an array here:
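For the spec to match the server behavior, the response schema would need the array wrapper. A hedged sketch of what the corrected fragment might look like (the schema name and surrounding structure are assumptions, not copied from the actual openapi.json):

```json
"responses": {
  "200": {
    "content": {
      "application/json": {
        "schema": {
          "type": "array",
          "items": { "$ref": "#/components/schemas/GenerateResponse" }
        }
      }
    }
  }
}
```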
Test models:
microsoft/Phi-3-mini-4k-instruct
mistralai/Mistral-7B-Instruct-v0.3
Qwen/Qwen2.5-Coder-32B-Instruct