bp3000bp opened 3 weeks ago
I've seen high variability in answer length based on different prompt tweaks. I'd suggest starting by tweaking the prompt, and ideally running evaluations across a number of questions to measure answer length, using the evaluator tools: https://github.com/Azure-Samples/ai-rag-chat-evaluator
From my experiments, our baseline prompt usually results in relatively short answers, but you could try giving more explicit direction about how long the response should be.
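For example, you could append a length directive to the system prompt. A minimal sketch (the prompt text and variable names here are illustrative, not the repo's actual baseline prompt):

```python
# Illustrative only: append a length directive to whatever system prompt you use.
BASELINE_SYSTEM_PROMPT = (
    "You are an assistant that answers questions using only the provided sources."
)

# Hypothetical directive; tune the sentence/word budget to your needs.
LENGTH_DIRECTIVE = "Answer in at most three sentences. Do not repeat the question."

system_prompt = f"{BASELINE_SYSTEM_PROMPT}\n{LENGTH_DIRECTIVE}"
```

Then re-run the evaluator linked above on your question set with the tweaked prompt, comparing answer length alongside the quality metrics before adopting it.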
Another parameter you can experiment with is the semantic ranker score threshold. Perhaps the search is returning documents that aren't relevant at all and have a semantic reranker score below 2 or even 1.5; you could set a threshold to filter out those results. Once again, you'd want to evaluate across multiple questions to ensure no degradation in answer quality on other questions.
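A rough sketch of that filtering, assuming you're querying with the azure-search-documents Python SDK with semantic ranking enabled (the endpoint, index, configuration name, and threshold value are placeholders you'd fill in and tune):

```python
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient

search_client = SearchClient(
    endpoint="https://<your-service>.search.windows.net",  # placeholder
    index_name="<your-index>",
    credential=AzureKeyCredential("<your-key>"),
)

MIN_RERANKER_SCORE = 1.5  # semantic reranker scores range from 0 to 4; tune empirically

results = search_client.search(
    search_text="example user question",
    query_type="semantic",
    semantic_configuration_name="default",  # assumption: your config name may differ
)

# Keep only documents the semantic ranker scored at or above the threshold.
filtered = [
    doc for doc in results
    if (doc.get("@search.reranker_score") or 0) >= MIN_RERANKER_SCORE
]
```

After applying a threshold, re-run your evaluation set to confirm the filter isn't dropping relevant documents for other questions.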