TabbyML / tabby

Self-hosted AI coding assistant
https://tabby.tabbyml.com/

Answer Engine Quality - Ideas #2657

Open wsxiaoys opened 2 months ago

wsxiaoys commented 2 months ago

Under Construction

The Answer Engine, released in version 0.13, provides a Q&A interface for Tabby users to interact with the LLM, optionally within the context of a connected repository. The current implementation is quite naive and not particularly performant, so we'd like to record ideas here and improve quality over the current baseline.

Prompt construction

Currently, the implementation is relatively simple. We collect snippets (rather than full articles) from various sources and feed them into a single LLM inference call, which generates both the answer and the grounding (citation) information simultaneously.

  1. Collect references: query the index, which covers both code and documents, to gather chunks. Deduplication is performed at the document level, so only the highest-scoring chunk from each document is kept.

  2. Answer generation: construct a single LLM inference prompt by combining the question with the content of each chunk (a minimal sketch of steps 1 and 2 follows this list).

  3. Relevant questions: generate follow-up questions for the request; this prompt is likewise built from the question and the retrieved chunks.
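
For concreteness, here is a minimal sketch of what steps 1 and 2 might look like. The `Chunk` type, the prompt template, and the function names are illustrative assumptions, not Tabby's actual code:

```rust
use std::collections::HashMap;

struct Chunk {
    doc_id: String,
    text: String,
    score: f32,
}

/// Step 1: document-level deduplication -- keep only the
/// highest-scoring chunk from each document.
fn dedup_by_document(hits: Vec<Chunk>) -> Vec<Chunk> {
    let mut best: HashMap<String, Chunk> = HashMap::new();
    for hit in hits {
        match best.get(&hit.doc_id) {
            Some(existing) if existing.score >= hit.score => {}
            _ => {
                best.insert(hit.doc_id.clone(), hit);
            }
        }
    }
    best.into_values().collect()
}

/// Step 2: fold the question and every chunk into a single prompt,
/// numbering the chunks so the model can emit [n]-style citations.
fn build_answer_prompt(question: &str, chunks: &[Chunk]) -> String {
    let context: String = chunks
        .iter()
        .enumerate()
        .map(|(i, c)| format!("[{}] {}\n", i + 1, c.text))
        .collect();
    format!(
        "Answer the question using the context below. Cite sources as [n].\n\n\
         Context:\n{context}\nQuestion: {question}\nAnswer:"
    )
}
```

One consequence of this single-call design is that answer quality and citation accuracy are coupled: any improvement idea (reranking, full-article retrieval, separate citation passes) changes either the chunk list or the prompt above.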

Ideas

Integrations

shinohara-rin commented 1 month ago

I've always thought it's a good idea not to rely solely on embedding models for the retrieval of relevant code.

In reality, human developers rely heavily on language servers, so it would make sense to let LLMs make use of them too (autocompletion, retrieving defs/refs, type inference, argument hints, docs, etc.).

For instance, we could extract all symbols inside the user's selection and feed their definitions and/or references to the LLM, as sketched below. This is of course a rather naive approach, but it should help the LLM understand the code.
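
A rough sketch of that idea, under loud assumptions: `Symbol` and the `LanguageServer` trait are hypothetical placeholders, not an existing Tabby API; a real implementation would speak LSP (e.g. `textDocument/definition`) to a running server:

```rust
// Hypothetical types -- not Tabby's API. A real client would issue LSP
// requests and read the definition text from the returned locations.
struct Symbol {
    name: String,
}

trait LanguageServer {
    /// Resolve a symbol's definition site and return its source text.
    fn definition_of(&self, symbol: &Symbol) -> Option<String>;
}

/// Gather the definitions of every symbol found in the user's selection
/// and render them as additional context for the answer prompt.
fn lsp_context(selection_symbols: &[Symbol], ls: &dyn LanguageServer) -> String {
    selection_symbols
        .iter()
        .filter_map(|sym| {
            ls.definition_of(sym)
                .map(|def| format!("// definition of `{}`\n{}\n\n", sym.name, def))
        })
        .collect()
}
```

The resulting string could simply be appended to the retrieved chunks in step 2 above, so language-server context and embedding-based retrieval complement rather than replace each other.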