danswer-ai / danswer

Gen-AI Chat for Teams - Think ChatGPT if it had access to your team's unique knowledge.
https://docs.danswer.dev/

Feature Request - Integration of Evaluation Pipeline for LLM Performance Tracking #979

Open AbirKorched opened 8 months ago

AbirKorched commented 8 months ago

I would like to ask whether an evaluation pipeline could be integrated to improve the tracking of LLM performance.

Suggestion:

I propose integrating an evaluation pipeline that leverages tools such as TruLens to streamline the evaluation process. This would help assess the model's accuracy and effectiveness and support ongoing improvements.
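
To make the suggestion more concrete, here is a minimal sketch of what such a pipeline could look like. It is only an illustration under stated assumptions: `answer_fn` is a hypothetical callable standing in for whatever produces an answer from the system, `EvalCase` and `run_eval` are made-up names, and the token-overlap F1 is a simple stand-in metric. It does not use the TruLens API; a tool like TruLens would replace the scoring step.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class EvalCase:
    """One hand-labeled evaluation example (hypothetical structure)."""
    question: str
    expected: str


def token_f1(predicted: str, expected: str) -> float:
    """Token-overlap F1 between two strings (a simple stand-in metric)."""
    pred_tokens = predicted.lower().split()
    exp_tokens = expected.lower().split()
    if not pred_tokens or not exp_tokens:
        return 0.0
    # Multiset intersection counts each shared token at most as often
    # as it appears in both strings.
    overlap = sum((Counter(pred_tokens) & Counter(exp_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(exp_tokens)
    return 2 * precision * recall / (precision + recall)


def run_eval(answer_fn: Callable[[str], str], cases: Iterable[EvalCase]) -> dict:
    """Run every case through answer_fn and aggregate the scores."""
    scores = [token_f1(answer_fn(case.question), case.expected) for case in cases]
    mean = sum(scores) / len(scores) if scores else 0.0
    return {"n": len(scores), "mean_f1": mean, "scores": scores}


# Placeholder answer function; in practice this would call the real system.
def fake_answer(question: str) -> str:
    return "Paris is the capital" if "France" in question else "unknown"


cases = [
    EvalCase("What is the capital of France?", "Paris"),
    EvalCase("Who wrote Hamlet?", "Shakespeare"),
]
report = run_eval(fake_answer, cases)
print(report)
```

Storing a report like this per run (for example keyed by date and model version) would be one way to track performance over time.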

Thank you for considering this feature request. I believe it would be a valuable addition to enhance the overall performance tracking and evaluation capabilities of the LLM.

tymonstuff commented 8 months ago

I would be really interested in seeing this added. At the enterprise level, evaluation is fundamental. If we can get support or guidance on how best to implement this (even as a guide), I'd love to contribute.