deepset-ai / haystack

AI orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data. With advanced retrieval methods, it's best suited for building RAG, question answering, semantic search or conversational agent chatbots.
https://haystack.deepset.ai
Apache License 2.0

Add Retrieval Augmented Generation evaluation features. #5126

Closed ZanSara closed 1 year ago

ZanSara commented 1 year ago

Discussed in https://github.com/deepset-ai/haystack/discussions/5076

Originally posted by **muazhari** June 5, 2023

Adding an evaluation system other than BEIR, such as KILT (https://ai.facebook.com/tools/kilt/) or ALCE (https://arxiv.org/abs/2305.14627), could be useful. I need this to evaluate my LFQA/RAG/Re2G application. RAG is also a hot topic right now, so I think it deserves attention.
bogdankostic commented 1 year ago

Closing this issue as we won't integrate KILT or ALCE. We're working on enabling evaluation of generative models in https://github.com/deepset-ai/haystack/issues/4546