Add evaluation root project

kbeaugrand commented 1 month ago

Implementing Evaluation Based on RAGAS Framework

Description

This pull request marks the beginning of our implementation of evaluation metrics for our Retrieval Augmented Generation (RAG) pipelines using the RAGAS framework.

Background

RAGAS (RAG Assessment) is a comprehensive framework designed to evaluate RAG pipelines. RAG pipelines utilize external data to enhance the context provided to Large Language Models (LLMs). While building these pipelines is facilitated by existing tools, evaluating their performance quantitatively remains a challenge. RAGAS addresses this gap by offering tools based on cutting-edge research to evaluate LLM-generated text and provide valuable insights into the effectiveness of RAG pipelines.

Features to be Implemented

The implementation will leverage Kernel Memory to deliver the following evaluation features:

Faithfulness: Ensuring the generated text accurately represents the source information.
Answer Relevancy: Assessing the pertinence of the answer in relation to the query.
Context Recall: Measuring the proportion of relevant context retrieved.
Context Precision: Evaluating whether the relevant chunks in the retrieved context are ranked ahead of the irrelevant ones.
Context Relevancy: Determining the relevance of the provided context to the query.
Context Entity Recall: Checking the retrieval of key entities within the context.
Answer Semantic Similarity: Comparing the semantic similarity between the generated answer and the expected answer.
Answer Correctness: Verifying the factual correctness of the generated answers.
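
To make a few of these metrics concrete, here is a rough Python sketch of the arithmetic behind them, following the definitions as commonly described for RAGAS. It is not the Kernel Memory implementation in this pull request. The function names are hypothetical, and the boolean judgments (is a claim supported, is a chunk relevant) are assumed to be supplied by an LLM judge upstream, which is not shown.

```python
from math import sqrt
from typing import Sequence


def faithfulness(claim_supported: Sequence[bool]) -> float:
    """Fraction of the answer's claims that the retrieved context supports."""
    if not claim_supported:
        return 0.0
    return sum(claim_supported) / len(claim_supported)


def context_recall(truth_sentence_attributed: Sequence[bool]) -> float:
    """Fraction of ground-truth sentences attributable to the retrieved context."""
    if not truth_sentence_attributed:
        return 0.0
    return sum(truth_sentence_attributed) / len(truth_sentence_attributed)


def context_precision(chunk_relevant: Sequence[bool]) -> float:
    """Mean of precision@k taken over the ranks k that hold a relevant chunk.

    chunk_relevant is ordered by retrieval rank, best match first.
    """
    hits = 0
    total = 0.0
    for k, relevant in enumerate(chunk_relevant, start=1):
        if relevant:
            hits += 1
            total += hits / k  # precision@k at this relevant rank
    return total / hits if hits else 0.0


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two embedding vectors (answer semantic similarity)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)
```

For example, an answer with four claims of which three are supported scores 0.75 on faithfulness, and a retrieval whose only relevant chunk sits at rank 2 scores 0.5 on context precision.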

Integration

RAGAS will be integrated into our CI/CD pipeline to enable continuous performance monitoring and evaluation of our RAG pipelines. This integration will ensure that our RAG systems consistently meet the desired performance benchmarks.
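
As a sketch of what such a CI/CD check might look like, the snippet below compares a set of metric scores against minimum thresholds and yields a non-zero exit code when any metric falls short. Everything here is an assumption for illustration: the threshold values, the metric keys, and the function names are hypothetical, and the actual pipeline configuration is not part of this pull request.

```python
from typing import List, Mapping

# Hypothetical minimum acceptable scores; real values would be agreed per pipeline.
THRESHOLDS = {
    "faithfulness": 0.80,
    "answer_relevancy": 0.75,
    "context_recall": 0.70,
}


def failing_metrics(scores: Mapping[str, float],
                    thresholds: Mapping[str, float]) -> List[str]:
    """Return the metrics that are missing or below their minimum score."""
    failures = []
    for name, minimum in thresholds.items():
        score = scores.get(name)
        if score is None or score < minimum:
            failures.append(name)
    return failures


def exit_code(scores: Mapping[str, float],
              thresholds: Mapping[str, float] = THRESHOLDS) -> int:
    """Print one line per failing metric; return 1 on any failure, else 0."""
    failures = failing_metrics(scores, thresholds)
    for name in failures:
        print(f"FAIL {name}: got {scores.get(name)}, need >= {thresholds[name]}")
    return 1 if failures else 0
```

A CI step could pass the result of `exit_code(...)` to `sys.exit` so that a regression in any metric fails the build.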

Next Steps

Implement evaluation metrics: Develop the specified evaluation features using Kernel Memory.
Unit tests: Add tests that cover the evaluation framework.
Integrate with CI/CD: Configure the evaluation checks to run automatically in our CI/CD pipeline.

dluc commented 1 month ago

Looks like the Release build is broken, maybe something's been removed from the solution?

kbeaugrand commented 1 month ago

I'll take a look asap.

dluc commented 1 month ago

No worries, I just pushed a fix.

microsoft / kernel-memory