User Story
As an engineer, I would like to use an off-the-shelf LLM such as GPT-4, LangChain's retrievers, and Cohere embeddings to establish baseline performance for our agentic RAG system, so that we can experiment with different methods to improve retrieval and generation.
Detailed Description
Establish the baseline performance for our agentic RAG system by using OpenAI GPT-4 for response generation and Cohere embeddings for document encoding, focusing exclusively on semantic retrieval without incorporating lexical search or reranking methods.
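The retrieve-then-generate flow described above can be sketched in plain Python. This is a minimal illustration only: `embed` is a toy stand-in for a Cohere embeddings call and `generate` is a stub standing in for a GPT-4 completion (both are hypothetical, not the real APIs), so the sketch shows the semantic-only retrieval shape of the baseline rather than actual client usage.

```python
import math

def embed(text):
    """Hypothetical stand-in for a Cohere embeddings call: a toy
    bag-of-letters vector, used only to make the sketch runnable."""
    vec = [0.0] * 26
    for ch in text.lower():
        if ch.isalpha():
            vec[ord(ch) - ord("a")] += 1.0
    return vec

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, documents, k=2):
    """Semantic-only retrieval: rank documents purely by embedding
    similarity to the query (no lexical search, no reranking)."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def generate(query, context):
    """Hypothetical stand-in for a GPT-4 call: the real baseline would
    send the query plus retrieved context to the LLM."""
    return f"Answer to {query!r} grounded in {len(context)} retrieved documents."

docs = ["cats purr when content", "dogs bark at strangers", "parrots mimic speech"]
top = retrieve("why do cats purr", docs, k=1)
print(generate("why do cats purr", top))
```

In the real baseline, `embed` and `generate` would be replaced by the Cohere embeddings and GPT-4 calls (e.g. wired together through LangChain's retriever abstractions), while the ranking step stays purely similarity-based as the story requires.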
Acceptance Criteria
[ ] Given the RAG system is implemented with OpenAI GPT-4 and Cohere embeddings, when a query is submitted, then the system retrieves documents based solely on semantic similarity, without lexical search or reranking.
[ ] Given the baseline evaluation has been run, then the baseline metrics (Recall@k, Precision@k, F1 score, etc.) are accurately calculated and documented as the reference point for later experiments.
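The baseline metrics named in the criteria above can be computed per query with plain Python; the function name and the ranked-list/relevance-set layout below are illustrative assumptions, not part of the story.

```python
def precision_recall_f1_at_k(retrieved, relevant, k):
    """Compute Precision@k, Recall@k, and their F1 for one query.

    retrieved: ranked list of document ids returned by the retriever
    relevant:  set of document ids judged relevant for the query
    """
    top_k = retrieved[:k]
    hits = sum(1 for doc_id in top_k if doc_id in relevant)
    precision = hits / k
    recall = hits / len(relevant) if relevant else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# One query: the retriever returned d1..d4; d1 and d3 are relevant, d9 was missed,
# giving Precision@4 = 2/4, Recall@4 = 2/3, and F1 their harmonic mean.
p, r, f = precision_recall_f1_at_k(["d1", "d2", "d3", "d4"], {"d1", "d3", "d9"}, k=4)
print(p, r, f)
```

Averaging these per-query values over the evaluation set gives the documented baseline numbers that later retrieval and generation experiments are compared against.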