Enhancing Existing RAG Logic for Superior AI Responses

Background

The Retrieval-Augmented Generation (RAG) system draws on our extensive hacking database in Pinecone so that HackerGPT can deliver detailed and precise responses. To improve the system's efficiency and response quality, we aim to optimize how we process data and execute queries within Pinecone.

Objective

Our goal is to upgrade the RAG system so that AI responses are more accurate and contextually relevant. This includes adopting more advanced techniques for data embedding and more sophisticated querying methods. These enhancements are meant to improve the user experience without requiring extensive fine-tuning of the base language model.

Actions and Considerations (ACC)

Advanced Data Handling and Embedding:
- [x] Develop a tool or script, ideally in JavaScript or Python, that can process and embed data from a range of formats (Markdown, PDF, TXT) using advanced techniques and platforms, such as unstructured.io. This tool is intended to improve the quality of vectors for Pinecone queries.
- [x] Create an advanced system for extracting and embedding data that generates the most accurate and relevant vectors possible, explicitly using text-embedding-3-large for embedding. This system must handle the intricacies of unstructured data efficiently and optimize it for our RAG system.
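
The embedding items above could start from a chunk-then-embed step. The following is a minimal sketch under stated assumptions, not the project's implementation: `chunkText`, `embedChunks`, the `Chunk` and `EmbedFn` types, and the default chunk size and overlap are all hypothetical. The embedding call is injected as a function so the sketch does not commit to any particular client API; in practice it would wrap a call to text-embedding-3-large.

```typescript
// Hypothetical types and names for illustration; not taken from the project.
type Chunk = { id: string; text: string };
type EmbedFn = (texts: string[]) => Promise<number[][]>;

// Split text into overlapping character windows so each vector covers a
// focused span while neighbouring chunks share some context.
function chunkText(source: string, text: string, size = 1000, overlap = 200): Chunk[] {
  if (size <= 0 || overlap < 0 || overlap >= size) {
    throw new Error("expected size > 0 and 0 <= overlap < size");
  }
  // Collapse runs of whitespace left over from Markdown/PDF/TXT extraction.
  const normalized = text.replace(/\s+/g, " ").trim();
  const chunks: Chunk[] = [];
  const step = size - overlap;
  for (let start = 0; start < normalized.length; start += step) {
    chunks.push({
      id: `${source}#${chunks.length}`,
      text: normalized.slice(start, start + size),
    });
    // Stop once the current window has reached the end of the text.
    if (start + size >= normalized.length) break;
  }
  return chunks;
}

// Embed every chunk with the injected function and pair each vector with
// its chunk id and original text, ready to be upserted into a vector store.
async function embedChunks(chunks: Chunk[], embed: EmbedFn) {
  const vectors = await embed(chunks.map((c) => c.text));
  return chunks.map((c, i) => ({
    id: c.id,
    values: vectors[i],
    metadata: { text: c.text },
  }));
}
```

Keeping the original text in `metadata` is one way to make the retrieved passage available as RAG context without a second lookup; whether the project stores text this way is an assumption.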

Sophisticated Query Optimization and Execution:
- [x] Formulate a method to compile chat history and other relevant data into a single, comprehensive query for Pinecone, enhancing the system's ability to discern context and relevance.
- [x] Apply advanced NLP techniques and algorithms to not only improve the embedding process but also to refine the querying mechanism. This involves using sophisticated models and methods to accurately understand and interpret user queries, ensuring that search results from Pinecone are as precise and relevant as possible.
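
One simple way to compile chat history into a single query is to join the most recent turns within a character budget. The sketch below assumes this approach; `buildRetrievalQuery`, the `ChatMessage` type, and the default limits are hypothetical and not taken from the project's code.

```typescript
// Hypothetical message shape for illustration.
type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

// Build one retrieval query from the most recent non-system turns,
// oldest first, staying within a character budget.
function buildRetrievalQuery(
  messages: ChatMessage[],
  maxTurns = 4,
  maxChars = 1500
): string {
  const recent = messages.filter((m) => m.role !== "system").slice(-maxTurns);
  const parts: string[] = [];
  let used = 0;
  // Walk backwards so the latest message is always included; it is kept
  // whole even if it alone exceeds the budget.
  for (let i = recent.length - 1; i >= 0; i--) {
    const line = `${recent[i].role}: ${recent[i].content.trim()}`;
    if (parts.length > 0 && used + line.length + 1 > maxChars) break;
    parts.unshift(line);
    used += line.length + 1; // +1 for the joining newline
  }
  return parts.join("\n");
}
```

Prefixing each line with its role is a design choice that lets the embedding capture who asked what; a rewrite step that condenses the history into one standalone question with an LLM is a common alternative.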

Expected Outcomes

An improved RAG system capable of delivering superior AI responses through enhancements in data processing, embedding techniques, and query precision.
A cost-effective and efficient alternative to extensive base LLM fine-tuning, leveraging cutting-edge technological advancements for enhanced performance.

This part of the code runs in "app/api/chat/mistral/route.ts" and checks whether the last message sent by the user meets all the criteria. The message must be between 50 and 1500 characters long (both bounds exclusive). If the message is not in English, we translate it using a dictionary of the 100 most popular words, which may not be the most accurate method. We then send the message to Pinecone, and if we get a response, the system message is rewritten to include two prompts, one with the HackerGPT instructions and one with the Pinecone instructions, followed by the RAG context.

if (!latestUserMessage.startsWith("Assist with the user's query:")) {
  // Query Pinecone only when it is enabled and the last message is a user
  // message whose length lies strictly between the configured bounds.
  if (
    llmConfig.usePinecone &&
    cleanedMessages.length > 0 &&
    cleanedMessages[cleanedMessages.length - 1].role === "user" &&
    cleanedMessages[cleanedMessages.length - 1].content.length >
      llmConfig.pinecone.messageLength.min &&
    cleanedMessages[cleanedMessages.length - 1].content.length <
      llmConfig.pinecone.messageLength.max
  ) {
    // Translate non-English messages before querying.
    if (!(await isEnglish(latestUserMessage))) {
      latestUserMessage = await translateToEnglish(
        latestUserMessage,
        openRouterUrl,
        openRouterHeaders,
        llmConfig.models.translation
      )
    }

    const pineconeResults = await queryPineconeVectorStore(
      latestUserMessage,
      llmConfig.openai.apiKey,
      llmConfig.pinecone
    )

    // When Pinecone returned results, switch to the Pinecone temperature and
    // rebuild the system message with both prompts plus the RAG context.
    if (pineconeResults !== "None") {
      modelTemperature = pineconeTemperature

      cleanedMessages[0].content =
        `${llmConfig.systemPrompts.hackerGPT} ` +
        `${llmConfig.systemPrompts.pinecone} ` +
        `RAG Context:\n ${pineconeResults}`
    }
  }
}
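
The eligibility check in the snippet above could be pulled into a small pure function, which makes the criteria easier to read and to test. This is a sketch under stated assumptions: `shouldQueryPinecone`, `Message`, and `LengthBounds` are hypothetical names, and the prefix check on "Assist with the user's query:" is deliberately left out.

```typescript
// Hypothetical shapes for illustration; the real config lives in llmConfig.
type Message = { role: string; content: string };
type LengthBounds = { min: number; max: number };

// Returns true only when Pinecone is enabled and the last message is a
// user message whose length lies strictly between the two bounds.
function shouldQueryPinecone(
  messages: Message[],
  usePinecone: boolean,
  bounds: LengthBounds
): boolean {
  if (!usePinecone) return false;
  const last = messages[messages.length - 1];
  if (!last || last.role !== "user") return false;
  const length = last.content.length;
  return length > bounds.min && length < bounds.max;
}
```

Because both comparisons are strict, a message of exactly `min` or `max` characters is skipped, which matches the behavior of the original condition.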
