Closed RostyslavManko closed 5 months ago
This part of the code runs in "app/api/chat/mistral/route.ts" and checks whether the last message sent by the user meets all the criteria for a Pinecone lookup. The message must be longer than the configured minimum and shorter than the configured maximum (50 and 1500 characters). If the message is not detected as English, it is translated to English first; the detection relies on a dictionary of the 100 most popular words, which may not be the most accurate method. We then send the message to Pinecone, and if we get a result, the system message is rewritten to combine two prompts, the HackerGPT instructions and the Pinecone instructions, followed by the RAG context.
    if (!latestUserMessage.startsWith("Assist with the user's query:")) {
      if (
        llmConfig.usePinecone &&
        cleanedMessages.length > 0 &&
        cleanedMessages[cleanedMessages.length - 1].role === "user" &&
        cleanedMessages[cleanedMessages.length - 1].content.length >
          llmConfig.pinecone.messageLength.min &&
        cleanedMessages[cleanedMessages.length - 1].content.length <
          llmConfig.pinecone.messageLength.max
      ) {
        if (!(await isEnglish(latestUserMessage))) {
          latestUserMessage = await translateToEnglish(
            latestUserMessage,
            openRouterUrl,
            openRouterHeaders,
            llmConfig.models.translation
          )
        }
        const pineconeResults = await queryPineconeVectorStore(
          latestUserMessage,
          llmConfig.openai.apiKey,
          llmConfig.pinecone
        )
        if (pineconeResults !== "None") {
          modelTemperature = pineconeTemperature
          cleanedMessages[0].content =
            `${llmConfig.systemPrompts.hackerGPT} ` +
            `${llmConfig.systemPrompts.pinecone} ` +
            `RAG Context:\n ${pineconeResults}`
        }
      }
    }
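To make the weakness of the word-dictionary approach easier to discuss, here is a minimal sketch of how such a check might work. It assumes the dictionary of popular words is used to decide whether a message is English. The names looksEnglish, COMMON_ENGLISH_WORDS, and ENGLISH_RATIO_THRESHOLD, the shortened word list, and the 0.2 threshold are all hypothetical and are not taken from the repository's isEnglish.

```typescript
// Hypothetical stand-in for the 100-word dictionary (shortened for the sketch).
const COMMON_ENGLISH_WORDS: ReadonlySet<string> = new Set([
  "the", "be", "to", "of", "and", "a", "in", "that", "have", "i",
  "it", "for", "not", "on", "with", "he", "as", "you", "do", "at",
  "this", "but", "his", "by", "from", "they", "we", "say", "her", "she",
  "or", "an", "will", "my", "one", "all", "would", "there", "their", "what",
  "is", "are", "how", "can", "was", "if", "me", "your", "which", "when"
])

// Assumed cut-off: share of tokens that must be common English words.
const ENGLISH_RATIO_THRESHOLD = 0.2

function looksEnglish(
  text: string,
  threshold: number = ENGLISH_RATIO_THRESHOLD
): boolean {
  // Split on anything that is not a Latin letter or an apostrophe.
  // Text in non-Latin scripts yields no tokens and is treated as non-English.
  const tokens = text
    .toLowerCase()
    .split(/[^a-z\u00c0-\u024f']+/)
    .filter(token => token.length > 0)

  if (tokens.length === 0) return false

  const hits = tokens.filter(token => COMMON_ENGLISH_WORDS.has(token)).length
  return hits / tokens.length >= threshold
}
```

A check like this is cheap, but it can misclassify messages that are mostly commands, payloads, or tool names, because those contain few dictionary words. That is one reason a dedicated language-detection step may be worth considering.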
Background
The Retrieval-Augmented Generation (RAG) system underpins HackerGPT's ability to deliver detailed and precise responses by drawing on our extensive hacking knowledge base stored in Pinecone. To improve the system's efficiency and response quality, we aim to refine how we process data and run queries against Pinecone.
Objective
Our goal is to upgrade the RAG system so that AI responses are more accurate and contextually relevant. This means adopting more advanced data embedding techniques and more sophisticated querying methods. These enhancements should improve the user experience without requiring extensive fine-tuning of the base language model.
Actions and Considerations (ACC)
Advanced Data Handling and Embedding:
Use text-embedding-3-large for embedding to get the most accurate and relevant vectors possible. This system must efficiently manage the intricacies of unstructured data, optimizing it for our RAG system.
Sophisticated Query Optimization and Execution:
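Whether we are embedding documents for the index or a user query at search time, the same model has to be used on both sides. Below is a minimal sketch of requesting a text-embedding-3-large vector from the OpenAI embeddings endpoint with fetch. The helper names buildEmbeddingRequest and embedText are hypothetical and not part of the current codebase; only the endpoint, the model name, and the request and response fields come from the OpenAI API.

```typescript
// Only the fields this sketch reads from the embeddings response.
interface EmbeddingResponse {
  data: { embedding: number[] }[]
}

const EMBEDDING_MODEL = "text-embedding-3-large"
const OPENAI_EMBEDDINGS_URL = "https://api.openai.com/v1/embeddings"

// Hypothetical helper: builds the fetch options separately so they can be
// inspected or tested without making a network call.
function buildEmbeddingRequest(text: string, apiKey: string) {
  return {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`
    },
    body: JSON.stringify({ model: EMBEDDING_MODEL, input: text })
  }
}

// Hypothetical helper: returns the embedding vector for a single text.
async function embedText(text: string, apiKey: string): Promise<number[]> {
  const response = await fetch(
    OPENAI_EMBEDDINGS_URL,
    buildEmbeddingRequest(text, apiKey)
  )
  if (!response.ok) {
    throw new Error(`Embedding request failed with status ${response.status}`)
  }
  const json = (await response.json()) as EmbeddingResponse
  return json.data[0].embedding
}
```

One thing to keep in mind: text-embedding-3-large returns 3072-dimensional vectors by default, so the Pinecone index dimension has to match, or the request has to set the optional dimensions parameter to a smaller size. Vectors produced by a different embedding model are not comparable, so existing data would need to be re-embedded when switching.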
Expected Outcomes