Closed: eric-gardyn closed this issue 7 months ago
Hi Eric,
If I understand what you're trying to do, you want to test the difference in performance between two embedding models. Here's how you could do that:

Create two ingest.config.ts files, each with a different embedder function and embeddedContentStore:

// ingest.config-ada-02.ts - using ada-002
// imports and setup
export default {
  embedder: async () => {
    // Use dynamic import because `@azure/openai` is an ESM package
    // and this file is a CommonJS module.
    const { OpenAIClient, OpenAIKeyCredential } = await import("@azure/openai");
    return makeOpenAiEmbedder({
      openAiClient: new OpenAIClient(new OpenAIKeyCredential(OPENAI_API_KEY)),
      deployment: "text-embedding-ada-002", // or whatever your deployment name is
      backoffOptions: {
        numOfAttempts: 25,
        startingDelay: 1000,
      },
    });
  },
  embeddedContentStore: () =>
    makeMongoDbEmbeddedContentStore({
      connectionUri: MONGODB_CONNECTION_URI,
      databaseName: "ada-content", // or whatever your database name is
    }),
  // other config the same
} satisfies Config;
// ingest.config-3-large.ts - using text-embedding-3-large
// imports and setup same as above
export default {
  embedder: async () => {
    // Use dynamic import because `@azure/openai` is an ESM package
    // and this file is a CommonJS module.
    const { OpenAIClient, OpenAIKeyCredential } = await import("@azure/openai");
    return makeOpenAiEmbedder({
      openAiClient: new OpenAIClient(new OpenAIKeyCredential(OPENAI_API_KEY)),
      deployment: "text-embedding-3-large", // or whatever your deployment name is
      backoffOptions: {
        numOfAttempts: 25,
        startingDelay: 1000,
      },
    });
  },
  embeddedContentStore: () =>
    makeMongoDbEmbeddedContentStore({
      connectionUri: MONGODB_CONNECTION_URI,
      databaseName: "text-embedding-3-large-content", // or whatever your database name is
    }),
  // other config the same
} satisfies Config;
You'll need to set up an Atlas Vector Search index for each of these embedded content collections: https://mongodb.github.io/chatbot/mongodb#3-create-atlas-vector-search-index-required-for-rag.
You can give the indexes different names, say "ada-02" and "text-embedding-3-large".
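For reference, here's a sketch of what the two index definitions might look like. This assumes the ingest tool's default "embedded_content" collection with the vectors stored in an "embedding" field, and the published output sizes of the two models (1536 dimensions for text-embedding-ada-002, 3072 for text-embedding-3-large); verify both against your deployment before creating the indexes.

```typescript
// Sketch only: Atlas Vector Search index definitions for the two stores.
// Collection and field names assume the ingest tool's defaults
// ("embedded_content" collection, "embedding" field) -- verify your setup.

const ada02Index = {
  name: "ada-02",
  type: "vectorSearch",
  definition: {
    fields: [
      {
        type: "vector",
        path: "embedding",
        numDimensions: 1536, // text-embedding-ada-002 output size
        similarity: "cosine",
      },
    ],
  },
};

const large3Index = {
  name: "text-embedding-3-large",
  type: "vectorSearch",
  definition: {
    fields: [
      {
        type: "vector",
        path: "embedding",
        numDimensions: 3072, // text-embedding-3-large output size
        similarity: "cosine",
      },
    ],
  },
};
```

You can create these in the Atlas UI, or programmatically with the Node driver's createSearchIndex, e.g. db.collection("embedded_content").createSearchIndex(ada02Index), run once against each database.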
In your server config, use the corresponding embedder and embedded content store for each model. Make sure to also use the matching Atlas Vector Search index for each EmbeddedContentStore. You could either run two server instances side by side or test them in sequence.
// server config using ada-002
// rest the same
const embeddedContentStore = makeMongoDbEmbeddedContentStore({
  connectionUri: MONGODB_CONNECTION_URI,
  databaseName: "ada-content",
});

const embedder = makeOpenAiEmbedder({
  openAiClient,
  deployment: "text-embedding-ada-002",
  backoffOptions: {
    numOfAttempts: 3,
    maxDelay: 5000,
  },
});
// rest the same
// server config using text-embedding-3-large
// rest the same
const embeddedContentStore = makeMongoDbEmbeddedContentStore({
  connectionUri: MONGODB_CONNECTION_URI,
  databaseName: "text-embedding-3-large-content",
});

const embedder = makeOpenAiEmbedder({
  openAiClient,
  deployment: "text-embedding-3-large",
  backoffOptions: {
    numOfAttempts: 3,
    maxDelay: 5000,
  },
});
// rest the same
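Since each server config has to keep three names in sync (the deployment, the database, and the vector search index), a small helper can make them harder to mix up. This is a hypothetical sketch, not part of the library; the names mirror the examples above, so adjust them to your actual deployments:

```typescript
// Hypothetical helper: keeps the deployment / database / index names for each
// experiment paired, so a server config can't accidentally combine, say, the
// ada-002 embedder with the 3-large content store.
interface EmbeddingExperiment {
  deployment: string; // Azure OpenAI deployment name
  databaseName: string; // MongoDB database holding the embedded content
  vectorSearchIndexName: string; // Atlas Vector Search index on that content
}

const experiments: Record<string, EmbeddingExperiment> = {
  "ada-02": {
    deployment: "text-embedding-ada-002",
    databaseName: "ada-content",
    vectorSearchIndexName: "ada-02",
  },
  "3-large": {
    deployment: "text-embedding-3-large",
    databaseName: "text-embedding-3-large-content",
    vectorSearchIndexName: "text-embedding-3-large",
  },
};

// Pick the experiment once (e.g. from an env var) and thread it through the
// server config, instead of hard-coding each of the three names separately.
function getExperiment(key: string): EmbeddingExperiment {
  const exp = experiments[key];
  if (!exp) {
    throw new Error(`Unknown embedding experiment: ${key}`);
  }
  return exp;
}
```

Then both makeMongoDbEmbeddedContentStore({ connectionUri, databaseName: exp.databaseName }) and the embedder's deployment: exp.deployment read from the same record, and switching the whole server between models is a one-line change.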
To make testing multiple embedding models side by side easier in the future, we're planning to support generating embeddings from multiple models with the same ingest config file. However, there's no immediate plan to implement that, so to test multiple models now, I recommend the approach above.
I'll leave this issue open for now in case you have any additional questions.
Thanks for raising all these issues. We really appreciate your interest in the project.
I ended up modifying the 'makeApp' function into a custom function like this:
export interface LLLConfig
  extends Omit<AppConfig, 'maxRequestTimeoutMs' | 'serveStaticSite'> {
  apiPrefix: string
}

export interface ServerConfig
  extends Omit<AppConfig, 'conversationsRouterConfig'> {}

export const makeAppAPI = async ({
  app,
  serverConfig,
  llmConfigs,
}: {
  app: Express
  serverConfig: ServerConfig
  llmConfigs: LLLConfig[]
}): Promise<Express> => {
  const {
    maxRequestTimeoutMs = DEFAULT_MAX_REQUEST_TIMEOUT_MS,
    corsOptions,
    serveStaticSite,
  } = serverConfig
  logger.info('Server has the following configuration:')
  logger.info(
    stringifyFunctions(
      cloneDeep(llmConfigs) as unknown as Record<string, unknown>
    )
  )
  app.use(makeHandleTimeoutMiddleware(maxRequestTimeoutMs))
  app.set('trust proxy', true)
  app.use(cors(corsOptions))
  app.use(express.json())
  app.use(reqHandler)
  if (serveStaticSite) {
    app.use('/', express.static(path.join(__dirname, '..', 'static')))
  }
  app.get('/health', (_req, res) => {
    const data = {
      uptime: process.uptime(),
      message: 'Ok',
      date: new Date(),
    }
    res.status(200).send(data)
  })
  llmConfigs.forEach((llmConfig) => {
    app.use(
      `${llmConfig.apiPrefix}/conversations`,
      makeConversationsRouter(llmConfig.conversationsRouterConfig)
    )
  })
  app.all('*', (req, res, _next) => {
    return sendErrorResponse({
      reqId: getRequestId(req),
      res,
      httpStatus: 404,
      errorMessage: 'Not Found',
    })
  })
  // Error-handling middleware is registered last so it can catch route errors
  app.use(errorHandler)
  return app
}
And an example of an LLLConfig:
const llmConfig3: LLLConfig = {
  conversationsRouterConfig: {
    llm: llm3,
    conversations,
    generateUserPrompt,
    middleware: [someMiddleware],
  },
  apiPrefix: '/api/gpt35',
}
Nice! The implementation makes sense to me.
mongodb-chatbot-server uses VECTOR_SEARCH_INDEX_NAME from the .env variables, but mongodb-rag-ingest does not. I wanted to set up two environments, one with text-embedding-ada-002 and one with text-embedding-3-large, so I could test the difference in performance. I was wondering if I could use VECTOR_SEARCH_INDEX_NAME for that purpose (basically having two "embedded_content" collections, each with its own search index).