langchain-ai / langchainjs

🦜🔗 Build context-aware reasoning applications 🦜🔗
https://js.langchain.com/docs/
MIT License

Migrate to text-embedding-3-large #4163

Closed (ksjogo closed this issue 2 months ago)

ksjogo commented 6 months ago

Are there any utilities to migrate from ada embeddings to text-embedding-3-large?

dosubot[bot] commented 6 months ago

🤖

Hey @ksjogo, nice to cross paths with you again here in the world of code! How are things going on your end?

Based on the current information available in the langchainjs repository, there are no specific utilities or methods provided for migrating from ada embeddings to text-embedding-3-large. The Embeddings class and its subclasses, such as RandomEmbeddings and CacheBackedEmbeddings, are used for handling embeddings in the framework. However, these classes do not mention or provide support for ada embeddings or text-embedding-3-large.

Here is the relevant code for handling embeddings in the langchainjs framework:

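// Imports added for completeness (assumed public entrypoints; the original
// test file may import these differently).
import { expect } from "@jest/globals";
import { Embeddings } from "@langchain/core/embeddings";
import { CacheBackedEmbeddings } from "langchain/embeddings/cache_backed";
import { InMemoryStore } from "langchain/storage/in_memory";

// Toy embeddings class used by the test: every text gets a random 2-dimensional vector.
class RandomEmbeddings extends Embeddings {
  async embedDocuments(documents: string[]): Promise<number[][]> {
    const quoteUnquoteEmbeddings = [];
    for (const document of documents) {
      quoteUnquoteEmbeddings.push(await this.embedQuery(document));
    }
    return quoteUnquoteEmbeddings;
  }

  async embedQuery(_document: string): Promise<number[]> {
    return [Math.random(), Math.random()];
  }
}

// Wrap the toy model in a cache backed by an in-memory byte store.
const embeddingsCache = CacheBackedEmbeddings.fromBytesStore(
  new RandomEmbeddings({}),
  new InMemoryStore()
);
const documents = ["How are you?", "I am fine", "I am LangChain"];
const result = await embeddingsCache.embedDocuments(documents);
// The second call is served from the cache, so the random vectors match.
const result2 = await embeddingsCache.embedDocuments(documents);
expect(result).toEqual(result2);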

Source: langchain/src/embeddings/tests/cache.test.ts
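
One caching detail matters when two embedding models coexist during a migration: vectors cached for the old model must never be returned for the new one. The sketch below is a self-contained illustration, not LangChain code. `EmbeddingsLike`, `FixedSizeModel`, and `NamespacedCache` are hypothetical names, and the fake model returns zero vectors so the example runs offline. It keys each cache entry by a per-model namespace; LangChain's `CacheBackedEmbeddings.fromBytesStore` accepts a `namespace` option that appears to serve the same purpose.

```typescript
// Hypothetical minimal interface; real embedding classes expose more methods.
interface EmbeddingsLike {
  embedDocuments(texts: string[]): Promise<number[][]>;
}

// Offline stand-in for a model: returns zero vectors of a fixed size
// and counts how often it is called.
class FixedSizeModel implements EmbeddingsLike {
  dims: number;
  calls = 0;
  constructor(dims: number) {
    this.dims = dims;
  }
  async embedDocuments(texts: string[]): Promise<number[][]> {
    this.calls += 1;
    return texts.map(() => new Array<number>(this.dims).fill(0));
  }
}

// Cache wrapper that prefixes every key with a namespace, so two models
// can share one store without reading each other's vectors.
class NamespacedCache implements EmbeddingsLike {
  underlying: EmbeddingsLike;
  namespace: string;
  store: Map<string, number[]>;
  constructor(
    underlying: EmbeddingsLike,
    namespace: string,
    store: Map<string, number[]>
  ) {
    this.underlying = underlying;
    this.namespace = namespace;
    this.store = store;
  }
  key(text: string): string {
    return `${this.namespace}:${text}`;
  }
  async embedDocuments(texts: string[]): Promise<number[][]> {
    // Only texts without a cached vector in this namespace hit the model.
    const missing = texts.filter((t) => !this.store.has(this.key(t)));
    if (missing.length > 0) {
      const vectors = await this.underlying.embedDocuments(missing);
      missing.forEach((t, i) => this.store.set(this.key(t), vectors[i]));
    }
    return texts.map((t) => this.store.get(this.key(t)) as number[]);
  }
}
```

With one shared store, a cache created with namespace `"old-model"` and another created with `"new-model"` each keep their own entry for the same text, so a lookup through the new cache never returns a vector of the old dimensionality.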

The terms "ada embeddings" and "text-embedding-3-large" are not mentioned in the provided code or the langchainjs repository. If these terms refer to specific types of embeddings or models, you may need to implement custom methods or classes to handle the migration.

If you could provide more information or context about ada embeddings and text-embedding-3-large, I might be able to give a more specific answer or guidance.

hinthornw commented 5 months ago

What specifically would you find helpful?

We have an indexing helper (https://js.langchain.com/docs/modules/data_connection/indexing/), but if you want to migrate to a new embedding model, you'll want to re-embed everything at once, most likely into a new index.
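
The suggestion above (re-embed everything into a new index) can be sketched roughly as follows. Vectors produced by different embedding models are not comparable, so only the ids and raw text are carried over and every vector is recomputed. This is a minimal, self-contained sketch under stated assumptions: `EmbeddingsLike`, `IndexedDoc`, `FakeNewModel`, and `reembedIntoNewIndex` are hypothetical names, the "index" is a plain `Map`, and the fake model stands in for a real embeddings client so the example runs offline.

```typescript
// Hypothetical minimal interface: a bulk re-embed only needs embedDocuments.
interface EmbeddingsLike {
  embedDocuments(texts: string[]): Promise<number[][]>;
}

// Hypothetical record type; real vector stores define their own.
interface IndexedDoc {
  id: string;
  text: string;
  vector: number[];
}

// Offline stand-in for the new model. Its vectors are deterministic and
// meaningless; replace it with a real embeddings client.
class FakeNewModel implements EmbeddingsLike {
  dims: number;
  calls = 0;
  constructor(dims: number) {
    this.dims = dims;
  }
  async embedDocuments(texts: string[]): Promise<number[][]> {
    this.calls += 1;
    return texts.map((t) =>
      Array.from({ length: this.dims }, (_, i) => (t.length + i) % 7)
    );
  }
}

// Re-embed every source text in batches and build a fresh index.
// Old vectors are never read; only ids and text are reused.
async function reembedIntoNewIndex(
  docs: { id: string; text: string }[],
  newModel: EmbeddingsLike,
  batchSize = 2
): Promise<Map<string, IndexedDoc>> {
  const newIndex = new Map<string, IndexedDoc>();
  for (let start = 0; start < docs.length; start += batchSize) {
    const batch = docs.slice(start, start + batchSize);
    const vectors = await newModel.embedDocuments(batch.map((d) => d.text));
    batch.forEach((doc, i) => {
      newIndex.set(doc.id, { id: doc.id, text: doc.text, vector: vectors[i] });
    });
  }
  return newIndex;
}
```

Building the new index alongside the old one means queries can keep hitting the old index until the re-embed finishes, after which the application switches both the query-time model and the index together.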

dosubot[bot] commented 2 months ago

Hi, @ksjogo,

I'm helping the langchainjs team manage their backlog and am marking this issue as stale. From what I understand, you were looking for utilities to help migrate from ada embeddings to text-embedding-3-large. The langchainjs repository does not currently provide a dedicated utility for this migration, so custom code may be needed. hinthornw also pointed to the indexing helper in the langchainjs documentation and suggested re-embedding everything at once, likely in a new index.

Could you please confirm if this issue is still relevant to the latest version of the langchainjs repository? If it is, please let the langchainjs team know by commenting on the issue. Otherwise, feel free to close the issue yourself, or the issue will be automatically closed in 7 days. Thank you!