c0sogi / LLMChat

A full-stack WebUI implementation of large language models, such as ChatGPT or LLaMA.
MIT License
245 stars, 40 forks

text-embedding-ada-002 for embeddings #20

Closed: Torhamilton closed this issue 1 year ago

Torhamilton commented 1 year ago

OpenAI says to use "text-embedding-ada-002" for all text embeddings. It's very cheap; gpt3.5/4 are 1000x more expensive. Suggested setting: tokenizer_model: str = "text-embedding-ada-002"

c0sogi commented 1 year ago

No change is needed. We already use the text-embedding-ada-002 model for every embedding. See the OpenAIEmbeddings class: its default is model: str = "text-embedding-ada-002".

        # No model argument is passed here, so the class default
        # ("text-embedding-ada-002") is used.
        embeddings: Embeddings = OpenAIEmbeddings(
            client=openai.Embedding,
            openai_api_key=openai_api_key,
        )
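
To illustrate why no model argument is needed in the snippet above, here is a minimal sketch built around a hypothetical stand-in class. SketchEmbeddings is an assumption made up for this example; it is not the real OpenAIEmbeddings class, it makes no API calls, and the key and the alternative model name are placeholders. It only shows how a field declared as model: str = "text-embedding-ada-002" supplies the value when the caller omits it:

```python
from dataclasses import dataclass


@dataclass
class SketchEmbeddings:
    """Hypothetical stand-in for an embeddings wrapper; not the real library class."""

    openai_api_key: str
    # Default used whenever the caller does not pass `model` explicitly.
    model: str = "text-embedding-ada-002"


# Constructed without a `model` argument, so the default applies.
default_embeddings = SketchEmbeddings(openai_api_key="sk-placeholder")

# An explicit argument overrides the default ("some-other-model" is a made-up name).
custom_embeddings = SketchEmbeddings(
    openai_api_key="sk-placeholder",
    model="some-other-model",
)

print(default_embeddings.model)
print(custom_embeddings.model)
```

Under these assumptions, the first object ends up with the ada-002 model even though the call never mentions it, which is the behavior described for the real class.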

Torhamilton commented 1 year ago

Okay. So you don't want to use ada-002 as the tokenizer?