Unstructured-IO / unstructured

Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
https://www.unstructured.io/
Apache License 2.0
8.19k stars, 663 forks

bug/Deprecation Warning in LangChain: OpenAIEmbeddings Deprecated #3378

Closed ron-unstructured closed 1 month ago

ron-unstructured commented 1 month ago

**Describe the bug**
When using the `OpenAIEmbeddingEncoder` from the `unstructured` package, a deprecation warning is issued:

LangChainDeprecationWarning: The class `langchain_community.embeddings.openai.OpenAIEmbeddings` was deprecated in langchain-community 0.0.9 and will be removed in 0.2.0. An updated version of the class exists in the langchain-openai package and should be used instead. To use it run `pip install -U langchain-openai` and import as `from langchain_openai import OpenAIEmbeddings`.
  warn_deprecated(

**To Reproduce**

  1. run this code:

    import os

    from unstructured.documents.elements import Text
    from unstructured.embed.openai import OpenAIEmbeddingConfig, OpenAIEmbeddingEncoder

    # Initialize the encoder with OpenAI credentials
    embedding_encoder = OpenAIEmbeddingEncoder(
        config=OpenAIEmbeddingConfig(api_key=os.environ["OPENAI_API_KEY"])
    )

    # Embed a list of Elements
    elements = embedding_encoder.embed_documents(
        elements=[Text("This is sentence 1"), Text("This is sentence 2")],
    )

    # Embed a single query string
    query = "This is the query"
    query_embedding = embedding_encoder.embed_query(query=query)

    # Print embeddings
    [print(e.embeddings, e) for e in elements]
    print(query_embedding, query)
    print(embedding_encoder.is_unit_vector(), embedding_encoder.num_of_dimensions())


  2. output:
LangChainDeprecationWarning: The class `langchain_community.embeddings.openai.OpenAIEmbeddings` was deprecated in langchain-community 0.0.9 and will be removed in 0.2.0. An updated version of the class exists in the langchain-openai package and should be used instead. To use it run `pip install -U langchain-openai` and import as `from langchain_openai import OpenAIEmbeddings`.
  warn_deprecated(
[0.01760913596156327, -0.007775292635773074, 0.003654911836163474, -0.00401548617215707, -0.014750765454346578, 0.009079915359210477, -0.014160735357713351, -0.0005392224344869793, -0.008745564971118313, -0.04507834036097138, 0.009138918368873799, 0.03513960124357545, -0.014134511176981276, 0.0033287561553041237, -0.003612298706627057, -0.006280548327318386, 0.03115361704191132, -0.01014852697531715, 0.015852156830707065, 0.00342873363245407, -0.00675257287028625, -0.002727252630354652, -0.01792381867976766, -0.006968917239051767, -0.014790100794122128, -0.003933537702845104, 0.003645078001219587, -0.02555488204680458, 0.010961457951560194, -0.001989714311071195, -0.006513282576210809, -0.02365367271588965, -0.006205155437528157, -0.005254550306409412,...]
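To confirm the warning programmatically rather than by reading the console, something like the following sketch may help. It relies only on the standard `warnings` module and assumes that `LangChainDeprecationWarning` is a subclass of `DeprecationWarning`, which has not been checked against the versions listed in this issue. `collect_deprecations` and `legacy_call` are hypothetical names; `legacy_call` stands in for the code path that constructs the encoder.

```python
import warnings
from typing import Callable, List


def collect_deprecations(fn: Callable[[], object]) -> List[str]:
    """Run fn and return the messages of any DeprecationWarning it emits."""
    with warnings.catch_warnings(record=True) as caught:
        # Ensure no existing filter hides or deduplicates the warning.
        warnings.simplefilter("always")
        fn()
    return [
        str(w.message)
        for w in caught
        if issubclass(w.category, DeprecationWarning)
    ]


def legacy_call() -> None:
    # Hypothetical stand-in for the code that instantiates the deprecated class.
    warnings.warn("OpenAIEmbeddings was deprecated", DeprecationWarning)


messages = collect_deprecations(legacy_call)
print(messages)
```

In a regression test, `legacy_call` would be replaced by the encoder setup from the reproduction steps, and the test would assert that the returned list is empty.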

**Expected behavior**
No deprecation warning should be displayed. The code should use the updated `OpenAIEmbeddings` class from the `langchain-openai` package as suggested.
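One possible shape for the fix, as a rough sketch only: prefer the import path named in the warning and fall back to the deprecated one when `langchain-openai` is not installed. The two module paths are taken from the warning text; `first_importable` and `CANDIDATES` are hypothetical names, and this is not necessarily how `unstructured` should or will implement it.

```python
import importlib
from typing import Any, Iterable, Tuple

# Preferred location first, deprecated location last.
# Both paths are copied from the deprecation warning.
CANDIDATES = (
    ("langchain_openai", "OpenAIEmbeddings"),
    ("langchain_community.embeddings.openai", "OpenAIEmbeddings"),
)


def first_importable(candidates: Iterable[Tuple[str, str]]) -> Any:
    """Return the first attribute that can be imported from the candidates."""
    tried = list(candidates)
    for module_name, attr in tried:
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            continue  # Package not installed; try the next location.
        if hasattr(module, attr):
            return getattr(module, attr)
    raise ImportError(f"none of {tried} could be imported")


# Usage (requires one of the two packages to be installed):
# OpenAIEmbeddings = first_importable(CANDIDATES)
```

The fallback keeps older environments working, at the cost of still emitting the warning there. Dropping the second candidate would make `langchain-openai` a hard requirement instead.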

**Environment Info**
- Unstructured: 0.14.11.dev3
- langchain-community: 0.0.29
- langchain-openai: 0.0.5
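
For anyone comparing against their own setup, the versions above can be collected with a small sketch like this one. It uses `importlib.metadata` from the standard library (Python 3.8+); `installed_versions` is a hypothetical helper name, and the distribution names are assumed to match the ones listed above.

```python
from importlib import metadata
from typing import Dict, Iterable, Optional


def installed_versions(packages: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each distribution name to its installed version, or None if absent."""
    versions: Dict[str, Optional[str]] = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


report = installed_versions(
    ["unstructured", "langchain-community", "langchain-openai"]
)
for name, version in report.items():
    print(f"- {name}: {version or 'not installed'}")
```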

ron-unstructured commented 1 month ago

@MthwRobinson FYI

MthwRobinson commented 1 month ago

Thanks @ron-unstructured - we'll pick this up ASAP