Philipzhong1980 commented 10 months ago

System Info

Milvus standalone: 2.3.2, LangChain: 0.0.325, OpenAI: 0.28.1, pymilvus: 2.3.2, Pydantic: 2.4.2

Who can help?

No response

Information

[ ] The official example notebooks/scripts
[X] My own modified scripts

Related Components

[X] LLMs/Chat Models
[X] Embedding Models
[X] Prompts / Prompt Templates / Prompt Selectors
[ ] Output Parsers
[ ] Document Loaders
[X] Vector Stores / Retrievers
[ ] Memory
[ ] Agents / Agent Executors
[ ] Tools / Toolkits
[ ] Chains
[ ] Callbacks/Tracing
[ ] Async

Reproduction

This is my test code:

import os
import openai
import langchain
from langchain.chains import RetrievalQAWithSourcesChain
from langchain.chat_models import AzureChatOpenAI
from langchain.embeddings import OpenAIEmbeddings
from langchain.prompts import SystemMessagePromptTemplate, HumanMessagePromptTemplate, ChatPromptTemplate
from langchain.text_splitter import CharacterTextSplitter, RecursiveCharacterTextSplitter
from langchain.vectorstores import Milvus
from langchain.document_loaders import WebBaseLoader

langchain.debug = True

MILVUS_HOST = "10.224.122.39"
MILVUS_PORT = "19530"

# OpenAI parameter settings
api_base = "https://model-serving.us-east-2.int.infra.intelligence.webex.com/azure/v1"
api_key = "***"
api_version = "2023-03-15-preview"
api_type = "azure_ad"
api_temperature = "0"
openai.api_type = api_type
openai.api_version = api_version
openai.api_base = api_base
openai.api_key = api_key
os.environ["OPENAI_API_KEY"] = api_key
os.environ["OPENAI_API_BASE"] = api_base
os.environ["TOKENIZERS_PARALLELISM"] = "false"

def insert_vector2DB():
    text_splitter_chunk_size = 2000
    text_splitter_chunk_overlap = 200

    loader = WebBaseLoader([
        "https://milvus.io/docs/overview.md",
        "https://help.webex.com/en-us/article/nk1shtj/Get-started-with-Webex-Calling",
        "https://help.webex.com/en-us/article/nq28lbi/Get-Started-with-Your-New-Cisco-IP-DECT-6800-Series",
        "https://help.webex.com/en-us/article/n62735y/Webex-%7C-Record-a-meeting-or-webinar",
    ])

    documents = loader.load()
    for doc in documents:
        doc.page_content = doc.page_content
        doc.metadata = {"url": doc.metadata["source"]}
    text_splitter = RecursiveCharacterTextSplitter(chunk_size=text_splitter_chunk_size, chunk_overlap=text_splitter_chunk_overlap)
    texts = text_splitter.split_documents(documents)

    # split documents
    #text_splitter = SentenceTransformersTokenTextSplitter(chunk_size=2000, chunk_overlap=200)
    #text_splitter = CharacterTextSplitter(chunk_size=2000, chunk_overlap= 200)
    #docs = text_splitter.split_documents(documents)
    #print(documents)

    # embeddings with OpenAI
    embeddings = OpenAIEmbeddings(
        deployment="gpt-35-turbo",
        openai_api_key=openai.api_key,
        openai_api_base=openai.api_base,
        openai_api_version=openai.api_version,
        openai_api_type=openai.api_type,
        chunk_size=16)
    # call the OpenAI wrapper class; this also works with open-source embeddings
    #embeddings = SentenceTransformerEmbeddings(model_name="all-MiniLM-L12-v2")
    #print(embeddings)

    # insert vectors into Milvus
    Milvus.from_documents(
        texts,
        embedding=embeddings,
        collection_name='milvus_help',
        drop_old=True,
        connection_args={"host": MILVUS_HOST, "port": MILVUS_PORT}
    )

def query(question):

    # 1. connect to the vector database
    #embeddings = SentenceTransformerEmbeddings(model_name="all-MiniLM-L12-v2")
    embeddings = OpenAIEmbeddings(
        deployment="gpt-35-turbo",
        openai_api_key=openai.api_key,
        openai_api_base=openai.api_base,
        openai_api_version=openai.api_version,
        openai_api_type=openai.api_type,
        chunk_size=8)

    vector_db = Milvus(
        embeddings,
        connection_args={"host": MILVUS_HOST, "port": MILVUS_PORT},
        collection_name='milvus_help'
    )
    # 2. search the data
    # top 20 similar vectors
    retriever = vector_db.as_retriever(search_type="mmr", search_kwargs={"k": 20}).get_relevant_documents(question)

    # pass the retrieved context to the LLM

    # Configure system prompt
    system_template = """Use the following pieces of context to answer the users question.
If you don't know the answer, just say that you don't know, don't try to make up an answer.
ALWAYS return a "SOURCES" part in your answer.
The "SOURCES" part should be a reference to the source of the document from which you got your answer.

Example of your response should be:

---

The answer is foo
SOURCES: xyz

---

Begin!
----------------
{summaries}"""
    messages = [
        SystemMessagePromptTemplate.from_template(system_template),
        HumanMessagePromptTemplate.from_template("{question}"),
    ]
    prompt = ChatPromptTemplate.from_messages(messages)
    chain_type_kwargs = {"prompt": prompt}

    # Create an AzureChatOpenAI llm
    llm = AzureChatOpenAI(
        temperature=api_temperature,
        openai_api_key=openai.api_key,
        openai_api_base=openai.api_base,
        openai_api_version=openai.api_version,
        openai_api_type=openai.api_type,
        deployment_name="gpt-35-turbo")

    # Create a chain that uses the Milvus vector store
    qa_chain = RetrievalQAWithSourcesChain.from_chain_type(
        llm=llm,
        chain_type="map_reduce",
        retriever=vector_db.as_retriever(),
        return_source_documents=True,
        chain_type_kwargs=chain_type_kwargs
    )
    print(qa_chain)
    #response = qa_chain.run({"question": question})

    result = qa_chain(
        {
            "question": question,
        },
        return_only_outputs=True,
    )
    response = f"Answer:\t{result['answer']}\n\nSources:\t{result['sources']}\n"

    print(response)

def __test_query():
    query("What is IP DECT 6800?")
    query("What is Webex Calling?")
    query("What is Milvus?")

def __test_init_db():
    insert_vector2DB()

if __name__ == '__main__':
    __test_init_db()
    __test_query()

Expected behavior

I expected it to work, but it raises an error:

Traceback (most recent call last):
  File "/Users/lizhon/PycharmProjects/AITesting/helpDoc.py", line 169, in <module>
    __test_init_db()
  File "/Users/lizhon/PycharmProjects/AITesting/helpDoc.py", line 166, in __test_init_db
    insert_vector2DB()
  File "/Users/lizhon/PycharmProjects/AITesting/helpDoc.py", line 70, in insert_vector2DB
    Milvus.from_documents(
  File "/Users/lizhon/PycharmProjects/AITesting/venv/lib/python3.8/site-packages/langchain/schema/vectorstore.py", line 438, in from_documents
    return cls.from_texts(texts, embedding, metadatas=metadatas, **kwargs)
  File "/Users/lizhon/PycharmProjects/AITesting/venv/lib/python3.8/site-packages/langchain/vectorstores/milvus.py", line 827, in from_texts
    vector_db.add_texts(texts=texts, metadatas=metadatas)
  File "/Users/lizhon/PycharmProjects/AITesting/venv/lib/python3.8/site-packages/langchain/vectorstores/milvus.py", line 427, in add_texts
    embeddings = self.embedding_func.embed_documents(texts)
  File "/Users/lizhon/PycharmProjects/AITesting/venv/lib/python3.8/site-packages/langchain/embeddings/openai.py", line 490, in embed_documents
    return self._get_len_safe_embeddings(texts, engine=self.deployment)
  File "/Users/lizhon/PycharmProjects/AITesting/venv/lib/python3.8/site-packages/langchain/embeddings/openai.py", line 374, in _get_len_safe_embeddings
    response = embed_with_retry(
  File "/Users/lizhon/PycharmProjects/AITesting/venv/lib/python3.8/site-packages/langchain/embeddings/openai.py", line 107, in embed_with_retry
    return _embed_with_retry(**kwargs)
  File "/Users/lizhon/PycharmProjects/AITesting/venv/lib/python3.8/site-packages/tenacity/__init__.py", line 289, in wrapped_f
    return self(f, *args, **kw)
  File "/Users/lizhon/PycharmProjects/AITesting/venv/lib/python3.8/site-packages/tenacity/__init__.py", line 379, in __call__
    do = self.iter(retry_state=retry_state)
  File "/Users/lizhon/PycharmProjects/AITesting/venv/lib/python3.8/site-packages/tenacity/__init__.py", line 314, in iter
    return fut.result()
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/concurrent/futures/_base.py", line 437, in result
    return self.__get_result()
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/concurrent/futures/_base.py", line 389, in __get_result
    raise self._exception
  File "/Users/lizhon/PycharmProjects/AITesting/venv/lib/python3.8/site-packages/tenacity/__init__.py", line 382, in __call__
    result = fn(*args, **kwargs)
  File "/Users/lizhon/PycharmProjects/AITesting/venv/lib/python3.8/site-packages/langchain/embeddings/openai.py", line 104, in _embed_with_retry
    response = embeddings.client.create(**kwargs)
  File "/Users/lizhon/PycharmProjects/AITesting/venv/lib/python3.8/site-packages/openai/api_resources/embedding.py", line 33, in create
    response = super().create(*args, **kwargs)
  File "/Users/lizhon/PycharmProjects/AITesting/venv/lib/python3.8/site-packages/openai/api_resources/abstract/engine_api_resource.py", line 155, in create
    response, _, api_key = requestor.request(
  File "/Users/lizhon/PycharmProjects/AITesting/venv/lib/python3.8/site-packages/openai/api_requestor.py", line 299, in request
    resp, got_stream = self._interpret_response(result, stream)
  File "/Users/lizhon/PycharmProjects/AITesting/venv/lib/python3.8/site-packages/openai/api_requestor.py", line 710, in _interpret_response
    self._interpret_response_line(
  File "/Users/lizhon/PycharmProjects/AITesting/venv/lib/python3.8/site-packages/openai/api_requestor.py", line 775, in _interpret_response_line
    raise self.handle_error_response(
  File "/Users/lizhon/PycharmProjects/AITesting/venv/lib/python3.8/site-packages/openai/api_requestor.py", line 428, in handle_error_response
    error_code=error_data.get("code"),
AttributeError: 'str' object has no attribute 'get'

Process finished with exit code 1

dosubot[bot] commented 10 months ago

🤖

Hello @Philipzhong1980,

From the traceback you provided, the error originates in the OpenAI API request, specifically while the API response is being interpreted. The message AttributeError: 'str' object has no attribute 'get' indicates that the error payload in the response is a string, whereas the code expects a dictionary.

This could be due to a few reasons:

1. The API request to OpenAI is not successful, and the response is an error message in string format. You can check the status of the API request by printing the response before it's interpreted.
2. The OpenAI API has changed its response format. You can check the OpenAI API documentation to confirm the expected response format.
3. There might be a bug in LangChain's OpenAI embedding code that incorrectly handles the API response.
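
To illustrate the first possibility, here is a minimal sketch of how this AttributeError can arise and how an error body can be read defensively. It assumes the endpoint returned an error whose "error" field is a plain string; the actual payload is not shown in the issue, so the sample bodies and the extract_error helper are hypothetical.

```python
import json


def extract_error(body: str) -> dict:
    """Normalize an HTTP error body into a dict with 'code' and 'message'.

    Hypothetical helper for illustration; it is not part of openai or LangChain.
    """
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        # Not JSON at all (for example an HTML error page from a proxy).
        return {"code": None, "message": body}

    error = payload.get("error", payload) if isinstance(payload, dict) else payload
    if isinstance(error, dict):
        # The shape the client code expects: {"error": {"code": ..., "message": ...}}
        return {"code": error.get("code"), "message": error.get("message")}
    # The shape that triggers the AttributeError: {"error": "some text"}
    return {"code": None, "message": str(error)}


# Assumed sample body: the "error" field is a string, not a dict.
sample_body = '{"error": "Unauthorized"}'

try:
    # Mirrors the failing pattern from the traceback: calling .get() on a string.
    json.loads(sample_body)["error"].get("code")
except AttributeError as exc:
    print(exc)

print(extract_error(sample_body))
```

If the real response has this shape, the underlying failure is whatever the string says (for example an authentication or routing problem), and the AttributeError only hides it.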

To further investigate this issue, you could try the following:

1. Check the status of your OpenAI API key and ensure it's valid and has the necessary permissions.
2. Try making a simple API request to OpenAI outside of LangChain to see if it works as expected.
3. Update your LangChain and OpenAI packages to the latest versions, as the issue might have been fixed in a newer version.
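
For the second step, a request outside LangChain could look roughly like the sketch below. It uses only the Python standard library so that the raw status code and body are visible. The URL layout follows the usual Azure OpenAI REST convention and the Bearer header matches api_type "azure_ad"; both are assumptions about your gateway, and the deployment name in the usage note is a placeholder, not a value from your script.

```python
import json
import urllib.error
import urllib.request


def build_embedding_request(api_base, deployment, api_version, token, text):
    """Build (but do not send) an embeddings request in the Azure OpenAI layout.

    Assumption: the gateway accepts
    {api_base}/openai/deployments/{deployment}/embeddings?api-version=...
    """
    url = (
        f"{api_base.rstrip('/')}/openai/deployments/{deployment}"
        f"/embeddings?api-version={api_version}"
    )
    headers = {
        "Content-Type": "application/json",
        # Assumption: azure_ad style authentication with a bearer token.
        "Authorization": f"Bearer {token}",
    }
    data = json.dumps({"input": text}).encode("utf-8")
    return urllib.request.Request(url, data=data, headers=headers, method="POST")


def send(request, timeout=30):
    """Send the request and return (status, raw body) without interpreting it."""
    try:
        with urllib.request.urlopen(request, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        # Error responses still carry a body; that body is what we want to see.
        return err.code, err.read().decode("utf-8", errors="replace")
```

Calling send(build_embedding_request(api_base, "<your-embedding-deployment>", api_version, api_key, "hello")) and printing the result shows the status code and the unparsed body, which should make it clear whether the endpoint is returning a string error.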

I hope this helps! If you have any more information or if the issue persists, please feel free to share.

Best, Dosu

dosubot[bot] commented 7 months ago

Hi, @Philipzhong1980,

I'm helping the LangChain team manage their backlog and am marking this issue as stale. From what I understand, you encountered an error related to an AttributeError when using Langchain, OpenAI, and Milvus. Dosubot provided a detailed response with steps to investigate the issue further, including checking the API request status, updating packages, and making a simple API request outside of LangChain.

Could you please confirm if this issue is still relevant to the latest version of the LangChain repository? If it is, kindly let the LangChain team know by commenting on the issue. Otherwise, feel free to close the issue yourself, or it will be automatically closed in 7 days. Thank you!

langchain-ai / langchain

couldn't work from langchain,OpenAI and Milvus #12541