langchain-ai / langchain


RPC error: [create_index], <MilvusException: (code=1100, message=create index on 104 field is not supported: invalid parameter[expected=supported field][actual=create index on 104 field])>, <Time:{'RPC start': '2024-06-14 13:38:35.242645', 'RPC error': '2024-06-14 13:38:35.247294'}> #22901

Open eci-aashish opened 3 weeks ago

eci-aashish commented 3 weeks ago


Example Code

from pymilvus import (
    Collection,
    CollectionSchema,
    DataType,
    FieldSchema,
    WeightedRanker,
    connections,
)

from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import PromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_milvus.retrievers import MilvusCollectionHybridSearchRetriever
from langchain_milvus.utils.sparse import BM25SparseEmbedding

from langchain_openai import ChatOpenAI, OpenAIEmbeddings

import logging

logger = logging.getLogger("gunicorn.error")

texts = [
    "In 'The Whispering Walls' by Ava Moreno, a young journalist named Sophia uncovers a decades-old conspiracy hidden within the crumbling walls of an ancient mansion, where the whispers of the past threaten to destroy her own sanity.",
    "In 'The Last Refuge' by Ethan Blackwood, a group of survivors must band together to escape a post-apocalyptic wasteland, where the last remnants of humanity cling to life in a desperate bid for survival.",
    "In 'The Memory Thief' by Lila Rose, a charismatic thief with the ability to steal and manipulate memories is hired by a mysterious client to pull off a daring heist, but soon finds themselves trapped in a web of deceit and betrayal.",
    "In 'The City of Echoes' by Julian Saint Clair, a brilliant detective must navigate a labyrinthine metropolis where time is currency, and the rich can live forever, but at a terrible cost to the poor.",
    "In 'The Starlight Serenade' by Ruby Flynn, a shy astronomer discovers a mysterious melody emanating from a distant star, which leads her on a journey to uncover the secrets of the universe and her own heart.",
    "In 'The Shadow Weaver' by Piper Redding, a young orphan discovers she has the ability to weave powerful illusions, but soon finds herself at the center of a deadly game of cat and mouse between rival factions vying for control of the mystical arts.",
    "In 'The Lost Expedition' by Caspian Grey, a team of explorers ventures into the heart of the Amazon rainforest in search of a lost city, but soon finds themselves hunted by a ruthless treasure hunter and the treacherous jungle itself.",
    "In 'The Clockwork Kingdom' by Augusta Wynter, a brilliant inventor discovers a hidden world of clockwork machines and ancient magic, where a rebellion is brewing against the tyrannical ruler of the land.",
    "In 'The Phantom Pilgrim' by Rowan Welles, a charismatic smuggler is hired by a mysterious organization to transport a valuable artifact across a war-torn continent, but soon finds themselves pursued by deadly assassins and rival factions.",
    "In 'The Dreamwalker's Journey' by Lyra Snow, a young dreamwalker discovers she has the ability to enter people's dreams, but soon finds herself trapped in a surreal world of nightmares and illusions, where the boundaries between reality and fantasy blur.",
]

from langchain_openai import AzureOpenAIEmbeddings

dense_embedding_func: AzureOpenAIEmbeddings = AzureOpenAIEmbeddings(
    azure_deployment="****",
    openai_api_version="****",
    azure_endpoint="***",
    api_key="****",
)

dense_embedding_func = OpenAIEmbeddings()

dense_dim = len(dense_embedding_func.embed_query(texts[1]))

logger.info(f"DENSE DIM - {dense_dim}")

print("DENSE DIM")
print(dense_dim)

sparse_embedding_func = BM25SparseEmbedding(corpus=texts)
sparse_embedding = sparse_embedding_func.embed_query(texts[1])

print("SPARSE EMBEDDING")
print(sparse_embedding)

connections.connect(uri=CONNECTION_URI)

connections.connect(
    host="**",  # Replace with your Milvus server IP
    port="*",
    user="**",
    password="*****",
    db_name="*****",
)

print("CONNECTED")

pk_field = "doc_id"
dense_field = "dense_vector"
sparse_field = "sparse_vector"
text_field = "text"
fields = [
    FieldSchema(
        name=pk_field,
        dtype=DataType.VARCHAR,
        is_primary=True,
        auto_id=True,
        max_length=100,
    ),
    FieldSchema(name=dense_field, dtype=DataType.FLOAT_VECTOR, dim=dense_dim),
    FieldSchema(name=sparse_field, dtype=DataType.SPARSE_FLOAT_VECTOR),
    FieldSchema(name=text_field, dtype=DataType.VARCHAR, max_length=65_535),
]

schema = CollectionSchema(fields=fields, enable_dynamic_field=False)
collection = Collection(
    name="IntroductionToTheNovels", schema=schema, consistency_level="Strong"
)

print("SCHEMA CREATED")

dense_index = {"index_type": "FLAT", "metric_type": "IP"}
collection.create_index("dense_vector", dense_index)
sparse_index = {"index_type": "SPARSE_INVERTED_INDEX", "metric_type": "IP"}
collection.create_index("sparse_vector", sparse_index)

print("INDEX CREATED")
collection.flush()

print("FLUSHED")

entities = []
for text in texts:
    entity = {
        dense_field: dense_embedding_func.embed_documents([text])[0],
        sparse_field: sparse_embedding_func.embed_documents([text])[0],
        text_field: text,
    }
    entities.append(entity)

print("ENTITIES")
collection.insert(entities)
print("INSERTED")
collection.load()
print("LOADED")

sparse_search_params = {"metric_type": "IP"}
dense_search_params = {"metric_type": "IP", "params": {}}
retriever = MilvusCollectionHybridSearchRetriever(
    collection=collection,
    rerank=WeightedRanker(0.5, 0.5),
    anns_fields=[dense_field, sparse_field],
    field_embeddings=[dense_embedding_func, sparse_embedding_func],
    field_search_params=[dense_search_params, sparse_search_params],
    top_k=3,
    text_field=text_field,
)

print("RETRIEVER CREATED")

documents = retriever.invoke("What are the story about ventures?")

print(documents)

Error Message and Stack Trace (if applicable)

RPC error: [create_index], <MilvusException: (code=1100, message=create index on 104 field is not supported: invalid parameter[expected=supported field][actual=create index on 104 field])>, <Time:{'RPC start': '2024-06-14 13:38:35.242645', 'RPC error': '2024-06-14 13:38:35.247294'}>
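As a reading aid for the message above, the number in "create index on 104 field" appears to be a field data-type code. The sketch below assumes that reading and uses a small hand-written mirror of the vector members of `pymilvus.DataType`; the `VectorFieldType` enum and the `field_type_from_error` helper are hypothetical names written for this illustration, not part of pymilvus or langchain-milvus.

```python
import re
from enum import IntEnum
from typing import Optional


class VectorFieldType(IntEnum):
    """Hypothetical local mirror of the vector members of pymilvus.DataType."""

    BINARY_VECTOR = 100
    FLOAT_VECTOR = 101
    FLOAT16_VECTOR = 102
    BFLOAT16_VECTOR = 103
    SPARSE_FLOAT_VECTOR = 104


def field_type_from_error(message: str) -> Optional[VectorFieldType]:
    """Extract the numeric field type from a create_index error message."""
    match = re.search(r"create index on (\d+) field", message)
    if match is None:
        return None
    try:
        return VectorFieldType(int(match.group(1)))
    except ValueError:
        # The code is not one of the vector types mirrored above.
        return None


error = (
    "create index on 104 field is not supported: invalid parameter"
    "[expected=supported field][actual=create index on 104 field]"
)
print(field_type_from_error(error).name)  # SPARSE_FLOAT_VECTOR
```

Under that assumption, the server is rejecting the index on the `sparse_vector` field specifically, while the earlier `create_index` call on the dense field went through.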

Description

I am trying to use hybrid search on a Milvus database through the langchain-milvus library.

But when I create the index for the sparse vector field, it fails with this error:

RPC error: [create_index], <MilvusException: (code=1100, message=create index on 104 field is not supported: invalid parameter[expected=supported field][actual=create index on 104 field])>, <Time:{'RPC start': '2024-06-14 13:38:35.242645', 'RPC error': '2024-06-14 13:38:35.247294'}>

I have also tried creating the collection with MilvusClient, but that gives the same error.

We committed to implementing hybrid search after finding LangChain's documentation, but it fails with this error and we are now stuck midway, so please resolve it as soon as possible.
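One possible lead, which this report does not confirm: sparse vector fields were introduced in Milvus 2.4, so a server older than that would be expected to reject an index on a `SPARSE_FLOAT_VECTOR` field. The Milvus server version is not listed in the System Info below. A minimal sketch of a version check follows; `parse_version` and `supports_sparse_vectors` are hypothetical helpers written for this illustration, and the example version strings are made up.

```python
import re
from typing import Tuple

# Assumption: sparse vector support starts with Milvus 2.4.
MIN_SPARSE_VERSION = (2, 4)


def parse_version(version: str) -> Tuple[int, int]:
    """Return (major, minor) from strings such as 'v2.3.4' or '2.4.1'."""
    match = re.match(r"v?(\d+)\.(\d+)", version.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def supports_sparse_vectors(version: str) -> bool:
    """True if the given server version is at least MIN_SPARSE_VERSION."""
    return parse_version(version) >= MIN_SPARSE_VERSION


# Illustrative inputs only, not values taken from this report.
for candidate in ("v2.3.4", "v2.4.1"):
    print(candidate, supports_sparse_vectors(candidate))
```

With an open connection, the version string could be obtained from `pymilvus.utility.get_server_version()` and passed to `supports_sparse_vectors` before creating the sparse index.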

System Info

pip freeze | grep langchain -

langchain-core==0.2.6
langchain-milvus==0.1.1
langchain-openai==0.1.8


Platform - linux


python version - 3.11.7


python -m langchain_core.sys_info

System Information

OS: Linux
OS Version: #73~20.04.1-Ubuntu SMP Mon May 6 09:43:44 UTC 2024
Python Version: 3.11.7 (main, Dec 8 2023, 18:56:57) [GCC 9.4.0]

Package Information

langchain_core: 0.2.6
langsmith: 0.1.77
langchain_milvus: 0.1.1
langchain_openai: 0.1.8

Packages not installed (Not Necessarily a Problem)

The following packages were not found:

langgraph
langserve

eci-aashish commented 2 weeks ago

Any update on this?