microsoft / graphrag

A modular graph-based Retrieval-Augmented Generation (RAG) system
https://microsoft.github.io/graphrag/
MIT License

Use a local embedding model to run local search #478

Closed 812406210 closed 2 months ago

812406210 commented 3 months ago

Describe the issue

text_embedder = OpenAIEmbedding(
    api_key="bge-base-zh",
    api_base="http://10.31.4.80:9997/v1",
    api_type=OpenaiApiType.OpenAI,
    deployment_name=None,
    model="bge-base-zh",
    max_retries=20,
)


ValueError                                Traceback (most recent call last)
Cell In[88], line 2
      1 question = "Tell me about Dr. Jordan Hayes"
----> 2 result = await search_engine.asearch(question)
      3 print(result.response)

File ~/PrivateProject/graphrag/graphrag/query/structured_search/local_search/search.py:66, in LocalSearch.asearch(self, query, conversation_history, **kwargs)
     63 start_time = time.time()
     64 search_prompt = ""
---> 66 context_text, context_records = self.context_builder.build_context(
     67     query=query,
     68     conversation_history=conversation_history,
     69     **kwargs,
     70     **self.context_builder_params,
     71 )
     72 log.info("GENERATE ANSWER: %s. QUERY: %s", start_time, query)
     73 try:

File ~/PrivateProject/graphrag/graphrag/query/structured_search/local_search/mixed_context.py:139, in LocalSearchMixedContext.build_context(self, query, conversation_history, include_entity_names, exclude_entity_names, conversation_history_max_turns, conversation_history_user_turns_only, max_tokens, text_unit_prop, community_prop, top_k_mapped_entities, top_k_relationships, include_community_rank, include_entity_rank, rank_description, include_relationship_weight, relationship_ranking_attribute, return_candidate_context, use_community_summary, min_community_rank, community_context_name, column_delimiter, **kwargs)
    134 pre_user_questions = "\n".join(
    135     conversation_history.get_user_turns(conversation_history_max_turns)
    136 )
    137 query = f"{query}\n{pre_user_questions}"
--> 139 selected_entities = map_query_to_entities(
...
   2153 )
   2155 if k is not None and int(k) <= 0:
   2156     raise ValueError(f"Nearest-K must be > 0 but got {k}")

ValueError: Query vector size 768 does not match index column size 1536

Steps to reproduce

No response

GraphRAG Config Used

No response

Logs and screenshots

No response

Additional Information

KylinMountain commented 3 months ago

Did you configure two different embedding models, one for indexing and another for querying?
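The error above reports a 768-dimensional query vector against a 1536-dimensional index column. That is the symptom you would expect if the index was built with one embedding model and the query is embedded with a different one. Below is a minimal sketch of a pre-flight check that catches this before searching. The helper check_embedding_dim and the two stand-in embedders are hypothetical and not part of graphrag. The 768 and 1536 sizes are taken from the error message. In practice you would pass in whatever callable wraps your real embedding client.

```python
from typing import Callable, List, Sequence


def check_embedding_dim(
    embed: Callable[[str], Sequence[float]],
    index_dim: int,
    probe: str = "dimension probe",
) -> int:
    """Embed a short probe string and compare its length to the index dimension.

    Hypothetical helper for illustration; not a graphrag API.
    """
    query_dim = len(embed(probe))
    if query_dim != index_dim:
        raise ValueError(
            f"Query embedding has {query_dim} dimensions but the index expects "
            f"{index_dim}. Re-index with the query model, or query with the "
            f"model that was used for indexing."
        )
    return query_dim


# Stand-in embedders (assumptions): they only mimic the vector sizes
# reported in the error message and do not call any real model.
def fake_query_model(text: str) -> List[float]:
    return [0.0] * 768


def fake_index_model(text: str) -> List[float]:
    return [0.0] * 1536


if __name__ == "__main__":
    # Same model for indexing and querying: the check passes.
    print(check_embedding_dim(fake_index_model, index_dim=1536))
    # Different models: the check raises before any vector search runs.
    try:
        check_embedding_dim(fake_query_model, index_dim=1536)
    except ValueError as err:
        print(err)
```

If the check fails, the two usual options are to rebuild the index with the local model or to point the query-time embedder at the model that produced the index.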

karthik-codex commented 2 months ago

Local search with embeddings from Ollama now works. You can read the full guide here: https://medium.com/@karthik.codex/microsofts-graphrag-autogen-ollama-chainlit-fully-local-free-multi-agent-rag-superbot-61ad3759f06f Here is the link to the repo: https://github.com/karthik-codex/autogen_graphRAG

natoverse commented 2 months ago

Consolidating alternate model issues here: #657

blackwhites commented 2 months ago

I use DeepSeek as the LLM and Azure embeddings for search. Global search works, but local search reports ValueError: Query vector size 768 does not match index column size 1536, the same error as above.