run-llama / llama_index

LlamaIndex is a data framework for your LLM applications
https://docs.llamaindex.ai
MIT License
35.36k stars 4.98k forks

[Question]: Using llama-index, calling KnowledgeGraphIndex.from_documents() raises an error #15033

Closed GOOD-N-LCM closed 1 month ago

GOOD-N-LCM commented 1 month ago

Question Validation

Question

When calling KnowledgeGraphIndex.from_documents() with llama-index, an error is raised. The output is:

Traceback (most recent call last):
  File "D:\code_llm\LLama-Index\lcm3.py", line 56, in <module>
    kg_index = KnowledgeGraphIndex.from_documents(
  File "C:\Users\gogoLcm.conda\envs\llama-index\Lib\site-packages\llama_index\core\indices\base.py", line 145, in from_documents
    return cls(
  File "C:\Users\gogoLcm.conda\envs\llama-index\Lib\site-packages\llama_index\core\indices\knowledge_graph\base.py", line 109, in __init__
    super().__init__(
  File "C:\Users\gogoLcm.conda\envs\llama-index\Lib\site-packages\llama_index\core\indices\base.py", line 94, in __init__
    index_struct = self.build_index_from_nodes(
  File "C:\Users\gogoLcm.conda\envs\llama-index\Lib\site-packages\llama_index\core\indices\base.py", line 216, in build_index_from_nodes
    return self._build_index_from_nodes(nodes, **build_kwargs)
  File "C:\Users\gogoLcm.conda\envs\llama-index\Lib\site-packages\llama_index\core\indices\knowledge_graph\base.py", line 215, in _build_index_from_nodes
    triplets = self._extract_triplets(
  File "C:\Users\gogoLcm.conda\envs\llama-index\Lib\site-packages\llama_index\core\indices\knowledge_graph\base.py", line 159, in _extract_triplets
    return self._llm_extract_triplets(text)
  File "C:\Users\gogoLcm.conda\envs\llama-index\Lib\site-packages\llama_index\core\indices\knowledge_graph\base.py", line 163, in _llm_extract_triplets
    response = self._llm.predict(
  File "C:\Users\gogoLcm.conda\envs\llama-index\Lib\site-packages\llama_index\core\instrumentation\dispatcher.py", line 230, in wrapper
    result = func(*args, **kwargs)
  File "C:\Users\gogoLcm.conda\envs\llama-index\Lib\site-packages\llama_index\core\llms\llm.py", line 559, in predict
    chat_response = self.chat(messages)
  File "C:\Users\gogoLcm.conda\envs\llama-index\Lib\site-packages\llama_index\core\instrumentation\dispatcher.py", line 230, in wrapper
    result = func(*args, **kwargs)
  File "C:\Users\gogoLcm.conda\envs\llama-index\Lib\site-packages\llama_index\core\llms\callbacks.py", line 172, in wrapped_llm_chat
    f_return_val = f(_self, messages, **kwargs)
  File "C:\Users\gogoLcm.conda\envs\llama-index\Lib\site-packages\llama_index\llms\ollama\base.py", line 261, in chat
    response = self.client.chat(
  File "C:\Users\gogoLcm.conda\envs\llama-index\Lib\site-packages\ollama\_client.py", line 235, in chat
    return self._request_stream(
  File "C:\Users\gogoLcm.conda\envs\llama-index\Lib\site-packages\ollama\_client.py", line 98, in _request_stream
    return self._stream(*args, **kwargs) if stream else self._request(*args, **kwargs).json()
  File "C:\Users\gogoLcm.conda\envs\llama-index\Lib\site-packages\ollama\_client.py", line 74, in _request
    raise ResponseError(e.response.text, e.response.status_code) from None
ollama._types.ResponseError

Exception ignored in: <function NebulaGraphStore.__del__ at 0x000002D6D22E2840>
Traceback (most recent call last):
  File "C:\Users\gogoLcm.conda\envs\llama-index\Lib\site-packages\llama_index\graph_stores\nebula\nebula_graph_store.py", line 242, in __del__
  File "C:\Users\gogoLcm.conda\envs\llama-index\Lib\site-packages\nebula3\gclient\net\SessionPool.py", line 326, in close
  File "C:\Users\gogoLcm.conda\envs\llama-index\Lib\site-packages\nebula3\gclient\net\Session.py", line 351, in _sign_out
  File "C:\Users\gogoLcm.conda\envs\llama-index\Lib\site-packages\nebula3\gclient\net\Connection.py", line 262, in signout
  File "C:\Users\gogoLcm.conda\envs\llama-index\Lib\site-packages\nebula3\graph\GraphService.py", line 1603, in signout
  File "C:\Users\gogoLcm.conda\envs\llama-index\Lib\site-packages\nebula3\graph\GraphService.py", line 1606, in send_signout
AttributeError: 'NoneType' object has no attribute 'CALL'

My code:

documents = SimpleDirectoryReader("data").load_data()

graph_store = NebulaGraphStore(
    space_name=space_name,
    edge_types=edge_types,
    rel_prop_names=rel_prop_names,
    tags=tags,
)

storage_context = StorageContext.from_defaults(graph_store=graph_store)

kg_index = KnowledgeGraphIndex.from_documents(
    documents,
    storage_context=storage_context,
    max_triplets_per_chunk=10,
    include_embeddings=True,
)
print('>>> kg_index : ', kg_index)
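Since the traceback ends in ollama._types.ResponseError (the HTTP request to the Ollama server failed), a quick connectivity check before building the index can rule out a connection problem. This is a minimal sketch, assuming the default http://localhost:11434 endpoint; adjust the URL to wherever your server runs:

```python
import urllib.request
import urllib.error

def ollama_reachable(base_url: str = "http://localhost:11434") -> bool:
    """Return True if an HTTP GET against the Ollama root endpoint succeeds.

    Ollama answers its root URL with a plain "Ollama is running" page,
    so a 200 response is enough to confirm the server is up.
    """
    try:
        with urllib.request.urlopen(base_url, timeout=5) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False
```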

dosubot[bot] commented 1 month ago

Hey @GOOD-N-LCM! I'm here to assist you with any bugs, questions, or contributions you have regarding the llama-index repository. I'm currently awaiting a human maintainer, but I'll do my best to help you out in the meantime.

To resolve the ollama._types.ResponseError raised when calling KnowledgeGraphIndex.from_documents(), make sure the parameters and configuration required by the KnowledgeGraphIndex class are set up correctly, including providing an appropriate service context and storage context.

Here is an example of setting up and using KnowledgeGraphIndex:

import os
from llama_index.core.schema import Document, BaseNode
from llama_index.core.indices.knowledge_graph.base import KnowledgeGraphIndex
from llama_index.core.service_context import ServiceContext
from llama_index.core.storage.storage_context import StorageContext

# Create the document list
documents = [
    Document(text="Document content 1", metadata={"title": "Doc 1"}),
    Document(text="Document content 2", metadata={"title": "Doc 2"})
]

# Convert the documents to nodes
nodes = [BaseNode.from_document(doc) for doc in documents]

# Set up the service context and storage context
service_context = ServiceContext()
storage_context = StorageContext()

# Initialize the KnowledgeGraphIndex
kg_index = KnowledgeGraphIndex(
    nodes=nodes,
    service_context=service_context,
    storage_context=storage_context,
    show_progress=True
)

# Build the index from the nodes
kg_index.build_index_from_nodes(nodes)

# The index can now be used for further operations

Make sure that ServiceContext and StorageContext are set up correctly and contain all the necessary configuration. If these contexts are missing or misconfigured, a ResponseError can occur [1][2][3].

To continue talking to Dosu, mention @dosu.

GOOD-N-LCM commented 1 month ago
service_context = ServiceContext()

Is the service_context = ServiceContext() step required? Using llama-index with Nebula, I didn't find this parameter anywhere in the Nebula community code.

dosubot[bot] commented 1 month ago

Yes, the service_context = ServiceContext() step is necessary when using llama-index with Nebula. It initializes the required components for the KnowledgeGraphIndex to function correctly.

Here is an example of how to set up the ServiceContext:

from llama_index.core.service_context_elements.llm_predictor import LLMPredictor
from llama_index.core.base.embeddings.base import BaseEmbedding
from llama_index.core.indices.prompt_helper import PromptHelper
from llama_index.core.node_parser.interface import NodeParser
from llama_index.core.service_context_elements.llama_logger import LlamaLogger
from llama_index.core.callbacks.base import CallbackManager
from llama_index.core.schema import TransformComponent
from llama_index.core.service_context import ServiceContext

# Initialize the components
llm_predictor = LLMPredictor()
embed_model = BaseEmbedding()
prompt_helper = PromptHelper()
node_parser = NodeParser()
llama_logger = LlamaLogger()
callback_manager = CallbackManager()

# Create the ServiceContext
service_context = ServiceContext(
    llm_predictor=llm_predictor,
    prompt_helper=prompt_helper,
    embed_model=embed_model,
    transformations=[node_parser],
    llama_logger=llama_logger,
    callback_manager=callback_manager
)

# Set the global service context
set_global_service_context(service_context)

This setup ensures that all necessary components are initialized and available for the KnowledgeGraphIndex.from_documents() method to work correctly [1][2].


GOOD-N-LCM commented 1 month ago

OK, that was my misreading of the docs; the service_context is indeed required. After adding your code to mine, I still couldn't instantiate the NodeParser class. What is the reason for this?

Traceback (most recent call last):
  File "D:\code_llm\LLama-Index\lcm3.py", line 68, in <module>
    node_parser = NodeParser()
TypeError: Can't instantiate abstract class NodeParser with abstract method _parse_nodes

dosubot[bot] commented 1 month ago

The error you're encountering is because NodeParser is an abstract class and cannot be instantiated directly. You need to use one of its concrete implementations, such as HTMLNodeParser, JSONNodeParser, MarkdownNodeParser, or SimpleFileNodeParser.

Here is an example of how to instantiate a SimpleFileNodeParser:

from llama_index.core.node_parser.file.simple_file import SimpleFileNodeParser

# Instantiate the SimpleFileNodeParser
node_parser = SimpleFileNodeParser.from_defaults()

# Now you can use this node_parser in your ServiceContext
service_context = ServiceContext(
    llm_predictor=llm_predictor,
    prompt_helper=prompt_helper,
    embed_model=embed_model,
    transformations=[node_parser],
    llama_logger=llama_logger,
    callback_manager=callback_manager
)

# Set the global service context
set_global_service_context(service_context)

Replace SimpleFileNodeParser with any other specific NodeParser implementation if needed, such as HTMLNodeParser, JSONNodeParser, or MarkdownNodeParser [1].
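The TypeError here is standard Python behaviour for abstract base classes. A minimal self-contained sketch (the parser classes below are illustrative stand-ins, not llama-index's real ones) shows why instantiation fails and how a concrete subclass fixes it:

```python
from abc import ABC, abstractmethod

# Illustrative stand-in for llama-index's abstract NodeParser.
class AbstractParser(ABC):
    @abstractmethod
    def _parse_nodes(self, documents):
        """Subclasses must implement node parsing."""

# Concrete subclass: implements the abstract method, so it can be instantiated.
class WhitespaceParser(AbstractParser):
    def _parse_nodes(self, documents):
        return [d.strip() for d in documents]

try:
    AbstractParser()  # raises TypeError, just like NodeParser() in the question
except TypeError as exc:
    print("expected:", exc)

parser = WhitespaceParser()
print(parser._parse_nodes(["  hello ", "world  "]))
```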


GOOD-N-LCM commented 1 month ago

Sorry, following your method I still get the same error T_T

dosubot[bot] commented 1 month ago

The error persists because NodeParser is an abstract class and cannot be instantiated directly. You need to use one of its concrete implementations. Here is an example using SimpleFileNodeParser:

from llama_index.core.node_parser.file.simple_file import SimpleFileNodeParser
from llama_index.core.service_context import ServiceContext
from llama_index.core.service_context_elements.llm_predictor import LLMPredictor
from llama_index.core.base.embeddings.base import BaseEmbedding
from llama_index.core.indices.prompt_helper import PromptHelper
from llama_index.core.service_context_elements.llama_logger import LlamaLogger
from llama_index.core.callbacks.base import CallbackManager

# Instantiate the SimpleFileNodeParser
node_parser = SimpleFileNodeParser.from_defaults()

# Initialize other components
llm_predictor = LLMPredictor()
embed_model = BaseEmbedding()
prompt_helper = PromptHelper()
llama_logger = LlamaLogger()
callback_manager = CallbackManager()

# Create the ServiceContext
service_context = ServiceContext(
    llm_predictor=llm_predictor,
    prompt_helper=prompt_helper,
    embed_model=embed_model,
    transformations=[node_parser],
    llama_logger=llama_logger,
    callback_manager=callback_manager
)

# Set the global service context
set_global_service_context(service_context)

Ensure you replace SimpleFileNodeParser with any other specific NodeParser implementation if needed, such as HTMLNodeParser, JSONNodeParser, or MarkdownNodeParser [1][2].


GOOD-N-LCM commented 1 month ago

Thanks for your help. I've solved the problems above, but now it errors when I run it. Does this mean Nebula won't let me write? I performed the write with the setup edge_types, rel_prop_names = ["type"], ["description"] and tags = ["1"].

It fails with:

Query failed. Query: INSERT VERTEX 1(name) VALUES "Philz":("Philz");INSERT VERTEX 1(name) VALUES "Berkeley":("Berkeley");INSERT EDGE type(description) VALUES "Philz"->"Berkeley"@-4425215382545939561:("Founded in");, Param: {}
Error message: SemanticError: No schema found for `1'

Do I have to create the Nebula schema first for the write to succeed?

dosubot[bot] commented 1 month ago

Yes, you need to predefine the schema in Nebula before writing data. To avoid the error "SemanticError: No schema found for `1'", you should ensure that the tags and edge types are defined in the NebulaGraph space.

Here is how you can set up the schema:

  1. Define Tags and Properties: Use the CREATE TAG statement to define tags and their properties.

    CREATE TAG `1`(name string);
  2. Define Edge Types and Properties: Use the CREATE EDGE statement to define edge types and their properties.

    CREATE EDGE `type`(description string);
  3. Refresh the Schema: Use the refresh_schema method to refresh the schema in your NebulaGraph store.

    def refresh_schema(self) -> None:
       """
       Refreshes the NebulaGraph Store Schema.
       """
       tags_schema, edge_types_schema, relationships = [], [], []
       for tag in self.execute("SHOW TAGS").column_values("Name"):
           tag_name = tag.cast()
           tag_schema = {"tag": tag_name, "properties": []}
           r = self.execute(f"DESCRIBE TAG `{tag_name}`")
           props, types, comments = (
               r.column_values("Field"),
               r.column_values("Type"),
               r.column_values("Comment"),
           )
           for i in range(r.row_size()):
               property_defination = (
                   (props[i].cast(), types[i].cast())
                   if comments[i].is_empty()
                   else (props[i].cast(), types[i].cast(), comments[i].cast())
               )
               tag_schema["properties"].append(property_defination)
           tags_schema.append(tag_schema)
       for edge_type in self.execute("SHOW EDGES").column_values("Name"):
           edge_type_name = edge_type.cast()
           edge_schema = {"edge": edge_type_name, "properties": []}
           r = self.execute(f"DESCRIBE EDGE `{edge_type_name}`")
           props, types, comments = (
               r.column_values("Field"),
               r.column_values("Type"),
               r.column_values("Comment"),
           )
           for i in range(r.row_size()):
               property_defination = (
                   (props[i].cast(), types[i].cast())
                   if comments[i].is_empty()
                   else (props[i].cast(), types[i].cast(), comments[i].cast())
               )
               edge_schema["properties"].append(property_defination)
           edge_types_schema.append(edge_schema)

By following these steps, you can ensure that the schema is correctly set up in NebulaGraph, and you can avoid the "SemanticError: No schema found for `1'" error [1][2][3].
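The two DDL statements above can also be generated from the same edge_types, rel_prop_names, and tags lists passed to NebulaGraphStore. The helper below is a hypothetical sketch (not part of llama-index); it only builds the nGQL strings, which you would then execute through a NebulaGraph session, and it assumes each tag needs a single `name` string property:

```python
def schema_statements(tags, edge_types, rel_prop_names):
    """Build nGQL DDL for the tags and edge types used by the KG index.

    Assumes every tag carries one `name` string property, and every edge
    type stores the corresponding string property from rel_prop_names.
    """
    stmts = [f"CREATE TAG IF NOT EXISTS `{tag}`(name string);" for tag in tags]
    stmts += [
        f"CREATE EDGE IF NOT EXISTS `{edge}`(`{prop}` string);"
        for edge, prop in zip(edge_types, rel_prop_names)
    ]
    return stmts

# Matching the configuration from the question:
for stmt in schema_statements(["1"], ["type"], ["description"]):
    print(stmt)
```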


GOOD-N-LCM commented 1 month ago

OHHHHH! Thanks!!!!!!!!!!!!! Thank you so much for all this help! The code now runs completely!!

dosubot[bot] commented 1 month ago

Glad to hear the code is running now! If you have any more questions or need further help in the future, feel free to reach out. Happy coding!


logan-markewich commented 1 month ago

I'm glad dosu could help!