run-llama / llama_index

LlamaIndex is a data framework for your LLM applications
https://docs.llamaindex.ai
MIT License

[Feature Request]: support for `neo4j+s://` protocol to directly connect aura-hosted neo4j databases in `Neo4jPropertyGraphStore.url` #14225

Open rawwerks opened 2 weeks ago

rawwerks commented 2 weeks ago

Feature Description

support for the `neo4j+s://` protocol in `Neo4jPropertyGraphStore.url`, to connect directly to aura-hosted neo4j databases

Reason

in the neo4j console, users are given an instance url like this: neo4j+s://myhash.databases.neo4j.io

however, the llama_index source code only provides localhost examples: https://github.com/run-llama/llama_index/blob/8e46df7baf8440ac14f4be2ce75c3872584addff/llama-index-integrations/graph_stores/llama-index-graph-stores-neo4j/llama_index/graph_stores/neo4j/neo4j_property_graph.py#L111

and the guest blog post from neo4j shows a bolt:// url, which i have no idea how to find for my aura-hosted neo4j instance. https://github.com/tomasonjo/blogs/blob/8289fc3272625de35974ed355db44fa1c58a4e09/llm/llama_index_neo4j_custom_retriever.ipynb#L93C1-L93C46

if neo4j+s://myhash.databases.neo4j.io is passed as the url, this error occurs: ServiceUnavailable: Unable to retrieve routing information

Value of Feature

by supporting the url type that is immediately available in the aura neo4j console, this feature would make it easier for users who are interested in llamaindex property graphs to get started.


rawwerks commented 2 weeks ago

here's the full trace:

----> 7 graph_store = Neo4jPropertyGraphStore(
      8     username=username,
      9     password=password,
     10     url=url,
     11 )

File ~/Documents/GitHub/llama_index/.venv/lib/python3.11/site-packages/llama_index/graph_stores/neo4j/neo4j_property_graph.py:151, in Neo4jPropertyGraphStore.__init__(self, username, password, url, database, refresh_schema, sanitize_query_output, enhanced_schema, **neo4j_kwargs)
    149 self.structured_schema = {}
    150 if refresh_schema:
--> 151     self.refresh_schema()

File ~/Documents/GitHub/llama_index/.venv/lib/python3.11/site-packages/llama_index/graph_stores/neo4j/neo4j_property_graph.py:159, in Neo4jPropertyGraphStore.refresh_schema(self)
    157 def refresh_schema(self) -> None:
    158     """Refresh the schema."""
--> 159     node_query_results = self.structured_query(
    160         node_properties_query,
    161         param_map={"EXCLUDED_LABELS": [*EXCLUDED_LABELS, BASE_ENTITY_LABEL]},
    162     )
    163     node_properties = (
    164         [el["output"] for el in node_query_results] if node_query_results else []
...
    800 # None of the routers have been successful, so just fail
    801 log.error("Unable to retrieve routing information")
--> 802 raise ServiceUnavailable("Unable to retrieve routing information")

ServiceUnavailable: Unable to retrieve routing information

tomasonjo commented 2 weeks ago

Take a look at this post: https://community.neo4j.com/t/unable-to-retrieve-routing-information/44203/10

rawwerks commented 2 weeks ago

thanks @tomasonjo. on mac it appears you need to use the neo4j+ssc:// scheme instead: url="neo4j+ssc://hash.databases.neo4j.io" (see https://community.neo4j.com/t/unable-to-retrieve-routing-information/44203/14?u=raw)

this overcame the error. (let's see if i can use it now...)
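for anyone else hitting this, here is a minimal sketch of the workaround. the helper names `to_self_signed_scheme` and `connect` are made up for illustration, and the import path assumes the llama-index-graph-stores-neo4j package is installed. note that, per the neo4j driver's URI scheme conventions, `+ssc` still encrypts the connection but does not verify the certificate chain, so treat it as a workaround and not a default:

```python
def to_self_signed_scheme(url: str) -> str:
    """Rewrite a '+s' Neo4j URL scheme to its '+ssc' variant.

    Hypothetical helper: 'neo4j+s://host' becomes 'neo4j+ssc://host'.
    URLs with any other scheme are returned unchanged.
    """
    scheme, sep, rest = url.partition("://")
    if sep and scheme in ("neo4j+s", "bolt+s"):
        # Appending "sc" turns "+s" into "+ssc".
        return f"{scheme}sc{sep}{rest}"
    return url


def connect(username: str, password: str, url: str):
    """Build a Neo4jPropertyGraphStore using the '+ssc' form of the URL.

    The import is done lazily so the helper above works without llama_index.
    """
    from llama_index.graph_stores.neo4j import Neo4jPropertyGraphStore

    return Neo4jPropertyGraphStore(
        username=username,
        password=password,
        url=to_self_signed_scheme(url),
    )
```

calling `connect(username, password, "neo4j+s://myhash.databases.neo4j.io")` would then pass the `neo4j+ssc://` form of the console url to the graph store.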

for the llamaindex team - you're going to get a lot of people hitting this wall, so i would recommend adding this implementation note to the docs.

tomasonjo commented 1 week ago

I asked for some docs on this to be added on the Aura side; LlamaIndex is probably not the right place for it.