neo4j-labs / neodash

NeoDash - a Dashboard Builder for Neo4j
https://neo4j.com/labs/neodash/
Apache License 2.0

Caching or sampling of schema queries in QueryTranslator #578

Open l47y opened 1 year ago

l47y commented 1 year ago

Firstly, thanks a lot for the NeoDash tool in general, and especially for the new query translator feature. We have been testing it against several databases and have had a very good experience with it.

Request Summary

Using the query translator feature on a very large graph database does not seem to work very well. I am currently working with a database of 100M nodes and 200M relationships. As I understand from the code, the prompt that is sent to the LLM provider is built from several calls to apoc.meta.data() (here). In our case, each of these calls takes almost a minute in the Neo4j Browser. When using the query translator in NeoDash on this database, the following error pops up:

Error when translating the natural language query: Couldn't generate schema due to: The transaction has been terminated. Retry your operation in a new transaction, and you should see a successful result. The transaction has not completed within the timeout specified at its start by the client. You may want to retry with a longer timeout.

Describe the solution you'd like

I was thinking of:

  1. A setting to use the sample keyword for the apoc.meta.data() calls. In my tests, this can drastically reduce the query time of apoc.meta.data() (see the sketch after this list).
  2. Some sort of caching of the apoc.meta.data() query results in the NeoDash frontend.
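
For illustration, here is a minimal sketch of what such a sampled schema query could look like, assuming apoc.meta.data() accepts a config map with sample and maxRels options (the exact option names, semantics, and sensible values depend on the APOC version installed):

```cypher
// Sketch only: sampled schema introspection instead of a full scan.
// `sample` asks APOC to inspect only a fraction of the nodes per label
// (roughly every Nth node), and `maxRels` caps how many relationships are
// inspected per relationship type. Both values are placeholders.
CALL apoc.meta.data({sample: 1000, maxRels: 100})
YIELD label, property, type, elementType
RETURN label, property, type, elementType;
```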

Describe alternatives you've considered

I tried to cache the apoc.meta.data() query result in the Neo4j Browser, but without success so far. The only other alternative I see at the moment is to construct the prompt with sampling myself (roughly along the lines of the sketch below).
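
As a rough illustration of that manual alternative, a compact, prompt-friendly schema summary could be assembled like this (again assuming the sample config option of apoc.meta.data(); the output shape is purely hypothetical, not what NeoDash itself builds):

```cypher
// Sketch only: group sampled property metadata by element type and label
// so it can be pasted into the LLM prompt as a compact schema summary.
CALL apoc.meta.data({sample: 1000})
YIELD label, property, type, elementType
WITH elementType, label, collect(property + ': ' + type) AS properties
RETURN elementType, label, properties
ORDER BY elementType, label;
```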

Thanks a lot for your help, and please let me know if any further information is needed.

Nicolas

alfredorubin96 commented 12 months ago

Hi @l47y, we added schema sampling in the new version (2.4.0). As for caching the data model, we are still working on it. We hope you will like the new feature; please give us feedback on the speed of the schema sampling.