brainqub3 / jar3d_meta_expert

Versatile agents for long running, research intensive tasks.
MIT License
371 stars 115 forks

How to see Neo4j in action? + getting some pdf errors #28

Closed garkpit closed 1 month ago

garkpit commented 3 months ago

Always exciting when you have a new release! I signed up for Neo4j and am hoping to see it in action. Any reason you didn't use GraphRAG?

I'll try a different prompt later today, but here's my first attempt:

Give me a plan for a 13 hour layover in Bangkok on a Saturday night. Don't worry about luggage (it's routed onward). Find good authentic food for dinner. Make sure we see the best attractions. End the night, last 3 hours, somewhere relaxing like a spa or rooftop bar. Arrival time is 1715. Departing flight is at 0615. Give me details including addresses, web links, opening hours, reviews, and so on. Prefer authentic cultural things rather than modern things.

and then /end

Logs show:

   ...
   DEBUG HYBRID VALUE: None
   Initiating Retrieval...
   Running Dense Only Retrieval...
   ...

So I assume Neo4j didn't get involved (nothing on the web workspace).

Here are the pdf errors (collection of unique ones):

1/ 2 of these - both seem to be associated with a web request getting 403:

   DEBUG 2024-08-28 09:51:57 - https://www.welcomepickups.com:443 "GET /bangkok/taxi/ HTTP/11" 403 0
   Error in LLM Sherpa LayoutPDFReader: cannot access local variable 'pdf_file' where it is not associated with a value
   Traceback (most recent call last):
     File "/Users/Jay/Dev/AI/brainqub3/meta_expert/tools/offline_graph_rag_tool.py", line 255, in intelligent_chunking
       doc = reader.read_pdf(url)
             ^^^^^^^^^^^^^^^^^^^^
     File "/Users/Jay/Dev/AI/brainqub3/meta_expert/.conda/lib/python3.11/site-packages/llmsherpa/readers/file_reader.py", line 65, in read_pdf
       pdf_file = self._download_pdf(path_or_url)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
     File "/Users/Jay/Dev/AI/brainqub3/meta_expert/.conda/lib/python3.11/site-packages/llmsherpa/readers/file_reader.py", line 41, in _download_pdf
       return pdf_file
              ^^^^^^^^
   UnboundLocalError: cannot access local variable 'pdf_file' where it is not associated with a value
   No document to append to corpus

2/ once:

   DEBUG 2024-08-28 09:51:58 - https://www.flyertalk.com:443 "GET /forum/thailand/1879502-practical-tips-12-hour-layover-bangkok-bkk.html HTTP/11" 200 None
   2024-08-28 09:51:58 - Starting new HTTP connection (1): localhost:5010
   DEBUG 2024-08-28 09:51:58 - Starting new HTTP connection (1): localhost:5010
   DEBUG 2024-08-28 09:51:58 - Starting new HTTP connection (1): localhost:5010
   2024-08-28 09:51:58 - http://localhost:5010 "POST /api/parseDocument?renderFormat=all&useNewIndentParser=yes HTTP/11" 500 0
   DEBUG 2024-08-28 09:51:58 - http://localhost:5010 "POST /api/parseDocument?renderFormat=all&useNewIndentParser=yes HTTP/11" 500 0
   DEBUG 2024-08-28 09:51:58 - http://localhost:5010 "POST /api/parseDocument?renderFormat=all&useNewIndentParser=yes HTTP/11" 500 0
   Error in LLM Sherpa LayoutPDFReader: 'return_dict'
   Traceback (most recent call last):
     File "/Users/Jay/Dev/AI/brainqub3/meta_expert/tools/offline_graph_rag_tool.py", line 255, in intelligent_chunking
       doc = reader.read_pdf(url)
             ^^^^^^^^^^^^^^^^^^^^
     File "/Users/Jay/Dev/AI/brainqub3/meta_expert/.conda/lib/python3.11/site-packages/llmsherpa/readers/file_reader.py", line 73, in read_pdf
       blocks = response_json['return_dict']['result']['blocks']
   KeyError: 'return_dict'
   No document to append to corpus
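
A minimal sketch of what the two tracebacks above appear to show, with no network access involved. The functions `download_pdf`, `extract_blocks`, and `describe_failure` are hypothetical stand-ins written for illustration, not the llmsherpa source. The assumption is that the first error comes from a variable that is only assigned on an HTTP 200, and the second from indexing a response that lacks the `return_dict` key:

```python
def download_pdf(status, data):
    # Hypothetical stand-in for a downloader: pdf_file is only bound on HTTP 200.
    if status == 200:
        pdf_file = ("doc.pdf", data, "application/pdf")
    # For any other status (e.g. 403) the name was never assigned,
    # so this line raises UnboundLocalError.
    return pdf_file


def extract_blocks(response_json):
    # Hypothetical stand-in for a parser client that assumes a successful
    # payload; an error response without 'return_dict' raises KeyError.
    return response_json["return_dict"]["result"]["blocks"]


def describe_failure(func, *args):
    """Run func(*args) and return the exception class name, or 'ok'."""
    try:
        func(*args)
    except Exception as exc:
        return type(exc).__name__
    return "ok"


if __name__ == "__main__":
    print(describe_failure(download_pdf, 403, b""))       # non-200 download
    print(describe_failure(extract_blocks, {}))            # payload missing the key
    print(describe_failure(download_pdf, 200, b"%PDF-"))   # successful download
```

If this reading is right, both messages are symptoms of an upstream request failing (a 403 on the download, a 500 from the parser on port 5010) rather than separate bugs in the chunking code.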

Many thanks for this! It's super interesting :)

john-adeojo commented 2 months ago

Thanks for the support! Regarding your bug, are you sure you have the LLM Sherpa server running?

You first have to pull the server image with Docker.

   docker pull jamesmtc/nlm-ingestor:latest

Then you have to run it:

   docker run -p 5010:5001 jamesmtc/nlm-ingestor:latest
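
A quick way to confirm that something is listening on the host port mapped above (5010) is a plain TCP connect. This is only a sketch: `is_port_open` is a hypothetical helper, and an open port shows that the container is reachable, not that it can parse a given document.

```python
import socket


def is_port_open(host="localhost", port=5010, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError (refused, timed out, unreachable)
        # when nothing accepts the connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    if is_port_open():
        print("Something is listening on localhost:5010")
    else:
        print("Nothing is listening on localhost:5010; is the container running?")
```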

john-adeojo commented 2 months ago

The most recent commit uses Docker to get up and running with Jar3d. This approach should cause fewer errors. Please give it a try and let me know how it goes.

john-adeojo commented 1 month ago

Closing this because I have deprecated the Neo4j feature.