Closed garkpit closed 1 month ago
Thanks for the support! Regarding your bug, are you sure you have the LLM Sherpa server running?
You first have to pull the server image with Docker.
docker pull jamesmtc/nlm-ingestor:latest
Then you have to run it:
docker run -p 5010:5001 jamesmtc/nlm-ingestor:latest
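A quick way to confirm that something is listening on the mapped port before running Jar3d is a plain TCP check. This is only a minimal sketch: the function name `is_port_open` is hypothetical, and the default port 5010 is taken from the `docker run` command above. An open port shows that a process is listening there, not that it is necessarily the ingestor.

```python
import socket


def is_port_open(host: str = "localhost", port: int = 5010, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection raises OSError (refused, unreachable, timed out) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling `is_port_open()` with the defaults should return False if the container is not running or the port mapping differs from the one shown.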
The most recent commit uses Docker to get up and running with Jar3d. This approach should cause fewer errors. Please give it a try and let me know how it goes.
Closing this because I have deprecated the Neo4j feature.
Always exciting when you have a new release! Signed up for Neo4j and hoping to see it in action. Any reason you didn't use GraphRag?
I'll try a different prompt later today, but here's my first attempt:
Here's my prompt:
Give me a plan for a 13 hour layover in Bangkok on a Saturday night. Don't worry about luggage (it's routed onward). Find good authentic food for dinner. Make sure we see the best attractions. End the night, last 3 hours, somewhere relaxing like a spa or rooftop bar. Arrival time is 1715. Departing flight is at 0615. Give me details including addresses, web links, opening hours, reviews, and so on. Prefer authentic cultural things rather than modern things.
and then /end
Logs show: ... DEBUG HYBRID VALUE: None
Initiating Retrieval...
Running Dense Only Retrieval... ...
so I assume Neo4j didn't get involved (nothing on the web workspace)
Here are the PDF errors (a collection of the unique ones):
1/ Two of these; both seem to be associated with a web request getting a 403:
DEBUG 2024-08-28 09:51:57 - https://www.welcomepickups.com:443 "GET /bangkok/taxi/ HTTP/11" 403 0
Error in LLM Sherpa LayoutPDFReader: cannot access local variable 'pdf_file' where it is not associated with a value
Traceback (most recent call last):
  File "/Users/Jay/Dev/AI/brainqub3/meta_expert/tools/offline_graph_rag_tool.py", line 255, in intelligent_chunking
    doc = reader.read_pdf(url)
          ^^^^^^^^^^^^^^^^^^^^
  File "/Users/Jay/Dev/AI/brainqub3/meta_expert/.conda/lib/python3.11/site-packages/llmsherpa/readers/file_reader.py", line 65, in read_pdf
    pdf_file = self._download_pdf(path_or_url)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/Jay/Dev/AI/brainqub3/meta_expert/.conda/lib/python3.11/site-packages/llmsherpa/readers/file_reader.py", line 41, in _download_pdf
    return pdf_file
           ^^^^^^^^
UnboundLocalError: cannot access local variable 'pdf_file' where it is not associated with a value
No document to append to corpus
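If the 403 really is what triggers this error, one possible workaround is to probe the URL first and skip anything that does not come back with a 200, so the failure is reported as an HTTP problem rather than an UnboundLocalError. The sketch below is an assumption, not the project's code: `probe_status` and `read_if_reachable` are hypothetical helpers, and `reader` is assumed to be any object with the `read_pdf(url)` method seen in the traceback.

```python
import urllib.error
import urllib.request
from typing import Optional


def probe_status(url: str, timeout: float = 10.0) -> Optional[int]:
    """Return the HTTP status code for url, or None if no response was received."""
    request = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses are raised as HTTPError; the status code is on the exception.
        return err.code
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout, and similar.
        return None


def read_if_reachable(reader, url: str):
    """Call reader.read_pdf(url) only when the URL answers with HTTP 200."""
    status = probe_status(url)
    if status != 200:
        print(f"Skipping {url}: HTTP status {status}")
        return None
    return reader.read_pdf(url)
```

This costs one extra request per URL, and a site may answer the probe differently from the reader's own request, so it is a diagnostic aid rather than a guarantee.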
2/ Once:
DEBUG 2024-08-28 09:51:58 - https://www.flyertalk.com:443 "GET /forum/thailand/1879502-practical-tips-12-hour-layover-bangkok-bkk.html HTTP/11" 200 None
2024-08-28 09:51:58 - Starting new HTTP connection (1): localhost:5010
DEBUG 2024-08-28 09:51:58 - Starting new HTTP connection (1): localhost:5010
DEBUG 2024-08-28 09:51:58 - Starting new HTTP connection (1): localhost:5010
2024-08-28 09:51:58 - http://localhost:5010 "POST /api/parseDocument?renderFormat=all&useNewIndentParser=yes HTTP/11" 500 0
DEBUG 2024-08-28 09:51:58 - http://localhost:5010 "POST /api/parseDocument?renderFormat=all&useNewIndentParser=yes HTTP/11" 500 0
DEBUG 2024-08-28 09:51:58 - http://localhost:5010 "POST /api/parseDocument?renderFormat=all&useNewIndentParser=yes HTTP/11" 500 0
Error in LLM Sherpa LayoutPDFReader: 'return_dict'
Traceback (most recent call last):
  File "/Users/Jay/Dev/AI/brainqub3/meta_expert/tools/offline_graph_rag_tool.py", line 255, in intelligent_chunking
    doc = reader.read_pdf(url)
          ^^^^^^^^^^^^^^^^^^^^
  File "/Users/Jay/Dev/AI/brainqub3/meta_expert/.conda/lib/python3.11/site-packages/llmsherpa/readers/file_reader.py", line 73, in read_pdf
    blocks = response_json['return_dict']['result']['blocks']