Beet-Farms opened this issue 2 months ago
Can you check whether nodes appear when using the original logic? That would rule out whether the error is in the NodeParser or in reading the JSON file.
Thanks for your response. Yes, I can see nodes in Qdrant when using the default `SentenceWindowNodeParser`. I also checked `MarkdownNodeParser`, which works fine. It looks like there is a problem only when attempting to use `JSONNodeParser`.
To enable `MarkdownNodeParser`, I followed steps similar to those I attempted for `JSONNodeParser`. In `ingest_service.py`, I replaced `from llama_index.core.node_parser import SentenceWindowNodeParser` with `from llama_index.core.node_parser import MarkdownNodeParser`, and `node_parser = SentenceWindowNodeParser.from_defaults()` with `node_parser = MarkdownNodeParser.from_defaults()`.
Would love to know if someone has succeeded in using `JSONNodeParser`.
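To isolate the parser from PrivateGPT's file reading (as the earlier comment suggests), a quick standard-library check can confirm whether the text handed to the parser is valid JSON at all. This is a minimal sketch; `is_valid_json` is a hypothetical helper for illustration, not part of any library:

```python
import json

def is_valid_json(text: str) -> bool:
    """Return True if text parses as JSON; malformed input returns False instead of raising."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False

print(is_valid_json('{"name": "Alice", "phone": "555-0100"}'))  # True: a proper JSON record
print(is_valid_json("name: Alice, phone: 555-0100"))            # False: plain text, not JSON
```

If the text that actually reaches the parser fails this check, the problem is upstream of `JSONNodeParser`.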
Question
I’m currently using PrivateGPT v0.6.1 with Llama-CPP support on a Windows machine with a Qdrant DB. The LLM used is `Mistral-7B-Instruct-v0.3` and the embedding model is `BAAI/bge-m3`.

I have a situation where I need to ingest a large JSON file (say, a telephone directory) where each record should remain intact as a single node. When using the `SentenceWindowNodeParser`, the records often split at improper places, leading to jumbled responses when querying the LLM, especially when it comes to matching users to their telephone numbers.
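The desired chunking can be sketched with the standard library alone, independent of PrivateGPT, assuming the directory is a top-level JSON array of record objects (the sample data below is hypothetical):

```python
import json

# Hypothetical directory data; a real file would be read from disk instead.
raw = json.dumps([
    {"name": "Alice", "phone": "555-0100"},
    {"name": "Bob", "phone": "555-0199"},
])

# Keep each record whole: one serialized record per chunk, never split mid-record.
records = json.loads(raw)
chunks = [json.dumps(record, sort_keys=True) for record in records]
print(len(chunks))  # prints 2: one chunk per directory record
```

Whatever parser is used, the end result should look like this: a name and its phone number always land in the same chunk, so the retriever never sees half a record.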
I made the following changes to `ingest_service.py`: I replaced `from llama_index.core.node_parser import SentenceWindowNodeParser` with `from llama_index.core.node_parser import JSONNodeParser`, and `node_parser = SentenceWindowNodeParser.from_defaults()` with `node_parser = JSONNodeParser.from_defaults()`.
After making these changes, I tried ingesting the JSON file again. It didn’t throw any errors, but the console showed that the file was converted into 1 document, with a message saying: `private_gpt.components.ingest.ingest_component - Inserting count=0 nodes in the index`. As expected, I don't see any nodes in Qdrant.

What am I missing? Your advice would be greatly appreciated!
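One possible explanation (an assumption, not confirmed against the PrivateGPT code path) is that the document text reaching `JSONNodeParser` is no longer valid JSON, for example because the file reader already flattened it to plain text. The sketch below approximates that failure mode with the standard library only; `chunks_from_texts` is a hypothetical stand-in, not the real parser:

```python
import json

def chunks_from_texts(texts):
    """Approximation (assumption) of a JSON-aware parser that skips unparseable input silently."""
    chunks = []
    for text in texts:
        try:
            data = json.loads(text)
        except json.JSONDecodeError:
            continue  # no exception surfaces; the document simply yields nothing
        records = data if isinstance(data, list) else [data]
        chunks.extend(json.dumps(r) for r in records)
    return chunks

print(len(chunks_from_texts(["Alice 555-0100"])))        # 0: plain text is skipped without error
print(len(chunks_from_texts(['[{"a": 1}, {"b": 2}]'])))  # 2: a JSON array yields one chunk per record
```

If that is what is happening, it would match the symptoms exactly: 1 document, no errors, and `count=0` nodes inserted. Printing the document text just before the parser runs would confirm or rule this out.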