Open kingjulio8238 opened 5 months ago
for PDFs, please just let me put my llamaindex API key to use llamaparse. they have worked so hard on this, i would strongly advise not to re-invent it.
for other datatypes (and for getting the text from llamaparse into a KG), here are some resources. neo4j has done a lot of work on this.
memary currently parses the agents' responses, which are stored in a .txt file, before inserting them into our knowledge graphs.
As we look to support agentic systems running real-world tasks, our memory unit needs to let the system's maintainer pre-populate the knowledge graph with relevant data. For example, an e-commerce company may want to upload its users' information so that the agent can respond with context from the first interaction.
Companies may store this data in various file formats, such as .csv, .pdf, .txt, .pptx, or others. That is why memary must support many configurable parsers under a parent parser, memaryParse. For example, a company running an agent with data in .csv and .docx files can configure a parent parser that supports both formats to pre-process the data into the knowledge graph before running their agents using memary.
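To make the parent-parser idea concrete, here is a minimal sketch of how per-format parsers could be registered and dispatched by file extension. Everything in it is an assumption for discussion: the `MemaryParse` class, its `register` and `parse` methods, and the `parse_txt` / `parse_csv` helpers are hypothetical names, not existing memary APIs, and only standard-library formats are shown.

```python
import csv
import io
from pathlib import Path
from typing import Callable, Dict, List

# Hypothetical type: a parser takes raw file bytes and returns text records.
Parser = Callable[[bytes], List[str]]


def parse_txt(data: bytes) -> List[str]:
    """Return the non-empty, stripped lines of a plain-text file."""
    return [line.strip() for line in data.decode("utf-8").splitlines() if line.strip()]


def parse_csv(data: bytes) -> List[str]:
    """Return one 'column=value; ...' record per CSV row."""
    reader = csv.DictReader(io.StringIO(data.decode("utf-8")))
    return ["; ".join(f"{k}={v}" for k, v in row.items()) for row in reader]


class MemaryParse:
    """Hypothetical parent parser that dispatches on file extension."""

    def __init__(self) -> None:
        self._parsers: Dict[str, Parser] = {}

    def register(self, extension: str, parser: Parser) -> None:
        # Extensions are stored lowercase so ".CSV" and ".csv" match.
        self._parsers[extension.lower()] = parser

    def parse(self, filename: str, data: bytes) -> List[str]:
        ext = Path(filename).suffix.lower()
        if ext not in self._parsers:
            raise ValueError(f"no parser configured for {ext!r}")
        return self._parsers[ext](data)
```

With this shape, a maintainer would enable only the formats they need, e.g. `parser.register(".csv", parse_csv)`, and a PDF parser backed by an external service could be plugged in the same way without changing the parent class.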
We expect memaryParse to expand over time. Initially, we hope to support the following formats:
memaryParse should also support the following result types: TXT, MD, and JSON (we will look to add others in the future).
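As a rough sketch of what selectable result types could look like, the function below renders a list of parsed text records as TXT, MD, or JSON. The `render` function, the `RESULT_TYPES` tuple, and the assumption that parsed output is a flat list of strings are all hypothetical and only meant to anchor the discussion.

```python
import json
from typing import List

# Hypothetical set of supported result types, matching the list above.
RESULT_TYPES = ("txt", "md", "json")


def render(records: List[str], result_type: str = "txt") -> str:
    """Serialize parsed records in the requested result type."""
    kind = result_type.lower()
    if kind == "txt":
        # One record per line.
        return "\n".join(records)
    if kind == "md":
        # One Markdown bullet per record.
        return "\n".join(f"- {record}" for record in records)
    if kind == "json":
        # A JSON array of strings.
        return json.dumps(records)
    raise ValueError(f"unsupported result type {result_type!r}; expected one of {RESULT_TYPES}")
```

Keeping serialization separate from parsing would let new result types be added later without touching the individual format parsers.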
Resource for inspiration: https://github.com/run-llama/llama_parse/blob/main/llama_parse/utils.py