kingjulio8238 / Memary

The Open Source Memory Layer For Autonomous Agents
https://finetune.dev/
MIT License
1.48k stars, 101 forks

memaryParse #44

Open kingjulio8238 opened 5 months ago

kingjulio8238 commented 5 months ago

memary currently parses the agents' responses, which are stored in a .txt file, before inserting them into our knowledge graphs.

As we look to support agentic systems running real-world tasks, our memory unit needs to let the system's maintainer pre-populate the knowledge graph with relevant data. For example, an e-commerce company may want to upload its users' information so that the agent can respond with context from its first interaction.

Companies may present this data in various file formats, such as .csv, .pdf, .txt, .pptx, or others. That is why memary must support multiple configurable parsers under a single parent parser, memaryParse. For example, a company running an agent with data in .csv and .docx files can configure a parent parser that supports both formats and use it to pre-process the data into the knowledge graph before running their agents with memary.
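To make the parent-parser idea concrete, here is a minimal sketch of how format-specific parsers could be registered under one dispatcher. Everything in it is an assumption for illustration: the `MemaryParse` class, its `register` and `parse` methods, and the `parse_txt` / `parse_csv` helpers are hypothetical names, not part of memary's current API.

```python
import csv
from pathlib import Path
from typing import Callable, Dict, Union

# Hypothetical contract: a parser takes a file path and returns extracted text.
Parser = Callable[[Path], str]


def parse_txt(path: Path) -> str:
    """Return the file contents unchanged."""
    return path.read_text(encoding="utf-8")


def parse_csv(path: Path) -> str:
    """Flatten each CSV row into a 'column: value' line."""
    with path.open(newline="", encoding="utf-8") as f:
        rows = list(csv.DictReader(f))
    return "\n".join(
        ", ".join(f"{key}: {value}" for key, value in row.items()) for row in rows
    )


class MemaryParse:
    """Hypothetical parent parser that dispatches on file extension."""

    def __init__(self) -> None:
        self._parsers: Dict[str, Parser] = {}

    def register(self, extension: str, parser: Parser) -> None:
        # Extensions are stored lowercased, including the leading dot.
        self._parsers[extension.lower()] = parser

    def parse(self, path: Union[str, Path]) -> str:
        path = Path(path)
        extension = path.suffix.lower()
        if extension not in self._parsers:
            raise ValueError(f"No parser configured for {extension!r}")
        return self._parsers[extension](path)
```

A maintainer would register only the formats they need, for example `parser.register(".csv", parse_csv)`, and call `parser.parse("users.csv")` before handing the text to the knowledge-graph ingestion step. A .pdf or .docx parser would plug in the same way, wrapping whichever library the maintainer prefers.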

We expect memaryParse to expand over time. Initially, we hope to support the following formats:

memaryParse should also support the following result types: TXT, MD, and JSON (we will look to add others in the future).
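One way the three result types could be exposed is a single rendering step applied after parsing. The sketch below assumes parsed data arrives as a list of flat dictionaries; the `render` function and its `result_type` parameter are hypothetical names chosen for illustration.

```python
import json
from typing import Dict, List


def render(records: List[Dict[str, str]], result_type: str = "txt") -> str:
    """Render parsed records as TXT, MD, or JSON (hypothetical helper)."""
    kind = result_type.lower()
    if kind == "json":
        return json.dumps(records, indent=2)
    if kind == "md":
        if not records:
            return ""
        # Column order is taken from the first record.
        headers = list(records[0].keys())
        lines = [
            "| " + " | ".join(headers) + " |",
            "| " + " | ".join("---" for _ in headers) + " |",
        ]
        for record in records:
            cells = [str(record.get(header, "")) for header in headers]
            lines.append("| " + " | ".join(cells) + " |")
        return "\n".join(lines)
    if kind == "txt":
        return "\n".join(
            ", ".join(f"{key}: {value}" for key, value in record.items())
            for record in records
        )
    raise ValueError(f"Unsupported result type: {result_type!r}")
```

Keeping the output format separate from the input parser means each new file format automatically gains all three result types, and a new result type only touches this one function.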

Resource for inspiration: https://github.com/run-llama/llama_parse/blob/main/llama_parse/utils.py

rawwerks commented 5 months ago

for PDFs, please just let me put my llamaindex API key to use llamaparse. they have worked so hard on this, i would strongly advise not to re-invent it.

for other datatypes (and for getting the text from llamaparse into a KG), here are some resources. neo4j has done a lot of work on this.