ollama / ollama

Get up and running with Llama 3.1, Mistral, Gemma 2, and other large language models.
https://ollama.com
MIT License

An improved RAG system: CRAG #6081

Open PGTBoos opened 1 month ago

PGTBoos commented 1 month ago

This is quite a bit of coding, but I was playing with AnythingLLM and Ollama and noted how both are evolving towards each other, and also how Meta / Ollama is becoming an open-source king, setting new standards, so ideas in some areas could fuse.

As Large Language Models (LLMs) evolve to process and analyze documents more effectively, there's an increasing need for robust Retrieval-Augmented Generation (RAG) systems to support their functionality. While exploring existing RAG implementations, I've identified an opportunity to create a specialized system optimized for code analysis and comprehension. Let's call this new RAG system CRAG (Code Retrieval-Augmented Generation). The fundamental principle behind CRAG is to leverage the inherent structure of well-written code, which typically consists of modular, concise functions and methods rather than monolithic blocks of thousands of lines (though such instances do exist in practice).

CRAG System Architecture:

File-Level Information: CRAG stores each code file as a database entry with the following attributes:

- Filename
- Code type (e.g., class, utility, base class, interface)
- Dependent libraries (imports, using statements, etc.)
- Subtype (e.g., RxJS, event handling, services, web API) - language and import-dependent
- Short description (generated by the LLM or extracted from file header comments)

Function/Method-Level Information: For each file, CRAG maintains a list of functions and methods, including:

- Function/method name
- Return type (primitive types like string, bool, or custom types)
- Order number within the file
- Starting row index
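As a rough sketch, the file- and function-level entries above could be modeled as simple records. The class and field names below are illustrative only, not an existing API:

```python
from dataclasses import dataclass, field

@dataclass
class FunctionEntry:
    name: str          # function/method name
    return_type: str   # primitive ("string", "bool") or custom type
    order: int         # order number within the file
    start_row: int     # starting row index in the source file

@dataclass
class CodeFileEntry:
    filename: str
    code_type: str                 # e.g. "class", "utility", "interface"
    dependencies: list[str] = field(default_factory=list)  # imports / using statements
    subtype: str = ""              # e.g. "RxJS", "services", "web API"
    description: str = ""          # LLM-generated or taken from header comments
    functions: list[FunctionEntry] = field(default_factory=list)
```

Each file in the repository would map to one `CodeFileEntry` row, with its functions indexed for retrieval.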

Large Function Handling: If a function or method exceeds a certain size threshold, CRAG breaks it into multiple entries for easier processing and retrieval.

Code Storage: The system stores the raw code for each function/method. While a traditional database could be used, CRAG can optimize storage by maintaining only the necessary indices and file navigation information, allowing the LLM to read the code directly when needed.

Code Navigation and Dependency Tracking: CRAG is designed to follow code execution paths and dependencies. When a user poses a question, the system reads the relevant code and its dependencies, storing connected/related code parts in a similar database-like structure.

Efficient Storage of Related Code: To minimize storage requirements, CRAG primarily stores indices of related code rather than duplicating the full content. This approach maintains a compact structure while enabling the LLM to traverse and analyze code relationships.

Depth-Limited Code Exploration: The LLM can explore code relationships to a configurable depth, with the potential to expand this depth as needed for more complex analyses.

Language-Specific Optimizations: CRAG can be tailored to support specific programming languages, frameworks, and paradigms, allowing for more accurate parsing and understanding of code structures (e.g., C++, C#, PLC, ASM, BASIC, Java, Python, etc.).

Semantic Understanding: In addition to syntactic analysis, CRAG can incorporate semantic understanding of code, allowing the LLM to grasp the intent and functionality of code segments more accurately.

Version Control Integration: CRAG can be designed to work with version control systems, enabling analysis of code changes over time and facilitating code review processes.

Performance Optimization: The system can include caching mechanisms and indexing strategies to improve retrieval speed for frequently accessed code segments.

Security Considerations: CRAG should implement robust security measures to protect sensitive code and prevent unauthorized access or code injection vulnerabilities. Based on a preprompt and knowledge of the coding language, these measures might go hand in hand.
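The depth-limited exploration and index-only storage described above could be sketched as a breadth-first walk over a call graph keyed purely by function indices (all names here are hypothetical; the graph would be built by whatever parser CRAG uses for the target language):

```python
def explore(graph: dict[int, list[int]], start: int, max_depth: int) -> set[int]:
    """Collect indices of functions reachable from `start` within `max_depth` hops.

    `graph` maps a function's index to the indices of the functions it calls;
    only indices are stored, so code text is never duplicated.
    """
    seen = {start}
    frontier = [start]
    for _ in range(max_depth):
        next_frontier = []
        for idx in frontier:
            for dep in graph.get(idx, []):
                if dep not in seen:
                    seen.add(dep)
                    next_frontier.append(dep)
        frontier = next_frontier
    return seen
```

Raising `max_depth` widens the analysis on demand; the LLM only reads the code bodies for the indices the walk returns.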

The system can be designed to learn from user interactions and feedback, improving its code understanding and retrieval capabilities over time. It might weight the relations between functions and keep a favourites list reflecting what the user is most interested in.
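One minimal way to sketch that weighting (names illustrative, not a proposed API) is a visit counter over function indices, with the favourites list derived from the most frequently consulted entries:

```python
from collections import Counter

class RelationWeights:
    """Track how often each indexed function is consulted by the user."""

    def __init__(self) -> None:
        self.visits: Counter[int] = Counter()  # function index -> visit count

    def record_visit(self, idx: int) -> None:
        self.visits[idx] += 1

    def favourites(self, n: int = 5) -> list[int]:
        # Most frequently consulted function indices first
        return [idx for idx, _ in self.visits.most_common(n)]
```

These counts could then bias retrieval ranking toward the parts of the codebase the user actually works in.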

I know it's not directly Ollama, but I think it might become part of the Ollama framework.

FellowTraveler commented 1 month ago

This is probably not an issue for Ollama, but you should check out R2R and consider submitting a PR there: https://github.com/SciPhi-AI/R2R

PGTBoos commented 1 month ago

Or take a look at what Agent Zero does. It's just been released; search for it on GitHub or YouTube.


FellowTraveler commented 1 month ago

@PGTBoos Ollama is not a RAG system, but there are many RAG projects (such as R2R) out there which can be used with Ollama doing the inferencing. Can you please close this issue, as this feature is not within the scope of this project?

PGTBoos commented 1 month ago

Would it be better to post this at AnythingLLM? I notice your worlds, or ecosystems, are getting closer and closer, probably under the same umbrella of investors. I think both could benefit from these ideas.

FellowTraveler commented 1 month ago

I would post it at R2R. Frankly I would use this change myself if R2R supported it.

https://github.com/SciPhi-AI/R2R