The goal of this project is to be the most deterministic and precise code context provider for any code repository, and across multiple repositories tied together through a shared domain. Eventually it aims to become a unified code context provider that can be integrated with projects such as OpenDevin, Devon, Danswer, Continue Dev, and other OSS, thereby complementing the precision of these frameworks with minimal operational cost.
| Task | Status |
|---|---|
| Launch autodoc for Java | Ready: Stable |
| Launch autodoc for Kotlin | Planned |
| Launch autodoc for Python | Ready: Stable |
| Launch autodoc for all programming languages | Planned |
| Launch graph-based ingestion and query through DSPy | Planned |
| Launch Tree-sitter parsers for the respective programming languages to propagate codebase changes into the knowledge graph based on commits and git diffs | Planned |
| Solve cross-cutting concerns and make self-hosting possible | Planned |
| Plug it into the OSS partners/heroes mentioned below | Planned |
Note: Full codebase parsing is based on https://chapi.phodal.com/, and contributions to chapi upstream are required to improve parsing metadata. For detailed issues, please refer to the roadmap mentioned below.
Here's a visual representation of the process using a Mermaid diagram:
```mermaid
graph LR
    A[Start] --> B[Index Code Files]
    B --> C[Process Query]
    C --> D[Retrieve Relevant Data]
    D --> E[Rerank Results]
    E --> F[Present Results]
    F --> G[End]
```
This diagram helps visualize the workflow from the start of the query to the presentation of results, illustrating the steps where inefficiencies and complexities arise.
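The workflow above can be sketched in a few lines of Python. Everything here is an illustrative placeholder (a toy inverted index and a match-count reranker), not the actual Unoplat Code Confluence implementation:

```python
# Illustrative sketch of the query workflow above; these functions are
# hypothetical placeholders, not part of the Unoplat Code Confluence API.

def index_code_files(files):
    # Step B: build a trivial inverted index of token -> file names.
    index = {}
    for name, text in files.items():
        for token in text.split():
            index.setdefault(token.lower(), set()).add(name)
    return index

def retrieve(index, query):
    # Steps C-D: collect candidate files matching any query token.
    hits = {}
    for token in query.lower().split():
        for name in index.get(token, ()):
            hits[name] = hits.get(name, 0) + 1
    return hits

def rerank(hits):
    # Step E: order candidates by match count (a stand-in for a real reranker).
    return sorted(hits, key=hits.get, reverse=True)

files = {
    "auth.py": "def login user password check",
    "db.py": "def connect database pool",
}
index = index_code_files(files)
results = rerank(retrieve(index, "user login"))
print(results)  # auth.py matches both query tokens, so it ranks first
```

The inefficiencies the diagram highlights tend to appear in the retrieve and rerank stages, where probabilistic matching can pull in irrelevant context.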
The Unoplat approach marks a significant shift from conventional AI-powered tools by opting for a deterministic method to manage and understand codebases. Here's an overview of how Unoplat proposes to resolve the inefficiencies of current AI-powered code assistance tools:
Here’s a visual representation using a Mermaid diagram to illustrate the Unoplat process:
```mermaid
graph TD
    A[Start] --> B[Language-Agnostic Parsing]
    B --> C[Generate Semi-Structured JSON]
    C --> D[Enhance Metadata]
    D --> E[Integrate with Dspy Pipelines]
    E --> F[Generate Enhanced Code Atlas]
    F --> G[End]
```
This diagram outlines the Unoplat process from the initial parsing of the codebase to the generation of an enhanced Code Atlas, highlighting the deterministic and structured approach to managing and understanding codebases.
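As a rough illustration of the "semi-structured JSON" and "enhance metadata" steps, a node for a single class might look like the following. The field names here are assumptions made for illustration, not the exact chapi/Unoplat schema:

```python
import json

# Hypothetical semi-structured node for one class; the real schema comes
# from chapi (https://chapi.phodal.com/) and may differ.
node = {
    "NodeName": "UserService",
    "Package": "com.example.service",
    "Functions": [
        {"Name": "createUser", "ReturnType": "User"},
        {"Name": "deleteUser", "ReturnType": "void"},
    ],
    # A field like this would be filled in by the metadata-enhancement
    # stage (step D) before the node feeds into DSPy pipelines.
    "Summary": "Manages user lifecycle operations.",
}

print(json.dumps(node, indent=2))
```

Because each node is deterministic structured data rather than an embedding, the downstream Code Atlas can be regenerated reproducibly from the same inputs.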
Local workspace on your computer from:
https://github.com/DataStax-Examples/spring-data-starter.git
Note: Python support is in alpha right now; see the Python-Improvements issue.
Python support is stable. We tested it on textgrad, the latest groundbreaking prompt optimiser, using a model whose training data does not extend beyond 2021, and got excellent results; please check the output below.
Local workspace on your computer from:
https://github.com/zou-group/textgrad
Refer to the class- and function-level summaries for Python, as it is in alpha right now; package- and codebase-level summaries are not yet up to the mark.
Local workspace on your computer from:
https://github.com/stanfordnlp/dspy/tree/main/dspy
```shell
pipx install git+https://github.com/unoplat/unoplat-code-confluence.git@v0.9.0#subdirectory=unoplat-code-confluence
```
```json
{
  "local_workspace_path": "your path to codebase",
  "output_path": "directory path for markdown output",
  "output_file_name": "name of markdown output (example - xyz.md)",
  "codebase_name": "name of your codebase",
  "programming_language": "programming language type (example - java or python)",
  "repo": {
    "download_url": "archguard/archguard",
    "download_directory": "download directory for archguard tool"
  },
  "api_tokens": {
    "github_token": "your github pat for downloading archguard"
  },
  "llm_provider_config": {
    "openai": {
      "api_key": "YourApiKey",
      "model": "gpt-3.5-turbo-16k",
      "model_type": "chat",
      "max_tokens": 1024,
      "temperature": 0.0
    }
  }
}
```
Configuration note: do not change `download_url`, and keep `programming_language` set to `java` or `python` (only Java and Python are supported right now).
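Before running the CLI, it can help to sanity-check that the config file has all the required top-level keys. This is a hypothetical helper for your own use, not something shipped with the tool:

```python
import json

# Top-level keys from the example config above.
REQUIRED_KEYS = {
    "local_workspace_path", "output_path", "output_file_name",
    "codebase_name", "programming_language", "repo",
    "api_tokens", "llm_provider_config",
}

def missing_config_keys(path):
    # Return the set of required top-level keys absent from the config file.
    with open(path) as fh:
        config = json.load(fh)
    return REQUIRED_KEYS - config.keys()

# Example: write a deliberately incomplete config and check it.
with open("example_config.json", "w") as fh:
    json.dump({"codebase_name": "demo", "programming_language": "python"}, fh)

print(sorted(missing_config_keys("example_config.json")))
```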
LLM provider config:
Model providers supported: `["openai", "together", "anyscale", "awsanthropic", "cohere", "ollama"]`
For the settings inside `llm_provider_config`, refer to the DSPy Model Provider Doc.
Use chat models; we have not tested instruct models yet.
If you are looking for credits, sign up on Together AI to get $25 for running Code Confluence on a repository of your choice. You can also use Ollama.
Together example:
```json
"llm_provider_config": {
  "together": {
    "api_key": "YourApiKey",
    "model": "zero-one-ai/Yi-34B-Chat"
  }
}
```
Ollama example:
```json
"llm_provider_config": {
  "ollama": {
    "model": "llama3"
  }
}
```
Note: we have only tried GPT-3.5 Turbo so far, and it works well on codebases. Results will also improve: all the DSPy modules are currently uncompiled, and we will roll out evaluated models and post-optimisation results soon. Until then, GPT-3.5 Turbo gives decent results.
```shell
unoplat-code-confluence --config example_config.json
```
Please let us know if issues other than the known limitations are hindering your adoption, and we will prioritise accordingly. Please also look through existing issues before raising a new one.
These are the people who made this work possible; Unoplat Code Confluence would not exist without them.
Book a call with us - Cal Link
UnoplatCodeConfluence Discord Channel