What if I want it to work on an existing codebase and not from scratch?

theyashwanthsai / Devyan

Building a Software Dev Team. Experimental Project to orchestrate a team of agents to solve programming tasks.

155 stars 34 forks

What if I want it to work on an existing codebase and not from scratch? #5

Open hemangjoshi37a opened 1 month ago

hemangjoshi37a commented 1 month ago

I want it to work on my existing project, which has multiple code files in nested folders, and to support multimodality with local models such as Ollama and LiteLLM.

theyashwanthsai commented 1 month ago

Yup, that is the next step. Basically "build your own Intern". It will take some time building tools and logic, but it sure is possible.

Right now Devyan is at an early stage where we generate applications from text prompts.

hemangjoshi37a commented 1 month ago

Please let me know how I can help in this direction. For example, you can suggest an edit to a file or something similar, and I will make the change and open a PR.

theyashwanthsai commented 1 month ago

I see.

I will let you know. I don't even have a rough idea of how to implement this myself yet, but I will let you know what can be done.

hemangjoshi37a commented 1 month ago

OK. Can we use GraphRAG (https://github.com/microsoft/graphrag) for context generation using vector search? Or how else can we feed initial (base) context into this system? If we can feed in base knowledge at the initial stage, then we can achieve this.
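To make the "feed base context" idea concrete, here is a minimal sketch of picking which files to put into the prompt for a given task. It does not use GraphRAG or real embeddings; a plain bag-of-words cosine similarity stands in for vector search, and the names `tokenize`, `cosine`, and `top_files` are hypothetical helpers, not part of Devyan.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and extract identifier-like words."""
    return re.findall(r"[a-z_][a-z0-9_]*", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_files(files, task, k=2):
    """Rank files by similarity to the task description.

    files: mapping of path -> file contents (hypothetical input shape).
    Returns up to k paths, most similar first, skipping files with no overlap.
    """
    query = Counter(tokenize(task))
    scored = [(cosine(query, Counter(tokenize(src))), path)
              for path, src in files.items()]
    scored.sort(reverse=True)
    return [path for score, path in scored[:k] if score > 0]
```

The selected files (or chunks of them) would then be prepended to the agent's prompt as base context. A real setup would likely swap the term counts for embeddings, but the selection step would keep the same shape.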

theyashwanthsai commented 1 month ago

We should also make tools that can edit a file instead of rewriting it. Editing within a file will be tricky for the agent.

hemangjoshi37a commented 1 month ago

It is simple. We can use a prompt something like this:

Always respond in unified diff format (as in `git diff`), with `-` and `+` at the beginning of each changed line and a few lines above and below for reference. Also give the line numbers where the edits are made.

Using this, we can parse the edits from the response, apply them to the existing code, and run git commit with a generated commit message.
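A minimal sketch of the parse-and-apply step, assuming the model returns a single hunk with `-`, `+`, and context lines and no `---`/`+++` file headers. The helper names `parse_hunk` and `apply_hunk` are hypothetical. The sketch locates the hunk by matching its context lines rather than trusting the line numbers in the `@@` header, on the assumption that model-reported line numbers can be off.

```python
def parse_hunk(hunk_text):
    """Split a diff-style hunk into the lines it expects and the lines it produces."""
    old, new = [], []
    for line in hunk_text.splitlines():
        if line.startswith("@@"):
            continue  # header with line numbers; ignored in this sketch
        if line.startswith("-"):
            old.append(line[1:])
        elif line.startswith("+"):
            new.append(line[1:])
        else:
            # Context line: drop the single leading space if the model kept it.
            text = line[1:] if line.startswith(" ") else line
            old.append(text)
            new.append(text)
    return old, new


def apply_hunk(source, hunk_text):
    """Return source with the first block matching the hunk's old lines replaced."""
    old, new = parse_hunk(hunk_text)
    lines = source.splitlines()
    for start in range(len(lines) - len(old) + 1):
        if lines[start:start + len(old)] == old:
            lines[start:start + len(old)] = new
            return "\n".join(lines) + "\n"
    raise ValueError("hunk does not match the file contents")
```

If the hunk does not match, the `ValueError` gives the agent a chance to retry instead of silently corrupting the file. After a successful apply, the result can be written back and committed with an ordinary `git commit -m "<generated message>"`.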

theyashwanthsai commented 1 month ago

I see. But the problem with that approach is that we might hit token limits if the context for one task is too large. We can definitely give this approach a try, though.

theyashwanthsai commented 1 month ago

We can do some tricks with that approach. It works perfectly.

hemangjoshi37a commented 1 month ago

If we can figure out what cursor.sh is doing, then we can do this. In case you don't know it, Cursor is an AI coding IDE based on VS Code. It is very good to use, but its one big problem is that it is closed source, and I don't know how it accesses information across different files. Maybe it uses some sophisticated RAG system specially designed for code.

theyashwanthsai commented 1 month ago

Yup. There's also another approach where we programmatically create code blocks (grouping snippets), and the operations are then done on these blocks. Not sure how good this idea would be in practice with respect to costs.
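One possible way to sketch the block idea for Python files, assuming a "block" means a top-level function or class: Python's standard `ast` module is used here to find block boundaries, which is an assumption and not something the thread settled on. `split_into_blocks` and `replace_block` are hypothetical names.

```python
import ast


def split_into_blocks(source):
    """Map each top-level function/class name to its (start, end) lines, 1-based inclusive."""
    blocks = {}
    for node in ast.parse(source).body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            # Include decorators, which sit above the def/class line.
            start = min([node.lineno] + [d.lineno for d in node.decorator_list])
            blocks[node.name] = (start, node.end_lineno)  # end_lineno needs Python 3.8+
    return blocks


def replace_block(source, name, new_code):
    """Swap one named block for new code, leaving the rest of the file untouched."""
    start, end = split_into_blocks(source)[name]
    lines = source.splitlines()
    lines[start - 1:end] = new_code.splitlines()
    return "\n".join(lines) + "\n"
```

With this shape, the agent only needs to see and regenerate the one block it is changing, which should keep prompt size (and therefore cost) tied to the block rather than to the whole file.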

hemangjoshi37a commented 1 month ago

Please review #6 for this.

jojogh commented 1 month ago

> Yup. There's also another approach where we programmatically create code blocks (grouping snippets), and the operations are then done on these blocks. Not sure how good this idea would be in practice with respect to costs.

Use an AST to find the codebase's dependencies and represent them to the LLM (as context in the prompt), and focus on the code snippets you want to update or write. That may solve the long-context token problem.
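As a rough illustration of the AST suggestion, the sketch below uses Python's standard `ast` module to reduce a file to its imports and top-level signatures, so the prompt can carry a compact map of the codebase instead of full file contents. The function name `summarize_module` and the output format are assumptions made for this example.

```python
import ast


def summarize_module(source):
    """Return the imports and top-level definitions of a Python source string."""
    imports, definitions = [], []
    for node in ast.parse(source).body:
        if isinstance(node, ast.Import):
            imports.extend(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            # Keep leading dots so relative imports stay recognizable.
            imports.append("." * node.level + (node.module or ""))
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            args = ", ".join(arg.arg for arg in node.args.args)
            definitions.append(f"def {node.name}({args})  # line {node.lineno}")
        elif isinstance(node, ast.ClassDef):
            definitions.append(f"class {node.name}  # line {node.lineno}")
    return {"imports": imports, "definitions": definitions}
```

Running this over every file gives a dependency outline that is far smaller than the code itself. The full body of a function would only be added to the prompt when that function is the one being updated.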