Open rdwj opened 2 weeks ago
Second this. For TypeScript projects this would be a game changer. Most of the time, if the project is well-structured, implementation details are not needed. Imagine a clean DDD backend API: generating full endpoints with that approach would be easy.
@rdwj @scriptify Thank you for your interesting proposal!
You raise a very good point. Currently, Repomix struggles with large-scale projects, mainly due to token limits. Token reduction techniques like this could be a great way to address this limitation.
For implementing this feature, I'm thinking about two possible approaches:
1. Start with a TypeScript-specific implementation as an experiment
2. Use a language-agnostic approach with tools like Tree-sitter
However, I'm still exploring the best way to implement this effectively.
Please share your thoughts and experiences! Together we can find a good approach to make Repomix more useful for large projects.
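To make the trade-off between the two approaches concrete, here is a minimal sketch of how reducers could be selected per file. It is only an illustration under stated assumptions: `REDUCERS`, `register`, `reduce_file`, and `strip_blank_lines` are hypothetical names, not part of Repomix, and the stand-in reducer just drops blank lines rather than parsing anything.

```python
from pathlib import Path
from typing import Callable, Dict

# A reducer takes file contents and returns a smaller version of them.
Reducer = Callable[[str], str]

# Hypothetical registry mapping a file extension to its reducer.
REDUCERS: Dict[str, Reducer] = {}


def register(*extensions: str) -> Callable[[Reducer], Reducer]:
    """Register a reducer for one or more file extensions."""
    def decorator(func: Reducer) -> Reducer:
        for ext in extensions:
            REDUCERS[ext] = func
        return func
    return decorator


@register(".ts", ".tsx")
def strip_blank_lines(source: str) -> str:
    # Stand-in only: a real TypeScript reducer would parse the file
    # and keep signatures and doc comments instead.
    return "\n".join(line for line in source.splitlines() if line.strip())


def reduce_file(path: str, source: str) -> str:
    """Apply the registered reducer, or return the source unchanged."""
    reducer = REDUCERS.get(Path(path).suffix.lower())
    return reducer(source) if reducer else source
```

With a language-specific approach, each new language adds another registered function to maintain. With a parser-based, language-agnostic approach, one reducer could in principle be registered for many extensions at once.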
I agree - TypeScript is a great place to start.
It would be a good idea to have a recommended set of content for the repomix-instruction.md to guide the LLM in including (and preserving) great documentation each time.
I'll work on a Python context reducer for this and maybe open a PR if I get it to work locally.
On Sat, Nov 16, 2024 at 8:42 AM Kazuki Yamada wrote:
@rdwj @scriptify Thank you for your interesting proposal!
You raise a very good point. Currently, Repomix struggles with large-scale projects, mainly due to token limits. Token reduction techniques like this could be a great way to address this limitation.
For implementing this feature, I'm thinking about two possible approaches:
1. Start with a TypeScript-specific implementation as an experiment
   - Good for validating the concept
   - But could lead to maintenance challenges if we add more languages
2. Use a language-agnostic approach with tools like Tree-sitter
   - More complex initial implementation
   - But provides unified parsing across many languages
   - More sustainable in the long term
However, I'm still exploring the best way to implement this effectively.
Please share your thoughts and experiences! Together we can find a good approach to make Repomix more useful for large projects.
That's great! 👍
Thank you for offering to work on a Python reducer! Having multiple language implementations will help us understand what works best.
One concern is that adding Python as a dependency could make installation more challenging. But we could make it an optional feature. That said, we seem to have quite a few Python users already, so this might be less of an issue than I initially thought.
If your project is sufficiently well documented, with a block of text describing the intent, inputs, and example outputs for each function, would it be useful to have an option to include only the method signatures and associated documentation, and not the full code?
This might reduce the context a bit and still give the LLM what it actually needs to write the new item you are asking it for.