yamadashy / repomix

📦 Repomix (formerly Repopack) is a powerful tool that packs your entire repository into a single, AI-friendly file. Perfect for when you need to feed your codebase to Large Language Models (LLMs) or other AI tools like Claude, ChatGPT, and Gemini.
MIT License

can't pack a repo due to presence of special token: <|endoftext|> #89

Closed. Shubxam closed this issue 1 month ago.

Shubxam commented 1 month ago

I'm working with a repo that contains a Hugging Face (HF) model. I tried ignoring the folders that might cause the problem, but I'm still unable to pack it.

[Screenshot: 2024-09-28 at 19:50:58]

yamadashy commented 1 month ago

Thank you for reporting this issue, @Shubxam!

It seems the error is indeed related to token counting, specifically to the special token <|endoftext|>, which is used by some NLP models.

I've addressed this in our latest release, version 0.1.39. In this update, I've changed how Repopack handles token counting errors:

https://github.com/yamadashy/repopack/releases/tag/v0.1.39

This should allow Repopack to complete the packing process even when encountering these special tokens.
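
For illustration only, here is a minimal sketch of the general idea: a token-counting failure is caught and reported instead of aborting the whole run. This is not the actual Repopack code. `strict_count_tokens` is a hypothetical stand-in for a real tokenizer (it just splits on whitespace), and `safe_count_tokens`, `SpecialTokenError`, and the sample file contents are made-up names and data.

```python
SPECIAL_TOKENS = ("<|endoftext|>",)  # assumption: tokens the counter rejects


class SpecialTokenError(ValueError):
    """Raised by the stand-in counter when text contains a special token."""


def strict_count_tokens(text: str) -> int:
    """Hypothetical stand-in for a real tokenizer.

    Splits on whitespace and refuses text containing a special token.
    """
    for token in SPECIAL_TOKENS:
        if token in text:
            raise SpecialTokenError(f"disallowed special token: {token}")
    return len(text.split())


def safe_count_tokens(text: str, path: str = "<unknown>") -> int:
    """Count tokens, but never let a counting failure abort packing."""
    try:
        return strict_count_tokens(text)
    except SpecialTokenError as err:
        # Report the problem and fall back to 0 so the file is still packed.
        print(f"warning: could not count tokens for {path}: {err}")
        return 0


# Made-up file contents: the second one mimics a model config file.
files = {
    "README.md": "a tool that packs a repository",
    "tokenizer_config.json": '{"eos_token": "<|endoftext|>"}',
}
counts = {path: safe_count_tokens(body, path) for path, body in files.items()}
total = sum(counts.values())
```

With a fallback like this, a file that contains `<|endoftext|>` contributes 0 to the total rather than stopping the run. The trade-off is that the reported token count becomes an underestimate for such files.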

yamadashy commented 1 month ago

@Shubxam Hi there! Just checking in about the special token issue. We've released v0.1.39 which should address this problem. Could you please try it out and let us know if it resolves your issue? Thanks!

Shubxam commented 1 month ago

Yes, I can confirm that the issue has been fixed and I can pack the repo without any errors. Thanks, @yamadashy!

yamadashy commented 1 month ago

@Shubxam Thank you for confirming! I'm glad to hear that the issue has been resolved.

I'll go ahead and close this issue now. If you encounter any other issues or have any suggestions in the future, please don't hesitate to open a new issue.

Thanks again for your contribution!