Open darxkies opened 2 weeks ago
Good evening!
Thanks for the feedback. It is very welcome. Regarding making the code more idiomatic in Rust, I pushed a commit aimed at that (Clippy should not give any warnings now).
This project started as a weekend project and was initially aimed at supporting Gemma models, but I decided to keep adding features and support more models. Right now the project has one main goal: to be an alternative to similar projects that rely on very complex libraries with huge codebases, so that the inner workings of large language models are easy to understand. I agree with you that variable and function names could be improved for better readability; I will change that when possible.
Now, what's next? Well, I don't know; I'm motivated more by challenge than anything else. If a model comes out that catches my interest (particularly a small one), I will probably implement it. But for that, the code needs to be more versatile and less hard-coded in some places.
I would love to support Phi 3.5, especially the vision one, that would be a good challenge.
Other samplers could be implemented. Do you have any specific ones in mind?
Thanks again.
A while ago I came across this crate that you might be interested in: https://crates.io/crates/llm-samplers
After refactoring some of the project's code, I implemented this, which was fun to play with: https://github.com/sam-paech/antislop-sampler/blob/main/antislop_sampler.ipynb
Next, I might implement DRY as a proof of concept, or even XTC.
The overhead of your project is so low that it can easily be used to test new ideas.
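For anyone following along who has not met XTC ("Exclude Top Choices") yet, here is a rough sketch of how I understand the idea, not code from this project: on a fraction of sampling steps, every token whose probability is at or above a threshold is removed except the least likely of them, and the rest is renormalized. The function name `xtc_filter`, its parameters, and the choice to pass the random draw in from the caller (to keep the sketch free of external crates) are all my own assumptions.

```rust
/// Hypothetical XTC-style filter over a probability distribution.
/// `probs[i]` is the probability of token `i` (already softmaxed).
/// `draw` is a uniform random number in [0, 1) supplied by the caller.
fn xtc_filter(probs: &mut [f32], threshold: f32, xtc_probability: f32, draw: f32) {
    // Only apply the filter on a fraction of sampling steps.
    if draw >= xtc_probability {
        return;
    }

    // Indices of all tokens at or above the threshold.
    let mut above: Vec<usize> = (0..probs.len())
        .filter(|&i| probs[i] >= threshold)
        .collect();

    // With fewer than two such tokens there is no top choice to exclude.
    if above.len() < 2 {
        return;
    }

    // Sort by descending probability so the last entry is the least
    // probable token that still meets the threshold.
    above.sort_by(|&a, &b| probs[b].total_cmp(&probs[a]));

    // Remove every candidate except that last one.
    for &i in &above[..above.len() - 1] {
        probs[i] = 0.0;
    }

    // Renormalize so the remaining probabilities sum to 1 again.
    let sum: f32 = probs.iter().sum();
    if sum > 0.0 {
        for p in probs.iter_mut() {
            *p /= sum;
        }
    }
}
```

Calling it with a distribution of `[0.5, 0.3, 0.15, 0.05]`, a threshold of `0.1`, and a draw below `xtc_probability` zeroes the first two tokens and leaves the last two to be sampled from; with a draw at or above `xtc_probability` the distribution is left untouched.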
First of all, it is a great project. The code is small, clean, and has a lot of potential. It is a great starting point for anyone who knows Rust and wants to learn the inner workings of LLMs. However, like other projects of the same size in other programming languages, it uses a lot of acronyms, which makes the code hard to read. It also triggers a lot of Clippy warnings, which brings me to the first question: what is the goal of the project?
If you want to make it more beginner-friendly, here are some suggestions for improving the code:
I would provide a pull request to fix the above issues if you are interested.
Another question is: