-
It would be very useful to add open-source models like Llama 3.
-
What are the memory footprints (in GB) of the following models, and what hardware specifications are required to run them?
- Llama-3.1-8B
- Llama-3.1-70B
- Llama-3.1-405B
- Llama-3-8B
- Llama-3-70B
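As a rough rule of thumb (a back-of-the-envelope estimate, not an official spec), weight memory is roughly parameter count × bytes per parameter; the KV cache, activations, and framework overhead add more on top. A minimal sketch:

```python
# Back-of-the-envelope weight memory: parameter count x bytes per parameter.
# These are rough estimates, not official hardware requirements.
BYTES_PER_PARAM = {"fp16/bf16": 2.0, "int8": 1.0, "int4": 0.5}

MODELS_B = {  # parameter counts, in billions
    "Llama-3-8B / Llama-3.1-8B": 8,
    "Llama-3-70B / Llama-3.1-70B": 70,
    "Llama-3.1-405B": 405,
}

for name, params_b in MODELS_B.items():
    estimates = ", ".join(
        f"{dtype}: ~{params_b * b:.0f} GB" for dtype, b in BYTES_PER_PARAM.items()
    )
    print(f"{name} -> {estimates}")
```

On that basis, the 8B fits on a single 24 GB GPU in bf16 (~16 GB of weights), the 70B needs multiple GPUs or quantization (~140 GB of weights in bf16), and the 405B is multi-node territory in bf16 (~810 GB of weights), all before counting the KV cache.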
-
Please add support for the latest Meta models:
https://ai.meta.com/blog/meta-llama-3-1/
-
Going to give this a week to settle; there are always bugs when quants first land.
-
I've been waiting for a while, haha..
Today the Llama 3.1 405B model was officially announced: https://llama.meta.com/. Do we have any plan to support this soon? I assume we need to adjust Eagle mod…
-
### Checklist
- [x] 1. I have searched related issues but cannot get the expected help.
- [ ] 2. The bug has not been fixed in the latest version.
- [ ] 3. Please note that if the bug-related issue y…
-
Hello again: if you plan on supporting Llama 3.1, please note that it requires a new category of RoPE scaling. Thanks!
https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct/blob/main/config.j…
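For anyone implementing this, here is a minimal sketch of the frequency adjustment that the new `rope_type: "llama3"` config implies, adapted from Meta's reference `apply_scaling`; the default parameter values mirror the fields in the 3.1 config.json (`factor`, `low_freq_factor`, `high_freq_factor`, `original_max_position_embeddings`):

```python
import math

def llama3_scale_inv_freq(inv_freq, factor=8.0, low_freq_factor=1.0,
                          high_freq_factor=4.0,
                          original_max_position_embeddings=8192):
    """Rescale RoPE inverse frequencies per rope_type="llama3":
    low-frequency components are divided by `factor`, high-frequency
    components are left unchanged, and the band in between is smoothly
    interpolated."""
    low_freq_wavelen = original_max_position_embeddings / low_freq_factor
    high_freq_wavelen = original_max_position_embeddings / high_freq_factor
    scaled = []
    for f in inv_freq:
        wavelen = 2 * math.pi / f
        if wavelen < high_freq_wavelen:      # high frequency: keep as-is
            scaled.append(f)
        elif wavelen > low_freq_wavelen:     # low frequency: fully scaled
            scaled.append(f / factor)
        else:                                # medium band: interpolate
            smooth = (original_max_position_embeddings / wavelen
                      - low_freq_factor) / (high_freq_factor - low_freq_factor)
            scaled.append((1 - smooth) * f / factor + smooth * f)
    return scaled

# Example: standard RoPE inverse frequencies for base 500000, head_dim 128.
head_dim, base = 128, 500000.0
inv_freq = [base ** (-i / head_dim) for i in range(0, head_dim, 2)]
print(llama3_scale_inv_freq(inv_freq)[:4])
```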
-
Given that we have only Llama 3 70B and 8B, it would be useful to have a tiny Llama based on the Llama 3 tokenizer, so that we can use it as a draft model for speculative decoding.
Are there pla…
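For context, a minimal sketch of why the shared tokenizer matters: greedy speculative decoding verifies the draft model's token IDs directly against the target model's predictions, so the two vocabularies must match. `draft` and `target` below are hypothetical stand-ins for real models, each mapping a token-ID prefix to its greedy next token.

```python
def speculative_step(prefix, draft, target, k=4):
    """One round of greedy speculative decoding (sketch).

    `draft` and `target` are hypothetical callables:
    list of token IDs -> greedy next token ID. Because both operate on
    the same token IDs, they must share a tokenizer/vocabulary.
    """
    # 1. The small draft model proposes k tokens autoregressively.
    ctx = list(prefix)
    drafted = []
    for _ in range(k):
        tok = draft(ctx)
        drafted.append(tok)
        ctx.append(tok)

    # 2. The target model verifies the proposals; in a real engine this
    #    is a single batched forward pass over all k drafted positions.
    accepted = []
    ctx = list(prefix)
    for tok in drafted:
        t = target(ctx)
        if t != tok:          # first mismatch: take the target's token, stop
            accepted.append(t)
            break
        accepted.append(tok)  # match: the drafted token is accepted for free
        ctx.append(tok)
    return accepted
```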
-
Tried to make an EXL2 of it.
I added the fix to the `inv_freq` scaling that is apparently expected in these models, making the following change in `model.py` (see https://huggingface.co/v2ray/Llama…