-
That'd be even more resource efficient. Thanks!
-
Hello, I am trying to convert the TinyLlama 1.1B model (checkpoint name PY007/TinyLlama-1.1B-step-50K-105b), but I am getting some sort of shape mismatch error. Could you kindly look into this, @guillaumekln?
Er…
-
Thank you for your nice work!
I calculated the batch size using the equation from [the OpenAI scaling-laws paper](https://arxiv.org/abs/2001.08361), which comes out to 12M tokens if I want to achieve a loss of ~1.8. But I found all paper…
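For reference, the 12M-token figure can be reproduced from the critical-batch-size formula in that paper, B_crit(L) = B* / L^(1/α_B), using the paper's approximate fitted constants (B* ≈ 2×10⁸ tokens, α_B ≈ 0.21); a minimal sketch:

```python
# Critical batch size from Kaplan et al. (2020):
#   B_crit(L) = B* / L**(1 / alpha_B)
# Constants are the paper's approximate fitted values.
B_STAR = 2e8    # tokens
ALPHA_B = 0.21

def critical_batch_size(loss: float) -> float:
    """Tokens per batch at the compute/time trade-off frontier."""
    return B_STAR / loss ** (1 / ALPHA_B)

print(f"{critical_batch_size(1.8) / 1e6:.1f}M tokens")  # roughly 12M at L = 1.8
```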
-
### System Info
```Shell
- `Accelerate` version: 0.26.1
- Platform: Linux-5.15.0-91-generic-x86_64-with-glibc2.35
- Python version: 3.11.5
- Numpy version: 1.26.3
- PyTorch version (GPU?): 2.1.2 …
```
-
I have enabled llama2.c to run the TinyLlama 1.1B chat model in my [repo](https://github.com/magician-blue/llama2.c).
Read [Tiny Llama 1.1B model](https://github.com/magician-blue/llama2.c#tiny-llama-11b-…
-
Hi all,
Thanks for your great work.
I am wondering about the training subset of this chinchilla-optimal model. -> "This speed lets you train a chinchilla-optimal model (1.1B parameters, 22B tokens) on 8 A100s in 32 hours."
Is this part from sl…
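The 22B-token figure quoted above matches the Chinchilla rule of thumb of roughly 20 training tokens per parameter (Hoffmann et al., approximate); a quick check:

```python
params = 1.1e9          # TinyLlama parameter count
tokens_per_param = 20   # approximate Chinchilla-optimal ratio
optimal_tokens = params * tokens_per_param
print(f"{optimal_tokens / 1e9:.0f}B tokens")  # 22B tokens
```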
-
Is this line correct?
```
sample['prompt'] = [tokenizer.apply_chat_template([{'role': 'user', 'content': item[0]}], tokenize=False, add_generation_prompt=True) for item in sample['chosen']]
```
…
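The comprehension looks right for a DPO-style dataset where each `chosen` entry is a (prompt, response) pair. A minimal offline sketch, using a hypothetical stub in place of the real `tokenizer.apply_chat_template` (which renders the tokenizer's Jinja chat template), shows what it produces:

```python
# Stub standing in for tokenizer.apply_chat_template (hypothetical template
# format; the real output depends on the model's chat template).
def apply_chat_template(messages, tokenize=False, add_generation_prompt=True):
    out = "".join(f"<|{m['role']}|>\n{m['content']}\n" for m in messages)
    if add_generation_prompt:
        out += "<|assistant|>\n"  # open the assistant turn for generation
    return out

# DPO-style sample: each 'chosen' entry is a (prompt, response) pair.
sample = {"chosen": [("What is 2+2?", "4"), ("Name a color.", "Blue")]}

# The line from the question: item[0] is the user prompt of each pair.
sample["prompt"] = [
    apply_chat_template([{"role": "user", "content": item[0]}],
                        tokenize=False, add_generation_prompt=True)
    for item in sample["chosen"]
]
print(sample["prompt"][0])
```

Each prompt is wrapped as a single user turn with the generation prompt appended, which is the usual shape for DPO prompt columns.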
-
Windows 11, 24-core/32-thread CPU (Nov 2023, 6 GHz), 64 GB RAM, NVIDIA GeForce RTX 4060 Ti (16 GB), llama.cpp build of Mar 31 2024.
I have noticed some anomalies after testing close…
-
### Describe the issue as clearly as possible:
All of the generation examples given on the front page of the repo raise the same error:
`RuntimeError: Index put requires the source and des…
-
In TinyLlama, the dataset is a combination of SlimPajama and StarCoderData; the total is around 950B tokens.
My question is: what is the meaning of `Sampled all code from Starcoderdata`? I…