-
Hi,
Any help or guidance on how to add a special EOT token, as described in the [LIMA](https://arxiv.org/abs/2305.11206) paper by Meta?
More specifically, in Section 3, Training LIMA, they d…
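As a starting point, here is a minimal sketch of the LIMA-style idea: a special end-of-turn token is appended after every utterance so the model can tell speakers apart. The token string `<EOT>` and the `format_dialogue` helper are illustrative assumptions, not code from the paper.

```python
# Illustrative sketch: append a special EOT marker after each utterance,
# as LIMA does to separate speaker turns. "<EOT>" is an assumed token string.
EOT = "<EOT>"

def format_dialogue(turns):
    """Join (speaker, text) turns into one training string,
    appending EOT after every utterance."""
    return "".join(f"{speaker}: {text}{EOT}" for speaker, text in turns)

example = format_dialogue([("User", "Hi"), ("Assistant", "Hello!")])
print(example)  # User: Hi<EOT>Assistant: Hello!<EOT>
```

With Hugging Face `transformers`, the token itself is usually registered with something like `tokenizer.add_special_tokens({"additional_special_tokens": ["<EOT>"]})` followed by `model.resize_token_embeddings(len(tokenizer))`, so the new token gets a fresh embedding row; the exact recipe may differ from what the LIMA authors did.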
-
Hi, thanks for your great work! As reported in your paper, the memory requirement for Llama-7B at 4 bits is 7 GB. However, on a single 4090, when I run `bash examples/e2e_qp/Llama-2-7b/w4g-1-redpajama.s…
-
# A Survey of Large Language Models
https://docs.google.com/presentation/d/178Nk6flxqS59E2J9SSj5ZhyZ8YD_3tWJ/edit?usp=sharing&ouid=106680143492150558607&rtpof=true&sd=true
A text covering an overview of LLMs.
It touches on how LoRA works, and I'd like to learn more about it, so…
-
The job that ran when PR #36 was merged somehow failed to add the new models that were part of that PR.
https://github.com/lfai/model_openness_tool/actions/runs/11442783083/job/31833926598
…
-
Got this while running from the main branch in Podman AI Lab:
```
llama_model_loader: loaded meta data with 25 key-value pairs and 291 tensors from /granite-7b-lab-Q4_K_M.gguf (version GGUF V3 (lates…
-
With all the variants of ML models out now - gpt2/gptneox/llama/gptj - I wonder if there's a way to infer a model's type by reading it?...
Right now, if someone gives me a random model file with ob…
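A couple of cheap heuristics do exist, sketched below under stated assumptions: a Hugging Face checkpoint directory carries a `config.json` with a `"model_type"` field (e.g. `"gpt2"`, `"llama"`), and a GGUF file starts with the 4-byte magic `b"GGUF"` (the architecture itself then lives in the GGUF metadata key-value pairs). The `identify_model` helper is illustrative and by no means exhaustive.

```python
import json
import tempfile
from pathlib import Path

def identify_model(path):
    """Best-effort guess at a model's type. Heuristics only:
    - HF checkpoint dir: read "model_type" from config.json
    - single file: check for the GGUF magic bytes
    """
    p = Path(path)
    if p.is_dir():
        cfg = p / "config.json"
        if cfg.exists():
            return json.loads(cfg.read_text()).get("model_type", "unknown")
        return "unknown"
    with open(p, "rb") as f:
        magic = f.read(4)
    # GGUF files begin with the ASCII magic "GGUF"; older GGML formats differ.
    return "gguf" if magic == b"GGUF" else "unknown"

# Demo on a synthetic file:
with tempfile.TemporaryDirectory() as d:
    fake = Path(d) / "model.gguf"
    fake.write_bytes(b"GGUF" + b"\x00" * 8)
    print(identify_model(fake))  # gguf
```

For GGUF specifically, the general architecture (llama, gptneox, …) can be read from the `general.architecture` metadata key once you parse the header, but that requires a real GGUF reader rather than this magic-byte check.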
-
Hi! Thank you for releasing the code.
In the [paper](https://arxiv.org/pdf/2409.04431) you report training Llama2 recipe on 300M tokens of RedPajama dataset. However, in your code I only found exampl…
-
### News
- Conferences
  - [MSRA to Vancouver, Canada?](https://n.news.naver.com/mnews/article/014/0005025017?sid=101)
- [Microsoft unveils Azure OpenAI Service for government & AI customer commitments](https:…
-
Are there any plans to train a 30b replica of Llama or is the 7b enough to meet your purposes of comparison?
-
### Describe the bug
Mapped tokenization slows down substantially towards the end of the dataset.
The train set started off very slow, caught up around 20k, then tapered off until the end.
what's particularly s…