-
Hi,
Any help or guidance on how to add a special EOT token, as described in the [LIMA](https://arxiv.org/abs/2305.11206) paper by Meta?
More specifically, in Section 3, Training LIMA, they d…
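As a starting point, here is a minimal sketch of the LIMA-style idea: a special end-of-turn token is appended after every utterance so the model can tell speakers apart. The token string `<EOT>` and the `format_dialogue` helper are illustrative assumptions, not code from the paper.

```python
# Illustrative sketch: append a special EOT marker after each utterance,
# as LIMA does to separate speaker turns. "<EOT>" is an assumed token string.
EOT = "<EOT>"

def format_dialogue(turns):
    """Join (speaker, text) turns into one training string,
    appending EOT after every utterance."""
    return "".join(f"{speaker}: {text}{EOT}" for speaker, text in turns)

example = format_dialogue([("User", "Hi"), ("Assistant", "Hello!")])
print(example)  # User: Hi<EOT>Assistant: Hello!<EOT>
```

With Hugging Face `transformers`, the token itself is usually registered with something like `tokenizer.add_special_tokens({"additional_special_tokens": ["<EOT>"]})` followed by `model.resize_token_embeddings(len(tokenizer))`, so the new token gets a fresh embedding row; the exact recipe may differ from what the LIMA authors did.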
-
Hi, thanks for your great work! As reported in your paper, the memory requirement for Llama-7B at 4 bits is 7 GB. However, on a single 4090, when I run `bash examples/e2e_qp/Llama-2-7b/w4g-1-redpajama.s…
-
# A Survey of Large Language Models
https://docs.google.com/presentation/d/178Nk6flxqS59E2J9SSj5ZhyZ8YD_3tWJ/edit?usp=sharing&ouid=106680143492150558607&rtpof=true&sd=true
A text covering an overview of LLMs.
It touches on how LoRA works, and I'd like to learn more about it, so…
-
The job that ran when PR #36 was merged somehow failed to add the new models that were part of that PR.
https://github.com/lfai/model_openness_tool/actions/runs/11442783083/job/31833926598
…
-
Got this while running from the main branch in Podman AI Lab:
```
llama_model_loader: loaded meta data with 25 key-value pairs and 291 tensors from /granite-7b-lab-Q4_K_M.gguf (version GGUF V3 (lates…
-
With all the variants of ML models out now - gpt2/gptneox/llama/gptj - I wonder if there's a way to infer a model's type by reading it?...
Right now, if someone gives me a random model file with ob…
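A couple of cheap heuristics do exist, sketched below under stated assumptions: a Hugging Face checkpoint directory carries a `config.json` with a `"model_type"` field (e.g. `"gpt2"`, `"llama"`), and a GGUF file starts with the 4-byte magic `b"GGUF"` (the architecture itself then lives in the GGUF metadata key-value pairs). The `identify_model` helper is illustrative and by no means exhaustive.

```python
import json
import tempfile
from pathlib import Path

def identify_model(path):
    """Best-effort guess at a model's type. Heuristics only:
    - HF checkpoint dir: read "model_type" from config.json
    - single file: check for the GGUF magic bytes
    """
    p = Path(path)
    if p.is_dir():
        cfg = p / "config.json"
        if cfg.exists():
            return json.loads(cfg.read_text()).get("model_type", "unknown")
        return "unknown"
    with open(p, "rb") as f:
        magic = f.read(4)
    # GGUF files begin with the ASCII magic "GGUF"; older GGML formats differ.
    return "gguf" if magic == b"GGUF" else "unknown"

# Demo on a synthetic file:
with tempfile.TemporaryDirectory() as d:
    fake = Path(d) / "model.gguf"
    fake.write_bytes(b"GGUF" + b"\x00" * 8)
    print(identify_model(fake))  # gguf
```

For GGUF specifically, the general architecture (llama, gptneox, …) can be read from the `general.architecture` metadata key once you parse the header, but that requires a real GGUF reader rather than this magic-byte check.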
-
Hi! Thank you for releasing the code.
In the [paper](https://arxiv.org/pdf/2409.04431) you report training Llama2 recipe on 300M tokens of RedPajama dataset. However, in your code I only found exampl…
-
### News
- Conferences
  - [MSRA to Vancouver, Canada?](https://n.news.naver.com/mnews/article/014/0005025017?sid=101)
- [Microsoft unveils Azure OpenAI Service for government & AI customer commitments](https:…
-
Are there any plans to train a 30b replica of Llama or is the 7b enough to meet your purposes of comparison?
-
### Describe the bug
Mapped tokenization slows down substantially towards the end of the dataset.
The train set started off very slow, caught up around 20k, then tapered off until the end.
what's particularly s…