jzhang38/TinyLlama
The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.
Apache License 2.0
7.29k stars
423 forks
issues
For pretraining, does it include Block Causal Attention and Block Diagonal Mask?
#192
Leo-T-Zang
opened
2 days ago
0
lightning run model
#191
icemoon-creative
opened
1 week ago
1
Are there any reported results for the chat model on other benchmarks, such as MT-Bench?
#190
Luoqiu76
opened
3 weeks ago
0
Clarify Chinese support or not on README
#189
fangzhangmnm
closed
3 weeks ago
1
model.py
#188
daxian-lh
opened
1 month ago
1
Model structure
#187
daxian-lh
closed
3 weeks ago
1
Llama 3
#186
cduk
opened
1 month ago
1
Training Run - New Tokenizer
#185
dustinwloring1988
opened
1 month ago
1
On which will it run better
#184
Sridharprakash
opened
1 month ago
0
Would it be possible to provide help with evaluation?
#183
secretyjc
opened
1 month ago
0
AWS A100 example
#182
A-Kokolis
closed
2 months ago
0
Where is the pretraining example of llama-1.1b-chat
#181
aritralegndery
opened
2 months ago
0
A potential bug in multi-GPU training
#180
zyushun
closed
1 month ago
1
Encountered an issue while loading the model using transformers
#179
Yukang-Lin
opened
2 months ago
1
You are welcome to publish the model and code on the wisemodel.cn open-source community
#178
LiuDQ-wm
opened
2 months ago
0
The results under the FastChat framework are quite bizarre?
#177
Felixvillas
opened
2 months ago
0
More intermediate checkpoints in < 240k steps
#176
MaveriQ
opened
2 months ago
0
Is there any simple demo of fine-tuning TinyLlama
#175
Bill-Cai
closed
2 months ago
4
On the visualization of Wandb in fine-tuning
#174
mli-tian
closed
2 months ago
0
Why FSDP, not DDP?
#173
noforit
opened
3 months ago
0
A question on learning rate decay schedule
#172
zyushun
closed
3 months ago
1
Pretraining failing on IndexError: list index out of range in file packed_dataset.py
#171
databillm
closed
3 months ago
1
Resolve circular dependency and import issues
#170
keeeeenw
closed
3 months ago
0
Should this line use args.seed instead of seed=42?
#169
brynhayder
opened
3 months ago
0
Help me pls
#168
aritralegndery
opened
3 months ago
2
Encountered an issue while loading the model using transformers
#167
luiluizi
opened
3 months ago
1
Please provide prompt guide for tinyllama-1.1b-chat-v1.0 ?
#166
anuragvohraec
closed
3 months ago
1
Reference for pretraining other small language models
#165
kmn1024
opened
3 months ago
1
revise continue train from initial_iter
#164
peiji1981
opened
4 months ago
1
Pre training Continue from TinyLlama-1.1B-intermediate-step-1431k-3T
#163
dshah-inspird-dev
closed
3 months ago
1
revise dataloader for continue training
#162
peiji1981
closed
4 months ago
1
The pre-training process crashed after a few iterations
#161
Nero0113
closed
4 months ago
1
How to compute metrics like ROUGE, BLEU in sft script?
#160
dopu2k16
opened
4 months ago
1
proper way to pad prompts
#159
HassanJbara
closed
3 months ago
2
Is there any function calling model for tinyllama?
#158
atregret
opened
4 months ago
1
Added a gradio demo
#157
smukherjee1do
closed
4 months ago
3
Added an app.py file
#156
smukherjee1do
closed
4 months ago
2
The roadmap has been sitting for a while.
#155
bethelangela
opened
4 months ago
1
Hi, how can I finetune tinyllama with a custom dataset as follows?
#154
oliverbob
opened
4 months ago
2
fp16 finetuning results in loss=0
#153
sankexin
opened
4 months ago
0
Activation Checkpointing
#152
syncdoth
closed
4 months ago
2
TPU Pretraining
#151
kathir-ks
closed
3 days ago
0
Update README_CN
#150
koalazf99
closed
5 months ago
1
Convert weights to original llama weights.
#149
PSanni
closed
4 months ago
2
Hello, is this tokenizer using LLaMA’s Tokenizer or did you train it yourself?
#148
chenhk-chn
closed
4 months ago
1
about the speed
#147
wangyi-fudan
opened
5 months ago
1
Taking a few days to complete SlimPajama "Train" data
#146
Ahmedhasssan
closed
4 months ago
2
[Question] Is pre-training with FP32 possible?
#145
veritas9872
closed
4 months ago
2
Can checkpoints in the lit_gpt configuration format be open sourced?
#144
haiduo
closed
5 months ago
0
Unable to pretrain: tokenizer raises NotImplementedError
#143
zxti
closed
4 months ago
3