jzhang38 / TinyLlama
The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.
Apache License 2.0 · 7.71k stars · 453 forks
Issues
#48 · How to finetune on custom dataset · hrsmanian · closed 11 months ago · 11 comments
#47 · Why train three epochs? not one epoch? · PeiqinSun · closed 1 year ago · 1 comment
#46 · Any plans for the ONNX runtime? · VatsaDev · closed 1 year ago · 3 comments
#45 · Update README_zh-CN.md · ChaosCodes · closed 1 year ago · 0 comments
#44 · Downloading the dataset is inconvenient · scorpjr1 · closed 1 year ago · 3 comments
#43 · Update prepare_slimpajama.py · michael-c-max · closed 8 months ago · 0 comments
#42 · How to compute token numbers for a dataset? · Arcmoon-Hu · closed 11 months ago · 4 comments
#41 · Why is the vocab size of `TinyLlama-1.1B-Chat-V0.1` 32001? · Chillee · closed 1 year ago · 5 comments
#40 · Question Regarding the Absence of BOS and EOS Tokens in Tokenizer Encoding · dtxwhzw · closed 1 year ago · 3 comments
#39 · Release format + artefact · PierreColombo · closed 1 year ago · 3 comments
#38 · Update EVAL.md · jzhang38 · closed 1 year ago · 0 comments
#37 · Why is Swiglu packed_weights = False? · larrylawl · closed 1 year ago · 1 comment
#36 · Resuming training · artnoage · opened 1 year ago · 8 comments
#35 · TinyLlama-1.1B-orca-gpt4 · acalatrava · closed 1 year ago · 1 comment
#34 · info when load model · shyoulala · closed 1 year ago · 3 comments
#33 · How did you determine the size of the TinyLlama model? · dtxwhzw · closed 1 year ago · 2 comments
#32 · eval loss become nan after a single batch · ThibaultCastells · closed 1 year ago · 4 comments
#31 · Request: Finetune the Model on more Data? · VatsaDev · closed 1 year ago · 1 comment
#30 · Working Chat Demo · VatsaDev · closed 8 months ago · 8 comments
#29 · TinyLlama-chat outputs truncated/small? · VatsaDev · closed 1 year ago · 1 comment
#28 · Fix the finetune directory link in README.md · VatsaDev · closed 1 year ago · 1 comment
#27 · Minimum learning rate · artnoage · closed 1 year ago · 1 comment
#26 · Model mirror within China · Ma-Yongqiang · closed 1 year ago · 9 comments
#25 · Why does a dimension mismatch occur when I use AutoModelForCausalLM to load a model? · BaenRH · closed 1 year ago · 2 comments
#24 · Getting gibberish output when running on llama.cpp · luungoc2005 · closed 1 year ago · 33 comments
#23 · Why is tokenizer.model_max_length set to 1000000000000000019884624838656? · kevinhu · closed 1 year ago · 2 comments
#22 · A guide to adding more datasets · VatsaDev · closed 1 year ago · 3 comments
#21 · Has anyone used this code base for incremental pretraining of llama-2-7b? · s1ghhh · closed 1 year ago · 4 comments
#20 · I want to use this model · ChuXNobody · closed 11 months ago · 17 comments
#19 · where is "rotary_emb"? · ScottishFold007 · closed 1 year ago · 1 comment
#18 · Credit to the FlashAttention repo · tridao · closed 1 year ago · 1 comment
#17 · Are there any provided 4bit quant weights, or like a colab detailing quantization? · VatsaDev · closed 1 year ago · 2 comments
#16 · Simple WebUI for the project · VatsaDev · closed 1 year ago · 5 comments
#15 · Cleaned Notebook · VatsaDev · closed 1 year ago · 2 comments
#14 · Can it run on CPU? · abdul-jabbar-ms · closed 1 year ago · 10 comments
#13 · How to train model with databricks-dolly-15k.jsonl dataset format. · TapendraBaduwal · closed 1 year ago · 4 comments
#12 · Have you considered code llama? · IgorTodorovskiIBM · closed 1 year ago · 7 comments
#11 · Would this be possible to finetune on a weaker gpu like a t4? · VatsaDev · closed 1 year ago · 6 comments
#10 · How do you plan on dealing with hallucinations due to knowledge compression? · VatsaDev · opened 1 year ago · 16 comments
#9 · Add support for VEDV (https://github.com/yunielrc/vedv) · yunielrc · closed 1 year ago · 0 comments
#8 · More eval results · GeneZC · closed 1 year ago · 2 comments
#7 · Where is RLHF? · MimeZoe0628 · closed 1 year ago · 1 comment
#6 · Colab · rarhs · closed 1 year ago · 1 comment
#5 · Where does the rotary_emb import come from? · nmharmon8 · closed 1 year ago · 2 comments
#4 · where is flash attention 2 · eyuansu62 · closed 1 year ago · 1 comment
#3 · Hardware requirements · ajinkya123-robo · closed 1 year ago · 2 comments
#2 · Support Chinese Language? · Johnson-yue · closed 1 year ago · 3 comments
#1 · fix live tracking link · Green-Sky · closed 1 year ago · 0 comments