TencentARC / LLaMA-Pro
[ACL 2024] Progressive LLaMA with Block Expansion.
https://tencentarc.github.io/LLaMA-Pro/
Apache License 2.0 · 481 stars · 35 forks
Issues (newest first)
#33 Can this method be extended to ViT-style vision encoders? · lucasjinreal · opened 2 months ago · 0 comments
#32 Can Qwen2-7B be trained with this? · jqtian123 · opened 3 months ago · 0 comments
#31 On the paper's general-ability benchmarks barely dropping, with some even improving · bestpredicts · closed 5 months ago · 1 comment
#30 About the running procedure · GOOD-N-LCM · opened 5 months ago · 4 comments
#29 Loss converges at 10B training tokens and stops decreasing · bestpredicts · closed 5 months ago · 1 comment
#28 On zero initialization and the placement of the expanded layers · ouyanxi1125 · opened 5 months ago · 4 comments
#27 How to train an 8B model with finetune_cosmopedia.sh · RuipingWang1986 · opened 6 months ago · 1 comment
#26 How to build the dataset for continued pre-training with the finetune_cosmopedia.sh script · RuipingWang1986 · opened 6 months ago · 2 comments
#25 Thanks for the wonderful project! Why do I always see an apparent loss of the original abilities? · hzgdeerHo · opened 6 months ago · 8 comments
#24 Question about the experiments in the paper · ChrisXULC · closed 7 months ago · 1 comment
#23 Training on arbitrary data · HelloWorldLTY · opened 7 months ago · 2 comments
#22 Pretraining code for Mistral-Pro-8B-v0.1 · shawnricecake · opened 7 months ago · 1 comment
#21 Do we need to freeze the embedding layer and the lm_head as well during LLaMA-Pro-style training? · shamanez · closed 7 months ago · 2 comments
#20 Question about GPU memory requirements for training · denghj3 · opened 8 months ago · 6 comments
#19 Comparison with PEFT · LaVieEnRose365 · opened 8 months ago · 1 comment
#18 Do larger models need more blocks? · PoseidomWong · opened 8 months ago · 1 comment
#16 Do the newly added transformer layers share parameters with the preceding layer? · CharlinChen · closed 8 months ago · 3 comments
#15 Is the LLaMA-Pro implementation in LLaMA-Factory incorrect? · HuXinjing · closed 9 months ago · 2 comments
#14 What are the advantages compared with LoRA? · xiaozhu1106 · opened 9 months ago · 1 comment
#13 Questions about incremental pre-training · zhuxiaobin · closed 9 months ago · 6 comments
#12 Issue with Model Saving After Layer Expansion: Removed Shared Tensors · yumingfan-0219 · closed 9 months ago · 2 comments
#11 Guide to running the code · Abolfazl-kr · opened 9 months ago · 2 comments
#10 Hello, a question about post-pretraining · ray075hl · closed 9 months ago · 8 comments
#9 Question regarding the difference between LLaMA-Pro and the regular LLaMA · WUHU-G · opened 10 months ago · 8 comments
#8 How to load the new model weights · khalil-Hennara · opened 10 months ago · 1 comment
#7 Should I freeze norm.weight? · metterian · opened 10 months ago · 1 comment
#6 Full code to continue pre-training · Abolfazl-kr · opened 10 months ago · 1 comment
#5 Question about Llama-7B and Llama-7B-Pro comparison · ryusaeba · opened 10 months ago · 2 comments
#4 arXiv data · ZhengTang1120 · opened 10 months ago · 2 comments
#3 How do we fine-tune targeting the expanded blocks? · win10ogod · opened 10 months ago · 5 comments
#2 Code for training LLaMA-Pro? · yhyu13 · opened 10 months ago · 8 comments
#1 Question about Table 7 in the paper · XiaoYee · closed 10 months ago · 5 comments