TencentARC / LLaMA-Pro
[ACL 2024] Progressive LLaMA with Block Expansion.
https://tencentarc.github.io/LLaMA-Pro/
Apache License 2.0 · 481 stars · 35 forks
Issues (newest first)
#33 Can this method be extended to ViT-style vision encoders? · lucasjinreal · opened 2 months ago · 0 comments
#32 Can Qwen2-7B be trained with this? · jqtian123 · opened 3 months ago · 0 comments
#31 On the paper's general-ability benchmarks barely dropping, with some even improving · bestpredicts · closed 5 months ago · 1 comment
#30 About the running procedure · GOOD-N-LCM · opened 5 months ago · 4 comments
#29 Loss converges at 10B training tokens and stops decreasing · bestpredicts · closed 5 months ago · 1 comment
#28 On zero initialization and the placement of the expanded layers · ouyanxi1125 · opened 5 months ago · 4 comments
#27 How to train an 8B model with finetune_cosmopedia.sh · RuipingWang1986 · opened 6 months ago · 1 comment
#26 How to build the dataset for continued pre-training with the finetune_cosmopedia.sh script · RuipingWang1986 · opened 6 months ago · 2 comments
#25 Thanks for the wonderful project! Why do I always see an apparent loss of the original abilities? · hzgdeerHo · opened 6 months ago · 8 comments
#24 Question about the experiments in the paper · ChrisXULC · closed 7 months ago · 1 comment
#23 Training on arbitrary data · HelloWorldLTY · opened 7 months ago · 2 comments
#22 Pretraining code for Mistral-Pro-8B-v0.1 · shawnricecake · opened 7 months ago · 1 comment
#21 Do we need to freeze the embedding layer and the lm_head as well during LLaMA-Pro-style training? · shamanez · closed 7 months ago · 2 comments
#20 Question about GPU memory requirements for training · denghj3 · opened 8 months ago · 6 comments
#19 Comparison with PEFT · LaVieEnRose365 · opened 8 months ago · 1 comment
#18 Do larger models need more blocks? · PoseidomWong · opened 8 months ago · 1 comment
#16 Do the newly added transformer layers share parameters with the preceding layer? · CharlinChen · closed 8 months ago · 3 comments
#15 Is the LLaMA-Pro implementation in LLaMA-Factory incorrect? · HuXinjing · closed 9 months ago · 2 comments
#14 What are the advantages compared with LoRA? · xiaozhu1106 · opened 9 months ago · 1 comment
#13 Questions about incremental pre-training · zhuxiaobin · closed 9 months ago · 6 comments
#12 Issue with Model Saving After Layer Expansion: Removed Shared Tensors · yumingfan-0219 · closed 9 months ago · 2 comments
#11 Guide to running the code · Abolfazl-kr · opened 9 months ago · 2 comments
#10 Hello, a question about post-pretraining · ray075hl · closed 9 months ago · 8 comments
#9 Question regarding the difference between LLaMA-Pro and the regular LLaMA · WUHU-G · opened 10 months ago · 8 comments
#8 How to load the new model weights · khalil-Hennara · opened 10 months ago · 1 comment
#7 Should I freeze norm.weight? · metterian · opened 10 months ago · 1 comment
#6 Full code to continue pre-training · Abolfazl-kr · opened 10 months ago · 1 comment
#5 Question about Llama-7B and Llama-7B-Pro comparison · ryusaeba · opened 10 months ago · 2 comments
#4 arXiv data · ZhengTang1120 · opened 10 months ago · 2 comments
#3 How do we fine-tune targeting the expanded blocks? · win10ogod · opened 10 months ago · 5 comments
#2 Code for training LLaMA-Pro? · yhyu13 · opened 10 months ago · 8 comments
#1 Question about Table 7 in the paper · XiaoYee · closed 10 months ago · 5 comments