ymcui / Chinese-LLaMA-Alpaca-2

Chinese LLaMA-2 & Alpaca-2 LLMs (phase 2 of the project) with 64K long-context models
Apache License 2.0

chinese-alpaca-2-13b-16k: question about the long-context training process #375

Closed: zx4321 closed this 10 months ago

zx4321 commented 10 months ago

Items to check before submitting

Issue type

Other

Base model

Chinese-Alpaca-2-16K (7B/13B)

Operating system

Linux

Detailed description of the problem

In Code Llama, long-context performance is improved by raising rope_theta. Was this parameter modified during the training of the current chinese-alpaca-2-13b-16k?
`"rope_theta": 1000000` (https://huggingface.co/codellama/CodeLlama-7b-hf/blob/main/config.json)

https://scontent-hkt1-2.xx.fbcdn.net/v/t39.2365-6/369856151_1754812304950972_1159666448927483931_n.pdf?_nc_cat=107&ccb=1-7&_nc_sid=3c67a6&_nc_ohc=yHzAghoBhMQAX8GuEYS&_nc_ht=scontent-hkt1-2.xx&oh=00_AfDTA9ubVw-yP_LkldI2QdMuCfhaKtoiO7VLDl_-W3EQfg&oe=6543B50F
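
For reference, the rope_theta a checkpoint ships with can be read directly from its Hugging Face config. A minimal sketch with transformers, assuming a version recent enough that LlamaConfig exposes rope_theta:

```python
from transformers import AutoConfig

# Read the published Code Llama config linked above; it sets
# rope_theta to 1000000, versus the Llama-2 default of 10000.
cfg = AutoConfig.from_pretrained("codellama/CodeLlama-7b-hf")
print(cfg.rope_theta)  # expected: 1000000.0
```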

Dependencies (required for code-related issues)

Run logs or screenshots

iMountTai commented 10 months ago

See release 3.0.
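
Beyond the release notes, one way to see what the released 16K checkpoint actually uses is to inspect its config. A minimal sketch, assuming the hfl/chinese-alpaca-2-13b-16k Hub ID and a transformers version that exposes these fields:

```python
from transformers import AutoConfig

# Print the long-context settings the released model ships with.
cfg = AutoConfig.from_pretrained("hfl/chinese-alpaca-2-13b-16k")
print(cfg.max_position_embeddings)         # declared context length
print(getattr(cfg, "rope_theta", None))    # RoPE base frequency, if set
print(getattr(cfg, "rope_scaling", None))  # scaling strategy (e.g. linear), if any
```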

github-actions[bot] commented 10 months ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your consideration.