TencentARC / LLaMA-Pro

[ACL 2024] Progressive LLaMA with Block Expansion.
https://tencentarc.github.io/LLaMA-Pro/
Apache License 2.0

Do we need to freeze the embedding layer and the lm_head as well during LLaMA-Pro-style training? #21

Closed. shamanez closed this issue 3 months ago.

hills-code commented 3 months ago

Yes, I freeze those two modules as well during training.
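
For anyone setting this up, below is a minimal sketch (not the repository's exact training script) of how one might freeze the original weights, including the embedding layer and lm_head, so that only the newly expanded blocks receive gradients. The checkpoint path and block indices are hypothetical placeholders; substitute the positions of the layers you actually inserted during block expansion.

```python
import torch
from transformers import AutoModelForCausalLM

# Hypothetical path to a block-expanded LLaMA checkpoint.
model = AutoModelForCausalLM.from_pretrained(
    "path/to/expanded-llama",
    torch_dtype=torch.bfloat16,
)

# Hypothetical indices of the newly inserted decoder blocks.
new_block_indices = {8, 17, 26, 35}

# Freeze everything first: embed_tokens, lm_head, and all original decoder layers.
for param in model.parameters():
    param.requires_grad = False

# Unfreeze only the newly inserted decoder blocks.
for idx, layer in enumerate(model.model.layers):
    if idx in new_block_indices:
        for param in layer.parameters():
            param.requires_grad = True

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"Trainable params: {trainable} / {total}")
```

Printing the trainable/total parameter counts is a quick sanity check that the embedding layer and lm_head really stayed frozen.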

shamanez commented 3 months ago

Thanks a lot