TencentARC / LLaMA-Pro

[ACL 2024] Progressive LLaMA with Block Expansion.
https://tencentarc.github.io/LLaMA-Pro/
Apache License 2.0

Do we need to freeze the embedding layer and the lm_head as well during LLaMA-Pro-style training? #21

Closed. shamanez closed this issue 3 months ago.

hills-code commented 3 months ago

Yes, I freeze those two modules as well during training.
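
For anyone setting this up, below is a minimal sketch (not the repository's exact training script) of how one might freeze the original weights, including the embedding layer and lm_head, so that only the newly expanded blocks receive gradients. The checkpoint path and block indices are hypothetical placeholders; substitute the positions of the layers you actually inserted during block expansion.

```python
import torch
from transformers import AutoModelForCausalLM

# Hypothetical path to a block-expanded LLaMA checkpoint.
model = AutoModelForCausalLM.from_pretrained(
    "path/to/expanded-llama",
    torch_dtype=torch.bfloat16,
)

# Hypothetical indices of the newly inserted decoder blocks.
new_block_indices = {8, 17, 26, 35}

# Freeze everything first: embed_tokens, lm_head, and all original decoder layers.
for param in model.parameters():
    param.requires_grad = False

# Unfreeze only the newly inserted decoder blocks.
for idx, layer in enumerate(model.model.layers):
    if idx in new_block_indices:
        for param in layer.parameters():
            param.requires_grad = True

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"Trainable params: {trainable} / {total}")
```

Printing the trainable/total parameter counts is a quick sanity check that the embedding layer and lm_head really stayed frozen.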

shamanez commented 3 months ago

Thanks a lot