TencentARC / LLaMA-Pro

[ACL 2024] Progressive LLaMA with Block Expansion.
https://tencentarc.github.io/LLaMA-Pro/
Apache License 2.0

Training on arbitrary data #23

HelloWorldLTY opened this issue 3 months ago

HelloWorldLTY commented 3 months ago

Thanks for your great work. I wonder if it is possible to directly use alpaca_lora or stanford_alpaca to fine-tune the 8B model on an arbitrary dataset. Can we access the code? Or should we use block_expand to create a new model first and then train that new model? Does this support the Hugging Face version? Thanks.

hills-code commented 3 months ago

Yes, you can directly fine-tune the 8B model on any dataset. You can access the model on Hugging Face (https://huggingface.co/TencentARC/LLaMA-Pro-8B) and use it just like a standard LLaMA model.
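
A minimal sketch of what this looks like, assuming standard Hugging Face Transformers usage (the generation settings below are illustrative, not from the repo):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the released checkpoint exactly like any other LLaMA-style model.
model_id = "TencentARC/LLaMA-Pro-8B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Quick generation check before plugging the model into a fine-tuning script.
inputs = tokenizer("The key idea of block expansion is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Because the checkpoint loads through the normal LLaMA code path, a LLaMA-compatible fine-tuning script (such as alpaca_lora or stanford_alpaca) should work by simply pointing its model name/path argument at `TencentARC/LLaMA-Pro-8B`.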

HelloWorldLTY commented 3 months ago

Thanks, will try.