TencentARC / LLaMA-Pro

[ACL 2024] Progressive LLaMA with Block Expansion.
https://tencentarc.github.io/LLaMA-Pro/
Apache License 2.0

Question about Llama-7B and Llama-7B-Pro comparison. #5

Open ryusaeba opened 10 months ago

ryusaeba commented 10 months ago

Since Llama-7B-Pro uses an additional 80B tokens of pretraining data to improve math and code, did you also try training Llama-7B directly on the same 80B tokens? If so, what were the results?

hills-code commented 10 months ago

We have not done this experiment yet. We may consider doing it later. Currently, we are working on applying the expansion to Mistral and to multi-modal models.
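For context, the block expansion this repo is built around can be sketched in toy form: copies of existing blocks are interleaved into the network with their residual branch zero-initialized, so the expanded model initially computes exactly the same function as the original and only the new blocks are then trained. The code below is a minimal illustration with scalar residual "blocks", not the actual LLaMA-Pro implementation; `expand_blocks`, `forward`, and the `(w, b)` block representation are all hypothetical names for this sketch.

```python
def expand_blocks(blocks, group):
    """Insert one identity-initialized copy after every `group` blocks.

    Each toy block is a pair (w, b); its residual branch computes w*x + b.
    A (0.0, 0.0) block contributes nothing, so at initialization the
    expanded stack matches the original stack exactly.
    """
    expanded = []
    for i, blk in enumerate(blocks, 1):
        expanded.append(blk)
        if i % group == 0:
            # zero-initialized copy: residual branch outputs 0 at init
            expanded.append((0.0, 0.0))
    return expanded


def forward(blocks, x):
    """Toy residual network: each block maps x to x + (w*x + b)."""
    for w, b in blocks:
        x = x + (w * x + b)
    return x
```

With this setup, `forward(expand_blocks(blocks, group), x)` equals `forward(blocks, x)` for any input before training, which is the property that lets the added blocks be trained on new data (e.g. math and code) without disturbing the original model's behavior at the start.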

ryusaeba commented 10 months ago

Understood. Please share any updates with me when you have them. I am also looking forward to your expansion to Mistral and multi-modal models.