TencentARC / LLaMA-Pro

[ACL 2024] Progressive LLaMA with Block Expansion.
https://tencentarc.github.io/LLaMA-Pro/
Apache License 2.0

Question about Llama-7B and Llama-7B-Pro comparison. #5

Open ryusaeba opened 6 months ago

ryusaeba commented 6 months ago

Since Llama-7B-Pro uses an additional 80B tokens of pretraining data to improve math and code, did you also try training Llama-7B directly on the same 80B tokens? If so, what were the results?

hills-code commented 6 months ago

We have not run this experiment yet; we may consider it later. Currently, we are working on applying the expansion to Mistral and to multi-modal models.
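
For context, the block-expansion recipe contrasted with direct continued pretraining here works by interleaving identity-initialized copies of existing decoder blocks into the frozen base model and training only those copies on the new corpus. Below is a minimal PyTorch sketch of that idea, not the repo's actual code; the function name, the even group split, and the `o_proj`/`down_proj` naming (LLaMA-style blocks) are illustrative assumptions:

```python
import copy
import torch.nn as nn

def expand_blocks(layers: nn.ModuleList, num_new: int) -> nn.ModuleList:
    """Interleave `num_new` identity-initialized block copies into `layers`.

    Sketch: split the stack into `num_new` groups and append, after each
    group, a copy of its last block whose residual-writing projections are
    zeroed, so the copy is an identity map at initialization. Originals are
    frozen; only the new blocks are trained on the additional data.
    """
    assert len(layers) % num_new == 0, "stack must split evenly into groups"
    group = len(layers) // num_new
    out = nn.ModuleList()
    for i in range(num_new):
        chunk = layers[i * group:(i + 1) * group]
        for blk in chunk:
            blk.requires_grad_(False)  # freeze original blocks
            out.append(blk)
        new_blk = copy.deepcopy(chunk[-1])
        new_blk.requires_grad_(True)   # only expanded blocks are trained
        for name, mod in new_blk.named_modules():
            # Zero the layers that write into the residual stream (assumed
            # LLaMA-style names) so the new block initially adds nothing.
            if isinstance(mod, nn.Linear) and name.split(".")[-1] in ("o_proj", "down_proj"):
                nn.init.zeros_(mod.weight)
                if mod.bias is not None:
                    nn.init.zeros_(mod.bias)
        out.append(new_blk)
    return out
```

With pre-norm residual blocks, zeroing the attention output and MLP down projections makes each new block a no-op at initialization, so the expanded model starts from exactly the base model's behavior before the new blocks see any of the additional tokens.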

ryusaeba commented 6 months ago

Understood. Please share any updates with me. I am also looking forward to your expansion to Mistral and multi-modal models.