ryusaeba opened 10 months ago
We have not done this experiment yet; we may consider doing it later. Currently, we are working on extending the expansion to Mistral and to multi-modal models.
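For reference, block expansion applied to a Mistral checkpoint can be sketched roughly as below. This is a minimal sketch, not the repository's actual implementation: it assumes the Hugging Face `transformers` Mistral layout, and the model ID and the `expand_every` interval are illustrative choices, not the authors' setup.

```python
import copy
import torch
from transformers import AutoModelForCausalLM

# Load a Mistral checkpoint (model ID is illustrative).
model = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.1")
old_layers = model.model.layers

expand_every = 8  # assumption: insert one new block after every 8 original blocks
new_layers = torch.nn.ModuleList()
added = []  # indices of the inserted blocks in the expanded stack

for i, layer in enumerate(old_layers):
    new_layers.append(layer)
    if (i + 1) % expand_every == 0:
        block = copy.deepcopy(layer)
        # Zero the two output projections so the copied block computes the
        # identity at initialization: only the residual path passes through.
        torch.nn.init.zeros_(block.self_attn.o_proj.weight)
        torch.nn.init.zeros_(block.mlp.down_proj.weight)
        added.append(len(new_layers))
        new_layers.append(block)

model.model.layers = new_layers
model.config.num_hidden_layers = len(new_layers)

# Keep per-layer indices consistent (used by the KV cache during generation).
for idx, layer in enumerate(new_layers):
    if hasattr(layer.self_attn, "layer_idx"):
        layer.self_attn.layer_idx = idx

# Freeze the original blocks; train only the newly inserted ones.
for p in model.parameters():
    p.requires_grad = False
for idx in added:
    for p in model.model.layers[idx].parameters():
        p.requires_grad = True
```

Because the inserted blocks start as identity functions, the expanded model reproduces the base model's outputs before any further training, so continued pretraining on new domain data cannot regress the starting point at initialization.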
Understood. Please share any updates with me. I am also looking forward to your expansion to Mistral and multi-modal models.
Since Llama-7B-Pro uses an additional 80B tokens of pretraining data to improve math and code, did you try training Llama-7B directly on the same 80B tokens? If so, what were the results?