chricro closed this issue 4 months ago
Unfortunately, we were unable to use LLaMA models for CuMo v1 due to licensing constraints with our ByteDance collaborators. However, we encourage the open-source community to try CuMo with Meta's LLaMA models, since we’ve open-sourced all related data and training code. We may also explore this ourselves in the future on the academic side.
Hi,
Thank you for your work. Do you plan to release CuMo-Llama-3 models (8B / 70B) instead of Mistral? It could improve the performance even further. What do you think?