Closed by Haonote 7 months ago
Thanks for your interest. The model was trained on 256 A100 GPUs for 30 hours. You can use fewer GPUs (128 or 64) to train the model, but it will require proportionally more training time.
@jy0205 Thanks for the prompt reply and the information. I would also like to ask whether the training code will be made public. If not, is it feasible for me to write the training code myself in order to fine-tune the model?
Sorry, we currently have no plans to open-source the pre-training code. However, you can write your own fine-tuning code to adapt the model to your specific needs.
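For anyone attempting this, a fine-tuning loop along these lines could look like the sketch below. This is only a minimal illustration in plain PyTorch: `TinyStub` is a hypothetical stand-in for the real pretrained model (the actual loading API is not public), and the freeze-backbone / train-head split is just one common light fine-tuning strategy, not the authors' recommended recipe.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

class TinyStub(nn.Module):
    """Hypothetical stand-in for the pretrained multi-modal model."""
    def __init__(self, dim=16, num_classes=4):
        super().__init__()
        self.backbone = nn.Linear(dim, dim)       # pretend pretrained weights
        self.head = nn.Linear(dim, num_classes)   # new task head to fine-tune
    def forward(self, x):
        return self.head(torch.relu(self.backbone(x)))

def finetune(model, loader, epochs=3, lr=1e-3):
    # Freeze the backbone and train only the new head -- a common
    # lightweight fine-tuning setup when compute is limited.
    for p in model.backbone.parameters():
        p.requires_grad = False
    opt = torch.optim.AdamW(
        [p for p in model.parameters() if p.requires_grad], lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    model.train()
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()
            opt.step()
    return model

# Toy data standing in for a real multi-modal fine-tuning dataset.
torch.manual_seed(0)
x = torch.randn(64, 16)
y = torch.randint(0, 4, (64,))
model = finetune(TinyStub(), DataLoader(TensorDataset(x, y), batch_size=8))
```

For the real model you would replace `TinyStub` with the released checkpoint and the toy tensors with your own dataset; the loop structure itself stays the same.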
Great job! This could change the multi-modal paradigm. Are there any plans to release the fine-tuning code (just the fine-tuning code, like Qwen-VL)?
Hello, your model is impressive. May I ask what kind of GPUs, and how many of them, are needed to train this model?