A general representation model across vision, audio, language modalities. Paper: ONE-PEACE: Exploring One General Representation Model Toward Unlimited Modalities
Thanks for sharing this project and codebase.
May I ask what GPU configuration you used for pre-training Stages 1 and 2 (e.g. the type of GPU and how many of them)?
I couldn't find it in the paper or on the project page.
Thanks.