Open Andrey36652 opened 1 month ago
And why not use natural-language synthetic captioning for images?
Please also include the source of the videos. Additionally, please provide the composition ratio of the different categories mentioned in the paper.
+1 for this question. Thanks!
Hello, thank you for the research. Please share more information about the pre-training process.
Data:
Hardware: what kind of hardware was used, for how long, and, if possible, an estimate of the pretraining cost.