microsoft / unilm

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
https://aka.ms/GeneralAI
MIT License
19.61k stars 2.5k forks

Is Kosmos 2.5 better for image captioning than Kosmos 2? Also, we need a Gradio app #1549

Closed. FurkanGozukara closed this issue 4 months ago

FurkanGozukara commented 4 months ago

I only see OCR mentioned for Kosmos 2.5, but I want to be sure.

Also, can you publish a Gradio app for Kosmos 2.5?

Dod-o commented 4 months ago

Kosmos 2.5 is designed for OCR and Markdown generation tasks. It doesn't have a caption generation feature.

FurkanGozukara commented 4 months ago

> Kosmos 2.5 is designed for OCR and Markdown generation tasks. It doesn't have a caption generation feature.

Thanks for the clarification.