unum-cloud / uform

Pocket-Sized Multimodal AI for content understanding and generation across multilingual texts, images, and 🔜 video, up to 5x faster than OpenAI CLIP and LLaVA 🖼️ & 🖋️
https://unum-cloud.github.io/uform/
Apache License 2.0
982 stars, 56 forks

How can I decode image features to text? #77

Closed: zshnb closed this issue 3 months ago

zshnb commented 3 months ago

Following the code in the README, once I have image_features, how can I decode them to text?

ashvardanian commented 3 months ago

Hi @zshnb! Which model and which part of the README are you referring to?

zshnb commented 3 months ago

Hi, this part here:

[image]

After the features are output, how can I decode them to text?

ashvardanian commented 3 months ago

That can't be done with those models. You can only use those features for reranking.

If you are looking to produce text afterwards, look into the generative model mentioned lower in the file 🤗
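For anyone landing here later, here is a rough sketch of what "using the features" can look like in practice. It does not call the library at all: the vectors are made-up stand-ins for image and text embeddings, and `cosine_similarity` and `rank_captions` are hypothetical helper names. The assumption is that image and text features live in the same vector space, so instead of decoding a feature you score a set of candidate texts against it and keep the closest one.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_captions(image_features, caption_features):
    """Return (caption, score) pairs sorted from most to least similar."""
    scores = {
        caption: cosine_similarity(image_features, vector)
        for caption, vector in caption_features.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy vectors standing in for real model outputs.
image_features = [0.9, 0.1, 0.0]
caption_features = {
    "a dog on a beach": [0.8, 0.2, 0.1],
    "a city at night": [0.0, 0.1, 0.9],
    "a bowl of fruit": [0.1, 0.9, 0.2],
}

ranked = rank_captions(image_features, caption_features)
best_caption, best_score = ranked[0]
print(best_caption)
```

This only picks among texts you already have; it cannot produce a new caption, which is why a generative model is needed for that.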