Closed xiaohaochen0308 closed 11 months ago
import torch from languagebind import LanguageBindImage, LanguageBindImageTokenizer, LanguageBindImageProcessor
pretrained_ckpt = 'LanguageBind/LanguageBind_Image' model = LanguageBindImage.from_pretrained(pretrained_ckpt, cache_dir='./cache_dir') tokenizer = LanguageBindImageTokenizer.from_pretrained(pretrained_ckpt, cache_dir='./cache_dir') image_process = LanguageBindImageProcessor(model.config, tokenizer)
model.eval() data = image_process([r"your/image.jpg"], ['your text.'], return_tensors='pt') with torch.no_grad(): out = model(**data)
print(out.text_embeds @ out.image_embeds.T)
您好, 请问我如果加载LanguageBind_Image模型,用于图像和文本特征的提取对齐,那么我是用 out.text_embed和 out.image_embeds 这两个进行后续的工作吗?比如后续进行融合分类。
[Chinese] 是的。 [En] Yes.
import torch from languagebind import LanguageBindImage, LanguageBindImageTokenizer, LanguageBindImageProcessor
pretrained_ckpt = 'LanguageBind/LanguageBind_Image' model = LanguageBindImage.from_pretrained(pretrained_ckpt, cache_dir='./cache_dir') tokenizer = LanguageBindImageTokenizer.from_pretrained(pretrained_ckpt, cache_dir='./cache_dir') image_process = LanguageBindImageProcessor(model.config, tokenizer)
model.eval() data = image_process([r"your/image.jpg"], ['your text.'], return_tensors='pt') with torch.no_grad(): out = model(**data)
print(out.text_embeds @ out.image_embeds.T)
您好, 请问我如果加载LanguageBind_Image模型,用于图像和文本特征的提取对齐,那么我是用 out.text_embed和 out.image_embeds 这两个进行后续的工作吗?比如后续进行融合分类。