Closed. SnailForce closed this issue 1 month ago.
To support DDP training, Long-CLIP changed the model.forward function. You can compute the similarity like this instead:

logits_per_image = image_features @ text_features.T
probs = logits_per_image.softmax(dim=-1).cpu().numpy()
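To make the arithmetic in that reply concrete, here is a minimal plain-Python sketch of the same two steps for a single image: one dot product per text (the role of `image_features @ text_features.T`), then a softmax over the resulting logits. The helper names `dot`, `softmax`, and `label_probs` and the toy feature values are hypothetical, chosen only for illustration. Real features come from `model.encode_image` and `model.encode_text` and are torch tensors.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def label_probs(image_feature, text_features):
    """Softmax over the dot products of one image vector with each text vector."""
    logits = [dot(image_feature, t) for t in text_features]
    return softmax(logits)


# Toy stand-ins for encoder outputs (hypothetical values, not model output).
image_feature = [1.0, 2.0]
text_features = [[1.0, 1.0], [0.0, 1.0], [1.0, 0.0]]

probs = label_probs(image_feature, text_features)
print("Label probs:", probs)
```

The probabilities sum to 1 and are ordered by the size of each dot product, so the text whose feature vector aligns best with the image gets the highest score.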
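I found the demo.py you wrote, and it runs fine now. However, for your test image demo.png the README shows [0.982 0.01799], while on a 3090 I get [0.937 0.0628]. That seems like a large gap, and I don't know what causes it.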
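Hello. We updated the model weights after writing the demo, which may explain the difference. You could try evaluating on a dataset such as COCO and check whether the final results match.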
import torch
from PIL import Image
from model import longclip

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = longclip.load("./checkpoints/longclip-B.pt", device=device)

image = preprocess(Image.open("./img/CLIP.png")).unsqueeze(0).to(device)
text = longclip.tokenize(["a diagram", "a dog", "a cat"]).to(device)

with torch.no_grad():
    image_features = model.encode_image(image)
    text_features = model.encode_text(text)

    logits_per_image, logits_per_text = model(image, text)  # line 15: raises the TypeError
    probs = logits_per_image.softmax(dim=-1).cpu().numpy()

print("Label probs:", probs)

import torch
import clip
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

image = preprocess(Image.open("./img/CLIP.png")).unsqueeze(0).to(device)
text = clip.tokenize(["a diagram", "a dog", "a cat"]).to(device)

with torch.no_grad():
    image_features = model.encode_image(image)
    text_features = model.encode_text(text)

    logits_per_image, logits_per_text = model(image, text)
    probs = logits_per_image.softmax(dim=-1).cpu().numpy()

print("Label probs:", probs)  # prints: [[0.9927937 0.00421068 0.00299572]]
The first snippet above uses Long-CLIP and the second uses CLIP. The CLIP snippet runs without problems, but the Long-CLIP snippet fails with this error:

Traceback (most recent call last):
  File "Long-CLIP/test.py", line 15, in <module>
    logits_per_image, logits_per_text = model(image, text)
  File "python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
TypeError: CLIP.forward() missing 2 required positional arguments: 'text_short' and 'rank'

How should I fix this?
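The shape of this error can be reproduced without the checkpoint. The sketch below uses a hypothetical stand-in class, `ToyLongCLIP`, whose `forward` takes the two extra parameters named in the error message. The parameter name `text_long` and the trivial encoder bodies are assumptions made for illustration, not the real Long-CLIP code. It shows that calling the model with only `(image, text)` fails, while the separate encoder methods still work.

```python
class ToyLongCLIP:
    """Hypothetical stand-in for illustration; not the real Long-CLIP class."""

    def encode_image(self, image):
        # Placeholder encoder: just converts the inputs to floats.
        return [float(x) for x in image]

    def encode_text(self, text):
        # Placeholder encoder: one float vector per text.
        return [[float(x) for x in row] for row in text]

    def forward(self, image, text_long, text_short, rank):
        # Assumed training-only signature with the two extra parameters
        # named in the TypeError ('text_short' and 'rank').
        raise NotImplementedError("training-only path in this toy")

    def __call__(self, *args, **kwargs):
        # Mirrors how calling a module instance dispatches to forward.
        return self.forward(*args, **kwargs)


model = ToyLongCLIP()
image, text = [1, 2], [[1, 1], [0, 1]]

# Calling the model with two arguments fails before forward even runs.
try:
    model(image, text)
    error_message = ""
except TypeError as exc:
    error_message = str(exc)
print(error_message)

# The encoders do not go through forward, so they are unaffected.
image_features = model.encode_image(image)
text_features = model.encode_text(text)
```

Under these assumptions, the inference path never needs `forward`: encode the image and texts separately, then compute the similarity from the two feature sets.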