pharmapsychotic / clip-interrogator

Image to prompt with BLIP and CLIP
MIT License

Interrogator 1: RuntimeError: The size of tensor a (3) must match the size of tensor b (9) at non-singleton dimension 0 #104

Open stevenhales opened 1 year ago

stevenhales commented 1 year ago

Hi pharma, are you still supporting the original version? I still use it for Disco. Interrogator 1 fails with the following error:

RuntimeError                              Traceback (most recent call last)
<ipython-input-13-04ebd4716986> in <cell line: 47>()
     45 display(thumb)
     46 
---> 47 interrogate(image, models=models)

23 frames
/content/BLIP/models/med.py in forward(self, hidden_states, attention_mask, head_mask, encoder_hidden_states, encoder_attention_mask, past_key_value, output_attentions)
    176 
    177         # Take the dot product between "query" and "key" to get the raw attention scores.
--> 178         attention_scores = torch.matmul(query_layer, key_layer.transpose(-1, -2))
    179 
    180         if self.position_embedding_type == "relative_key" or self.position_embedding_type == "relative_key_query":

RuntimeError: The size of tensor a (3) must match the size of tensor b (9) at non-singleton dimension 0
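
For anyone reading the traceback: the failing call is a batched matrix product, and the message matches what a broadcasting check reports when the leading (batch) sizes of the two operands are neither equal nor 1. The sketch below is a plain-Python imitation of that rule, not PyTorch's own code. The helper names `broadcast_batch_shape` and `matmul_shape` are made up for illustration, and every number in the example shapes except the batch sizes 3 and 9 is an assumption.

```python
from itertools import zip_longest


def broadcast_batch_shape(a_shape, b_shape):
    """Broadcast two tuples of batch dimensions, aligned from the right.

    Two sizes are compatible when they are equal or one of them is 1.
    """
    result = []
    # Walk both shapes from the last dimension to the first,
    # padding the shorter one with 1s.
    pairs = zip_longest(reversed(a_shape), reversed(b_shape), fillvalue=1)
    for offset, (a, b) in enumerate(pairs):
        if a != b and a != 1 and b != 1:
            dim = max(len(a_shape), len(b_shape)) - 1 - offset
            raise ValueError(
                f"The size of tensor a ({a}) must match the size of "
                f"tensor b ({b}) at non-singleton dimension {dim}"
            )
        result.append(max(a, b))
    return tuple(reversed(result))


def matmul_shape(a_shape, b_shape):
    """Result shape of a batched matrix product (inputs with >= 2 dims)."""
    *a_batch, n, k = a_shape
    *b_batch, k2, m = b_shape
    if k != k2:
        raise ValueError(f"inner dimensions differ: {k} vs {k2}")
    return broadcast_batch_shape(tuple(a_batch), tuple(b_batch)) + (n, m)


# Hypothetical (batch, heads, seq_len, head_dim) shapes. Only the batch
# sizes 3 and 9 are taken from the error message above.
query_shape = (3, 12, 1, 64)
key_t_shape = (9, 12, 64, 577)  # stands in for key_layer.transpose(-1, -2)

try:
    matmul_shape(query_shape, key_t_shape)
except ValueError as err:
    print(err)  # batch sizes 3 and 9 cannot be broadcast together

# With matching batch sizes the product is well defined.
print(matmul_shape((9, 12, 1, 64), key_t_shape))
```

Read this way, `query_layer` and `key_layer` reach line 178 of `med.py` with batch sizes 3 and 9. Since 9 = 3 × 3, one possibility is that one of the inputs was expanded along the batch dimension once more than the other, but the traceback alone does not confirm that.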