LAION-AI / CLAP

Contrastive Language-Audio Pretraining
https://arxiv.org/abs/2211.06687
Creative Commons Zero v1.0 Universal

Text embedding code failing for single prompt #85

Open zqevans opened 1 year ago

zqevans commented 1 year ago

When I try to use clap_model.get_text_embedding() on an array with a single prompt in it, the call fails with an error in the Roberta tokenizer. It seems that it's confused about the shape of the array unless there's more than one element in it.

File ".../transformers/models/roberta/modeling_roberta.py", line 802, in forward
    batch_size, seq_length = input_shape
ValueError: not enough values to unpack (expected 2, got 1)
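The unpack error itself is easy to reproduce in isolation. A plausible explanation (an assumption, not confirmed from the CLAP source) is that somewhere between tokenization and the RoBERTa forward pass, a batch of one prompt loses its batch dimension, so the `(batch_size, seq_length)` unpack sees a 1-D shape:

```python
import numpy as np

# Hypothetical illustration of the shape problem: a (1, seq_len) token batch
# squeezed to (seq_len,) breaks the two-way unpack inside RoBERTa's forward.
tokens = np.zeros((1, 8), dtype=np.int64)  # batch of 1 prompt, 8 tokens each
squeezed = np.squeeze(tokens)              # shape collapses to (8,)

try:
    batch_size, seq_length = squeezed.shape  # expects 2 dims, gets 1
except ValueError as err:
    print(err)  # not enough values to unpack (expected 2, got 1)
```

With two or more prompts, `np.squeeze` leaves the batch axis alone, which would explain why the failure only appears at batch size 1.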

lukewys commented 1 year ago

Hi, are you passing a bare string to the function? That function expects a list, so you could wrap the string in a list of length 1. Please let us know how it goes.

zqevans commented 1 year ago

I'm inputting an array of length 1.

prompt = "Text prompt"
prompts = [prompt] * args.batch_size
text_embeddings = clap_model.get_text_embedding(prompts)

With args.batch_size set to 1, this code fails. It works with larger batch sizes.

kamalojasv181 commented 1 year ago

I can confirm this. I am facing the same issue. Thanks

lematt1991 commented 12 months ago

#105 should fix this. Are there plans to merge it?

hareisland commented 9 months ago

get_text_embedding needs at least two texts to work.

hareisland commented 9 months ago

I already made it a list (all_text_list[0:1]) but it still doesn't work.