LAION-AI / CLAP

Contrastive Language-Audio Pretraining
https://arxiv.org/abs/2211.06687
Creative Commons Zero v1.0 Universal

dimension of last_hidden_state #135

Open zhw123456789 opened 11 months ago

zhw123456789 commented 11 months ago

Hi, great work! But when I try to look at the shape of `last_hidden_state`, I run into a problem. The code is the same as in the official documentation:

```python
from datasets import load_dataset
from transformers import AutoProcessor, ClapAudioModel

dataset = load_dataset("ashraq/esc50")
audio_sample = dataset["train"]["audio"][0]["array"]

model = ClapAudioModel.from_pretrained("laion/clap-htsat-fused")
processor = AutoProcessor.from_pretrained("laion/clap-htsat-fused")

inputs = processor(audios=audio_sample, return_tensors="pt")

outputs = model(**inputs)
last_hidden_state = outputs.last_hidden_state
```

However, the output shape is `[1, 768, 2, 32]`, which is not compatible with what I've seen in the official documentation:

> last_hidden_state (torch.FloatTensor of shape (batch_size, sequence_length, hidden_size)) — Sequence of hidden-states at the output of the last layer of the model.

Am I right, or am I missing some key information?

lukewys commented 7 months ago

Hi,

Can you let me know how you are running the audio encoder? The trailing `[2, 32]` is `[frequency, time]`. This is because HTSAT treats the audio spectrogram as a 2D image, so the last hidden state keeps a 2D spatial layout; the average over the frequency dimension is taken somewhere before the final output.
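
For reference, here is a minimal sketch of how you could collapse that map into one embedding per clip, assuming the `[batch, hidden, freq, time]` layout above. The mean pooling below is illustrative, not necessarily identical to what the model does internally:

```python
import torch
from datasets import load_dataset
from transformers import AutoProcessor, ClapAudioModel

dataset = load_dataset("ashraq/esc50")
audio_sample = dataset["train"]["audio"][0]["array"]

model = ClapAudioModel.from_pretrained("laion/clap-htsat-fused")
processor = AutoProcessor.from_pretrained("laion/clap-htsat-fused")

inputs = processor(audios=audio_sample, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

hidden = outputs.last_hidden_state   # [1, 768, 2, 32] = [batch, hidden, freq, time]

# Illustrative pooling: average over the frequency and time axes
# to get one 768-dim embedding per clip.
pooled = hidden.mean(dim=(2, 3))     # [1, 768]
print(pooled.shape)

# ClapAudioModel also returns a ready-made pooled output:
print(outputs.pooler_output.shape)   # [1, 768]
```

If you just need audio embeddings for retrieval, `ClapModel.get_audio_features(**inputs)` in transformers returns a projected, pooled embedding directly, so you don't have to pool the hidden states yourself.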