microsoft / CLAP

Learning audio concepts from natural language supervision
MIT License

When using the clapcap model, `get_audio_embeddings` throws an error #43

Closed BlackLost closed 1 week ago

BlackLost commented 1 week ago

This is my code; when I run it, it throws an error.

from msclap import CLAP
from deep_translator import GoogleTranslator

clap_model = CLAP(model_fp='./model/clapcap_weights_2023.pth', version='clapcap', use_cuda=False)

audio_files = ['./sound/shoe.WAV']
audio_vector = clap_model.get_audio_embeddings(audio_files)

Error:

Traceback (most recent call last):
  File "/Users/xxx/anaconda3/envs/clap/lib/python3.9/site-packages/msclap/CLAPWrapper.py", line 296, in _get_audio_embeddings
    return self.clap.audio_encoder(preprocessed_audio)[0]
AttributeError: 'CLAPWrapper' object has no attribute 'clap'

The reason is that in CLAPWrapper.py, line 58, when the version is clapcap, the constructor assigns the model to `self.clapcap` instead of `self.clap`, so methods that reference `self.clap` (such as `get_audio_embeddings`) fail.

self.model_fp = model_fp
self.use_cuda = use_cuda
if 'clapcap' in version:
    self.clapcap, self.tokenizer, self.args = self.load_clapcap()
else:
    self.clap, self.tokenizer, self.args = self.load_clap()
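For illustration, a minimal way to see this from the caller's side (a sketch, assuming the weights are downloaded automatically when model_fp is omitted, as in the repo's README):

from msclap import CLAP

# With version='clapcap', only the captioning model is attached to the wrapper
cap_model = CLAP(version='clapcap', use_cuda=False)
print(hasattr(cap_model, 'clapcap'))  # True: the captioning model was loaded
print(hasattr(cap_model, 'clap'))     # False: get_audio_embeddings looks up self.clap, so it raises AttributeError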

My English is poor; I hope for your reply.

bmartin1 commented 1 week ago

Hi, you are calling the clapcap model:

clap_model = CLAP(model_fp='./model/clapcap_weights_2023.pth', version='clapcap', use_cuda=False)

Please follow the usage shown in the GitHub README to load the clap model for embeddings:

clap_model = CLAP(version='2023', use_cuda=False)
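For context, a minimal sketch of the split suggested above: the standard 2023 model for embeddings and the clapcap model for captions. The generate_caption call is taken from the repo's README rather than from this thread, so treat it as an assumption.

from msclap import CLAP

audio_files = ['./sound/shoe.WAV']

# Standard CLAP model: provides get_audio_embeddings / get_text_embeddings
clap_model = CLAP(version='2023', use_cuda=False)
audio_embeddings = clap_model.get_audio_embeddings(audio_files)

# clapcap model: intended for caption generation rather than embeddings
clapcap_model = CLAP(version='clapcap', use_cuda=False)
captions = clapcap_model.generate_caption(audio_files)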