facebookresearch / ImageBind

ImageBind: One Embedding Space to Bind Them All
Other
8.38k stars 771 forks

Simply replacing Detic's CLIP-based ‘class’ embedding with ImageBind audio embedding #91

Closed youngstear closed 4 months ago

youngstear commented 1 year ago

Thanks for the great work! I tried this, but ImageBind's audio embeddings are 1024-dimensional, while the Detic model expects 512-dimensional embeddings. Could you release a matching model, for example imagebind_base.pth?

ChrisjanWust commented 4 months ago

Unfortunately, you'll need to train your own conversion model to map ImageBind embeddings into Detic's embedding space. All ImageBind embeddings are 1024-dimensional.
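
For anyone landing here, a rough sketch of what such a conversion model could look like in PyTorch: a single linear layer from 1024 to 512 dimensions, trained with a cosine loss on paired embeddings. This is only an illustration, not something shipped with ImageBind or Detic. The names `EmbeddingAdapter` and `train_adapter` are made up for this example, and the random tensors are placeholders. You would need your own paired data, for instance ImageBind embeddings and the matching 512-d CLIP-space embeddings of the same concepts (that pairing strategy is an assumption, not something confirmed in this thread).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

IMAGEBIND_DIM = 1024  # ImageBind embedding size, as stated in this thread
DETIC_DIM = 512       # embedding size Detic expects, as stated in this thread


class EmbeddingAdapter(nn.Module):
    """Hypothetical adapter: one linear layer mapping 1024-d to 512-d."""

    def __init__(self, in_dim=IMAGEBIND_DIM, out_dim=DETIC_DIM):
        super().__init__()
        self.proj = nn.Linear(in_dim, out_dim)

    def forward(self, x):
        # L2-normalize so outputs can be compared by cosine similarity
        return F.normalize(self.proj(x), dim=-1)


def train_adapter(src, tgt, epochs=200, lr=1e-3):
    """Fit the adapter on paired embeddings (src: N x 1024, tgt: N x 512)."""
    model = EmbeddingAdapter(src.shape[1], tgt.shape[1])
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    tgt = F.normalize(tgt, dim=-1)
    for _ in range(epochs):
        opt.zero_grad()
        # Cosine distance between projected source and target embeddings
        loss = (1.0 - F.cosine_similarity(model(src), tgt, dim=-1)).mean()
        loss.backward()
        opt.step()
    return model


if __name__ == "__main__":
    torch.manual_seed(0)
    # Placeholder random tensors stand in for real paired embeddings
    src = torch.randn(64, IMAGEBIND_DIM)
    tgt = torch.randn(64, DETIC_DIM)
    adapter = train_adapter(src, tgt, epochs=50)
    with torch.no_grad():
        out = adapter(torch.randn(3, IMAGEBIND_DIM))
    print(out.shape)
```

A linear layer is just the simplest starting point; whether it preserves enough of the audio semantics for Detic's classifier is something you would have to check on your own data.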