facebookresearch / ImageBind

ImageBind: One Embedding Space to Bind Them All

The issue about Audio to Image Generation #40

Open liu-zhy opened 1 year ago

liu-zhy commented 1 year ago

Amazing work!!!

It's well known that https://github.com/lucidrains/DALLE2-pytorch and https://github.com/LAION-AI/dalle2-laion use open-clip as the pretrained text and image encoder. However, I noticed that you used a private DALLE-2 implementation to generate images conditioned on audio.

Is it possible to use an open-source DALLE-2 instead of the private reimplementation? Are there any problems with the open-source DALLE-2? I would appreciate it if you could share your experience.

In my view, if an open-source DALLE-2 could be adapted to ImageBind, it could directly enable some very interesting applications and increase the impact of this work!

liu-zhy commented 1 year ago

Can someone help me? Thanks!

xuxy09 commented 1 year ago

We tried audio to image using Stable Diffusion. The project is open-sourced: https://github.com/sail-sg/BindDiffusion

liu-zhy commented 1 year ago

> We tried audio to image using Stable Diffusion. The project is open-sourced: https://github.com/sail-sg/BindDiffusion

Wow, great work, I have starred this repo!