AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head, arXiv'23

AkihikoWatanabe / paper_notes

Paper notes, added to occasionally

https://AkihikoWatanabe.github.io/paper_notes

Open AkihikoWatanabe opened 1 year ago

AkihikoWatanabe commented 1 year ago

A system that can carry out a variety of audio-related tasks from multimodal prompts such as text, audio, and images.

AkihikoWatanabe commented 1 year ago

It does not appear to have been trained jointly on multimodal data; rather, it seems to accomplish these tasks by combining a variety of separate models.
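
To make the "combination of models" idea concrete, here is a minimal sketch of routing a request to a task-specific model. It is only an illustration of the general pattern, not AudioGPT's actual implementation: the task names, the stub functions, and `run_task` are all hypothetical, and the stubs return placeholder strings instead of real audio.

```python
from typing import Callable, Dict


# Hypothetical stand-ins for separately trained, task-specific models.
# Each stub returns a placeholder string instead of real audio or text.
def text_to_speech(prompt: str) -> str:
    return f"[speech audio for: {prompt}]"


def text_to_music(prompt: str) -> str:
    return f"[music audio for: {prompt}]"


def speech_recognition(prompt: str) -> str:
    return f"[transcript of: {prompt}]"


# Registry mapping a task name to the model that handles it (names are assumptions).
MODEL_REGISTRY: Dict[str, Callable[[str], str]] = {
    "text_to_speech": text_to_speech,
    "text_to_music": text_to_music,
    "speech_recognition": speech_recognition,
}


def run_task(task: str, prompt: str) -> str:
    """Route a request to the model registered for the given task."""
    if task not in MODEL_REGISTRY:
        raise ValueError(f"no model registered for task: {task}")
    return MODEL_REGISTRY[task](prompt)


if __name__ == "__main__":
    print(run_task("text_to_speech", "hello"))
```

The point of the sketch is that no single model handles every modality; a thin dispatch layer picks an existing model per task, which matches the impression above that nothing was trained jointly.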