phellonchen / X-LLM

X-LLM: Bootstrapping Advanced Large Language Models by Treating Multi-Modalities as Foreign Languages
https://x-llm.github.io
Apache License 2.0
304 stars 17 forks

No way to train speech interface! #13

Open rohan1561 opened 1 year ago

rohan1561 commented 1 year ago

Hello, the links for the speech datasets' JSON files and features all point back to the X-LLM repo, and the speech encoder link also just points to the X-LLM main page. Where can we find the BLIP-2 checkpoints? Do we use only their Q-Former?