ictnlp / LLaMA-Omni

LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.
https://arxiv.org/abs/2409.06666
Apache License 2.0
2.61k stars, 175 forks

Can you release the code for getting discrete units? #28

Open isruihu opened 2 months ago

isruihu commented 2 months ago

Hi there, good job! Do you plan to release the code for extracting discrete units (using HuBERT and a K-means model)? Or could you give some guidance on where to download the required encoder model?
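
To be concrete, the step I mean is roughly the following. This is only a minimal sketch of how I understand unit extraction, not the authors' code: random arrays stand in for frame-level HuBERT features, the centroid matrix is a hypothetical placeholder for a trained K-means model, the sizes (1000 clusters, 768 dimensions) are placeholder values, and the function names are made up. Merging repeated consecutive units is also an assumption on my part.

```python
from typing import List

import numpy as np


def assign_units(features: np.ndarray, centroids: np.ndarray) -> np.ndarray:
    """Map each frame feature (T, D) to the index of its nearest centroid (K, D)."""
    # Squared Euclidean distance expanded as ||x||^2 - 2 x.c + ||c||^2, shape (T, K).
    dists = (
        (features ** 2).sum(axis=1, keepdims=True)
        - 2.0 * features @ centroids.T
        + (centroids ** 2).sum(axis=1)
    )
    return dists.argmin(axis=1)


def merge_repeats(units: np.ndarray) -> List[int]:
    """Collapse runs of identical consecutive unit indices into one."""
    merged: List[int] = []
    for u in units.tolist():
        if not merged or merged[-1] != u:
            merged.append(u)
    return merged


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Placeholder stand-ins: a real pipeline would load trained K-means
    # centroids and compute features with a HuBERT encoder.
    centroids = rng.normal(size=(1000, 768))
    features = rng.normal(size=(50, 768))
    units = assign_units(features, centroids)
    print(merge_repeats(units))
```

What I am missing is the real encoder checkpoint and the trained K-means model to replace the placeholders above.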

Thanks!