ictnlp / LLaMA-Omni

LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.
https://arxiv.org/abs/2409.06666
Apache License 2.0
2.62k stars 177 forks

Can we convert this into .GGUF format without breaking? #9

Closed by Kingbadger3d 2 months ago

Kingbadger3d commented 2 months ago

Have you guys started on support for quantized versions, e.g. BF16, Q8_0, etc.? Cheers

Poeroz commented 2 months ago

Sorry, we haven't started on supporting quantized models at this time. We might explore this in the future.