gpt-omni / mini-omni

Open-source multimodal large language model that can hear and talk while thinking, featuring real-time end-to-end speech input and streaming audio output for conversation.
https://arxiv.org/abs/2408.16725
MIT License
3.06k stars 273 forks

Can we change the base model for the LLM and TTS in batch processing? #63

Closed ankit-chotaliya closed 1 month ago

mini-omni commented 1 month ago

Hi, if you want to change the base model, you need to re-train the audio part.

SJenishaJ commented 1 month ago

Yes, I am ready to retrain. Can I get the training script, the overall documentation, and the dataset? This is my email: sbharathi16042001@gmail.com

mzeidhassan commented 1 month ago

Hi @mini-omni, can you please share documentation on how to retrain the audio part? Can I also train it on a language other than English?

mini-omni commented 1 month ago

> Hi @mini-omni, can you please share documentation on how to retrain the audio part? Can I also train it on a language other than English?

Hi, for the training process, you can refer to the tech report linked in the README. Thanks.

mini-omni commented 1 month ago

I'll close it for now, please feel free to re-open.