OpenMOSS / AnyGPT

Code for "AnyGPT: Unified Multimodal LLM with Discrete Sequence Modeling"

Please provide a TTS prompt example for chat. #27

Closed: sipie800 closed this issue 4 months ago

sipie800 commented 4 months ago

I tried TTS with this prompt format:

interleaved|Read this sentence aloud, this is input: Today is a sunny day.|speech

It just replies with the text "doesn't it"; no audio is generated.

Besides, in some other attempts it falls into "modality hallucinations": it draws an image or makes music, but never does TTS.

What is the right prompt? Or can TTS not be done in chat mode?

JunZhan2000 commented 4 months ago

This is because we did not include instructions such as "read a certain paragraph aloud" when doing SFT. Any2Any covers too many tasks for us to take care of all of them. You can have the model generate speech directly through voice dialogue, or use the base model directly for TTS.
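
For readers who want to reproduce the prompt quoted in this thread, here is a minimal sketch that builds and splits a three-field, pipe-separated prompt string. It is based only on the single example above (`interleaved|<instruction>|speech`). The helper names `build_chat_prompt` and `parse_chat_prompt`, and the field names `mode`, `instruction`, and `target_modality`, are hypothetical and are not part of the AnyGPT code. The real chat CLI may accept additional fields, so check the repository README before relying on this.

```python
def build_chat_prompt(instruction: str, target_modality: str,
                      mode: str = "interleaved") -> str:
    """Join the three fields seen in this thread with '|' separators.

    Assumption: the field layout is inferred from one example in the
    issue; it is not taken from the AnyGPT source.
    """
    for field in (mode, instruction, target_modality):
        if "|" in field:
            raise ValueError("fields must not contain the '|' separator")
    return "|".join((mode, instruction.strip(), target_modality))


def parse_chat_prompt(prompt: str) -> dict:
    """Split a three-field prompt back into its (hypothetical) named parts."""
    parts = prompt.split("|")
    if len(parts) != 3:
        raise ValueError(f"expected 3 fields, got {len(parts)}")
    mode, instruction, target_modality = parts
    return {
        "mode": mode,
        "instruction": instruction,
        "target_modality": target_modality,
    }


if __name__ == "__main__":
    prompt = build_chat_prompt(
        "Read this sentence aloud, this is input: Today is a sunny day.",
        "speech",
    )
    print(prompt)
    print(parse_chat_prompt(prompt)["target_modality"])
```

Note that a well-formed prompt is not enough on its own: as explained above, the chat model was not fine-tuned on "read this aloud" instructions, so this string can still produce a text reply instead of audio.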