MERR audio prompt - GitHub Issues

Sorry, our server was hit by cryptomining malware some time ago, and I had not backed up the scripts related to Qwen-Audio. I can only describe from memory the prompts we used to extract audio descriptions.

Initially, we used a relatively simple prompt: "You are a voice emotion expert. Please analyze the input audio and tell me the tone or pitch of the speaker in the audio."

However, this prompt produced overly simplistic outputs: Qwen-Audio tended to answer with just 'positive' or 'negative.' After several attempts, we settled on a prompt that constrains the answer to a fixed list of tones: "You are a voice emotion expert. Please analyze the input audio and determine the tone of the speaker in the video from the following options: [joyful, sad, shocked, fearful, angry, positive, negative, calm, doubtful, dismissive]."

Therefore, I suggest you try different prompts to generate the best descriptions.
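If it helps with trying out variants, here is a minimal sketch of how the option-list prompt could be assembled and how a free-text answer could be mapped back to one of the options. This is not the lost script: the names `TONE_OPTIONS`, `build_prompt`, and `parse_tone` are hypothetical, and the call to Qwen-Audio itself is deliberately left out, since the original inference code is not available.

```python
import re
from typing import Optional, Sequence

# Tone options as listed in the prompt above.
TONE_OPTIONS = [
    "joyful", "sad", "shocked", "fearful", "angry",
    "positive", "negative", "calm", "doubtful", "dismissive",
]


def build_prompt(options: Sequence[str] = TONE_OPTIONS) -> str:
    """Assemble the option-list prompt (hypothetical helper)."""
    return (
        "You are a voice emotion expert. Please analyze the input audio and "
        "determine the tone of the speaker in the video from the following "
        f"options: [{', '.join(options)}]."
    )


def parse_tone(response: str, options: Sequence[str] = TONE_OPTIONS) -> Optional[str]:
    """Return the option mentioned earliest in the model's answer, or None.

    Assumption: the model replies in free text that contains one of the
    option words; matching is case-insensitive and on whole words only.
    """
    text = response.lower()
    best_pos, best_opt = None, None
    for opt in options:
        match = re.search(rf"\b{re.escape(opt)}\b", text)
        if match and (best_pos is None or match.start() < best_pos):
            best_pos, best_opt = match.start(), opt
    return best_opt


if __name__ == "__main__":
    print(build_prompt())
    print(parse_tone("The speaker sounds angry and a little sad."))
```

Swapping in a different option list via `build_prompt(["calm", "angry"])` makes it easy to compare how the wording and the set of choices affect the descriptions you get back.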

ZebangCheng / Emotion-LLaMA

MERR audio prompt #17