0nutation / SpeechGPT

SpeechGPT Series: Speech Large Language Models
https://0nutation.github.io/SpeechGPT.github.io/
Apache License 2.0

Some questions about WER and other evaluation metrics #12

Closed ZhikangNiu closed 9 months ago

ZhikangNiu commented 9 months ago

Thanks for your amazing work. I would like to know whether you have evaluated the ASR task with metrics such as WER, or measured any other quantifiable metrics. When I use the ASR task, I also see some erroneous outputs, for example:

  1. The output repeats the same word.
  2. The model asks and answers itself. For example, the audio is "what's your favorite color?" and my prompt is your ASR prompt, but the model's output is "blue".
  3. Special tokens appear in the output, such as [ua] and [ta]. I want to know what is causing this.
  4. No response at all.

Looking forward to your reply.

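
For reference, WER is usually computed as the word-level edit distance (substitutions, deletions, insertions) between the reference transcript and the model output, divided by the number of reference words. Below is a minimal sketch of that calculation. The function name `word_error_rate` and the normalization (lowercasing, whitespace tokenization) are assumptions made for illustration, not something used by the SpeechGPT authors.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    # Assumed normalization: lowercase and split on whitespace only.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution or exact match
            )
        prev = curr
    return prev[-1] / len(ref)
```

With this definition, a model that answers "blue" instead of transcribing a five-word question scores a WER of 1.0, and an empty output also scores 1.0, so the failure cases listed above would all show up directly in the metric.
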
0nutation commented 9 months ago

We haven't tested SpeechGPT's ASR performance on a standard dataset. The 7B model's performance is still imperfect, and its robustness on tasks like ASR is not satisfactory. As for the erroneous outputs: due to limitations in the training data, the model may misidentify the task, for example mistaking an ASR task for a spoken dialogue task. The special tokens [ua] and [ta] stand for 'unit answer' and 'text answer' respectively. This design is part of our chain-of-modality approach; see the paper and the examples in the SpeechInstruct chain-of-modality dataset for more details.
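
To make the role of these markers concrete, here is a rough sketch of how an output string containing [ta] and [ua] could be split into its parts. The exact output layout (separators, ordering) is an assumption here, and `split_by_marker` is a hypothetical helper, not part of the SpeechGPT code.

```python
import re

# Marker spelling taken from this thread: [ta] = text answer, [ua] = unit answer.
MARKER = re.compile(r"\[(ta|ua)\]")


def split_by_marker(output: str) -> dict:
    """Split a model output into the text before the first marker
    ('prefix') and the text following each marker ('ta', 'ua')."""
    # re.split with one capture group returns
    # [prefix, marker, text, marker, text, ...].
    parts = MARKER.split(output)
    # Assumption: segments may be padded with spaces or semicolons.
    segments = {"prefix": parts[0].strip(" ;")}
    for name, text in zip(parts[1::2], parts[2::2]):
        segments[name] = text.strip(" ;")
    return segments
```

For a made-up output such as `"what is your favorite color; [ta] blue; [ua] UNITS"`, this returns the transcript-like prefix, the text answer, and the unit answer as separate entries, which makes it easier to see when the model has treated an ASR request as a dialogue turn.
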

ZhikangNiu commented 9 months ago

Thanks for your answer. SpeechGPT is promising work, and I hope you can release the 13B model as soon as possible. I have also found that the model sometimes gives no response (I mean the response is empty), and this is not rare. Besides, when we use the model for ASR, we don't want special tokens to appear in the output. How can I avoid these cases?
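
Until the cause is addressed in the model itself, one option is to flag these cases after generation so they can be filtered or retried. The sketch below is only a post-processing heuristic; the function name `check_transcript`, the issue labels, and the repetition threshold are all assumptions chosen for illustration.

```python
import re

# Special tokens mentioned in this thread.
SPECIAL = re.compile(r"\[(?:ta|ua)\]")


def check_transcript(text: str, max_repeat: int = 3) -> list:
    """Return a list of issue labels found in a model output."""
    issues = []
    # Ignore special tokens when deciding whether any real words remain.
    words = SPECIAL.sub(" ", text).split()
    if not words:
        issues.append("empty")
    if SPECIAL.search(text):
        issues.append("special_token")
    # Flag a run of more than max_repeat identical consecutive words.
    run = 1
    for prev, word in zip(words, words[1:]):
        run = run + 1 if word == prev else 1
        if run > max_repeat:
            issues.append("repetition")
            break
    return issues
```

This catches empty outputs, leftover special tokens, and repeated words, but it cannot tell whether the model answered the question instead of transcribing it; that case only shows up when the output is compared against a reference transcript.
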

0nutation commented 9 months ago

If you want to use SpeechGPT for ASR or TTS tasks, we recommend using only SpeechGPT-7B-CM, without the SpeechGPT-7B-com adapter.

ZhikangNiu commented 9 months ago

> If you want to use SpeechGPT for ASR or TTS tasks, we recommend using only SpeechGPT-7B-CM, without the SpeechGPT-7B-com adapter.

Thanks for your answer. I will test this and share the results with you.