PaddlePaddle / PaddleSpeech

Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.
https://paddlespeech.readthedocs.io
Apache License 2.0

[S2T] PaddleSpeech-Server RESTful API does not recognize pcm format, and the punc parameter has no effect #3631

Open mzgcz opened 10 months ago

mzgcz commented 10 months ago

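Describe the bug: The PaddleSpeech-Server documentation says the speech recognition service supports both pcm and wav formats, but sending a pcm file produces the following error: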

raise LibsndfileError(err, prefix="Error opening {0!r}: ".format(self.name))
soundfile.LibsndfileError: Error opening <_io.BytesIO object at 0x7f9bc43d1b30>: Format not recognised.
[2023-12-01 15:54:05,375] [ ERROR] - can not open the audio file, please check the audio file(<_io.BytesIO object at 0x7f9bc43d1b30>) format is 'wav'.
you can try to use sox to change the file format.
For example:
sample rate: 16k
sox input_audio.xx --rate 16k --bits 16 --channels 1 output_audio.wav
sample rate: 8k
sox input_audio.xx --rate 8k --bits 16 --channels 1 output_audio.wav

[2023-12-01 15:54:05,375] [ ERROR] - file check failed!
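The "Format not recognised" message is consistent with libsndfile being handed headerless bytes: raw pcm carries no RIFF/WAVE header that would identify it as audio. A minimal sketch of that distinction, using only the Python standard library (the helper name `looks_like_wav` is hypothetical and is not part of PaddleSpeech):

```python
import io
import wave


def looks_like_wav(data: bytes) -> bool:
    """Return True if the bytes start with a RIFF/WAVE container header."""
    return len(data) >= 12 and data[:4] == b"RIFF" and data[8:12] == b"WAVE"


# One second of 16-bit mono silence at 16 kHz, as headerless pcm.
raw_pcm = b"\x00\x00" * 16000

# The same samples wrapped in a WAV container.
buf = io.BytesIO()
with wave.open(buf, "wb") as wf:
    wf.setnchannels(1)
    wf.setsampwidth(2)
    wf.setframerate(16000)
    wf.writeframes(raw_pcm)
wav_bytes = buf.getvalue()

print(looks_like_wav(raw_pcm))    # False: no container header
print(looks_like_wav(wav_bytes))  # True: RIFF/WAVE header present
```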

zxcd commented 9 months ago

Pass the audio_format field: { "audio": "exSI6ICJlbiIsCgkgICAgInBvc2l0aW9uIjogImZhbHNlIgoJf...", "audio_format": "pcm", "sample_rate": 16000, "lang": "zh_cn", "punc": 0 }

beixiang-l commented 9 months ago

Passing the punc parameter has no effect for either wav or pcm input; no punctuation is added.

mzgcz commented 9 months ago

Pass the audio_format field: { "audio": "exSI6ICJlbiIsCgkgICAgInBvc2l0aW9uIjogImZhbHNlIgoJf...", "audio_format": "pcm", "sample_rate": 16000, "lang": "zh_cn", "punc": 0 }

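Let me confirm: when "audio_format" is "pcm", the audio field should carry a raw pcm payload, right? I ask because a wav payload works without any problem. From reading the code, it also appears that only wav payloads are handled; there is no handling for raw pcm payloads.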
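Since wav payloads are reported to work, one possible client-side workaround is to wrap the raw pcm samples in a WAV container before base64-encoding them. The following is only a sketch under stated assumptions: the helper `pcm_to_wav_bytes` is hypothetical, and it assumes 16-bit mono pcm at 16 kHz, matching the sample_rate in the request above. It does not change anything on the server side.

```python
import base64
import io
import wave


def pcm_to_wav_bytes(pcm: bytes, sample_rate: int = 16000,
                     channels: int = 1, sample_width: int = 2) -> bytes:
    """Wrap headerless pcm samples in a WAV container (hypothetical helper)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wf:
        wf.setnchannels(channels)        # assumed mono
        wf.setsampwidth(sample_width)    # assumed 16-bit samples (2 bytes)
        wf.setframerate(sample_rate)     # assumed 16 kHz
        wf.writeframes(pcm)
    return buf.getvalue()


# Placeholder input: one second of silence instead of a real recording.
raw_pcm = b"\x00\x00" * 16000
wav_bytes = pcm_to_wav_bytes(raw_pcm)

# Value for the "audio" field, to be sent with "audio_format": "wav".
audio_b64 = base64.b64encode(wav_bytes).decode("ascii")
```

If the pcm data uses a different sample rate, bit depth, or channel count, the arguments have to match it, because the container header is the only place that information is recorded.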

warkcod commented 4 months ago

I tried it, and the punc parameter is just as ineffective with wav... data = { "audio": base64_string, "audio_format": "wav", "sample_rate": 32000, "lang": "zh_cn", "punc": True }
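For anyone reproducing this, the request body discussed in this thread can be assembled as below. This is a sketch that only builds the JSON; the helper name `build_asr_payload` is hypothetical, the field names are taken from the comments above, and whether the server expects punc as a boolean or an integer is not settled in this thread (both 0 and True appear).

```python
import base64
import json


def build_asr_payload(audio_bytes: bytes, audio_format: str = "wav",
                      sample_rate: int = 16000, lang: str = "zh_cn",
                      punc=True) -> dict:
    """Assemble the request fields quoted in this thread (hypothetical helper)."""
    return {
        "audio": base64.b64encode(audio_bytes).decode("ascii"),
        "audio_format": audio_format,
        "sample_rate": sample_rate,
        "lang": lang,
        "punc": punc,  # type expected by the server is an open question here
    }


# Placeholder audio bytes; a real request would read them from a file.
payload = build_asr_payload(b"\x00\x00" * 4, audio_format="pcm", punc=False)
body = json.dumps(payload)
print(body)
```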