PaddlePaddle / PaddleSpeech

Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.
https://paddlespeech.readthedocs.io
Apache License 2.0

paddlespeech - list index out of range #3411

Open zhanghzong opened 1 year ago

zhanghzong commented 1 year ago

Environment

  1. Windows 11
  2. conda Python 3.10.11
  3. pip 23.1.2
  4. paddlepaddle==2.5.0
  5. paddlespeech==1.4.1
  6. paddleaudio==1.0.1
  7. Visual Studio 2022

Usage

Reference source: https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/demos/speech_recognition/README_cn.md

Code

import paddle
from paddlespeech.cli.asr import ASRExecutor

asr_executor = ASRExecutor()
text = asr_executor(
    model='conformer_wenetspeech',
    lang='zh',
    sample_rate=16000,
    config=None,  # Set `config` and `ckpt_path` to None to use pretrained model.
    ckpt_path=None,
    audio_file='./zh.wav',
    force_yes=False,
    device=paddle.get_device())
print('ASR Result: \n{}'.format(text))

Error details

2023-07-21 11:09:44.658 | INFO     | paddlespeech.s2t.modules.ctc:<module>:45 - paddlespeech_ctcdecoders not installed!
2023-07-21 11:09:44.734 | INFO     | paddlespeech.s2t.modules.embedding:__init__:150 - max len: 5000
[2023-07-21 11:09:48,553] [   ERROR] - list index out of range
Traceback (most recent call last):
  File "F:\Program\anaconda3\envs\paddle.1\lib\site-packages\paddlespeech\cli\asr\infer.py", line 314, in infer
    result_transcripts = self.model.decode(
  File "F:\Program\anaconda3\envs\paddle.1\lib\site-packages\decorator.py", line 232, in fun
    return caller(func, *(extras + args), **kw)
  File "F:\Program\anaconda3\envs\paddle.1\lib\site-packages\paddle\fluid\dygraph\base.py", line 347, in _decorate_function
    return func(*args, **kwargs)
  File "F:\Program\anaconda3\envs\paddle.1\lib\site-packages\paddlespeech\s2t\models\u2\u2.py", line 818, in decode
    hyp = self.attention_rescoring(
  File "F:\Program\anaconda3\envs\paddle.1\lib\site-packages\paddlespeech\s2t\models\u2\u2.py", line 532, in attention_rescoring
    assert speech.shape[0] == speech_lengths.shape[0]
IndexError: list index out of range
Traceback (most recent call last):

Source code where the error is raised

    def attention_rescoring(self,
                            speech: paddle.Tensor,
                            speech_lengths: paddle.Tensor,
                            beam_size: int,
                            decoding_chunk_size: int=-1,
                            num_decoding_left_chunks: int=-1,
                            ctc_weight: float=0.0,
                            simulate_streaming: bool=False,
                            reverse_weight: float=0.0) -> List[int]:

        assert speech.shape[0] == speech_lengths.shape[0]
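
A note on reading this traceback: in Paddle, `Tensor.shape` is a Python list, so "list index out of range" on `shape[0]` means one of the two shapes in the assert was an empty list, i.e. a 0-D (scalar) tensor. The thread does not say which of `speech` or `speech_lengths` it was. The sketch below uses plain lists in place of tensors to show the failure; the shape values and the `first_dim` helper are made up for illustration and are not part of PaddleSpeech.

```python
from typing import List, Optional


def first_dim(shape: List[int]) -> Optional[int]:
    """Return the leading dimension, or None for a 0-D (scalar) shape."""
    return shape[0] if shape else None


# Stand-ins for speech.shape and speech_lengths.shape (illustrative values).
speech_shape: List[int] = [1, 498, 80]
lengths_shape: List[int] = []  # a 0-D tensor reports an empty shape list

try:
    lengths_shape[0]  # same indexing as in the failing assert
except IndexError as exc:
    print("reproduced:", exc)

# A guarded lookup makes the scalar case visible instead of crashing.
print(first_dim(speech_shape), first_dim(lengths_shape))
```

Checking the shapes this way before the assert would at least show which input arrived as a scalar.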

Chuyaoyuan commented 1 year ago

Try building the latest develop branch of paddlespeech directly.

zhanghzong commented 1 year ago

Try building the latest develop branch of paddlespeech directly.

I'll give it a try.

zhanghzong commented 1 year ago

A follow-up: building the develop branch resolves the problem above.

Thanks to @Chuyaoyuan for the tip.
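
For anyone comparing setups, here is a small sketch for checking which builds are actually installed in the active environment. It uses only the standard library; the distribution names come from the environment list at the top of this issue, and `installed_versions` is a hypothetical helper name. A source build of develop may report an unusual version string (a later traceback in this thread shows `paddlespeech==0.0.0`).

```python
from importlib import metadata
from typing import Dict, Iterable, Optional


def installed_versions(names: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each distribution name to its installed version, or None if absent."""
    versions: Dict[str, Optional[str]] = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    packages = ["paddlepaddle", "paddlespeech", "paddleaudio"]
    for name, version in installed_versions(packages).items():
        print(f"{name}: {version or 'not installed'}")
```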

zhanghzong commented 1 year ago

New problem

The develop branch has the following problem:

Traceback (most recent call last):
  File "F:\Program\anaconda3\envs\paddle.1\Scripts\paddlespeech-script.py", line 33, in <module>
    sys.exit(load_entry_point('paddlespeech==0.0.0', 'console_scripts', 'paddlespeech')())
  File "F:\Program\anaconda3\envs\paddle.1\lib\site-packages\paddlespeech\cli\entry.py", line 40, in _execute
    exec("from {} import {}".format(module, cls))
  File "<string>", line 1, in <module>
  File "F:\Program\anaconda3\envs\paddle.1\lib\site-packages\paddlespeech\cli\tts\__init__.py", line 14, in <module>
    from .infer import TTSExecutor
  File "F:\Program\anaconda3\envs\paddle.1\lib\site-packages\paddlespeech\cli\tts\infer.py", line 33, in <module>
    from paddlespeech.t2s.exps.syn_utils import get_am_inference
  File "F:\Program\anaconda3\envs\paddle.1\lib\site-packages\paddlespeech\t2s\exps\syn_utils.py", line 36, in <module>
    from paddlespeech.t2s.frontend.canton_frontend import CantonFrontend
  File "F:\Program\anaconda3\envs\paddle.1\lib\site-packages\paddlespeech\t2s\frontend\canton_frontend.py", line 19, in <module>
    import ToJyutping
  File "f:\program\anaconda3\envs\paddle.1\lib\site-packages\ToJyutping\__init__.py", line 3, in <module>
    from .ToJyutping import *
  File "f:\program\anaconda3\envs\paddle.1\lib\site-packages\ToJyutping\ToJyutping.py", line 9, in <module>
    for line in f:
UnicodeDecodeError: 'gbk' codec can't decode byte 0x96 in position 2: illegal multibyte sequence

Speech synthesis (TTS) does not work.

zxcd commented 1 year ago

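This is a file encoding problem. Check whether your input contains illegal characters, or whether the system default encoding is the cause.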
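
To illustrate the default-encoding case: the traceback shows a text file being read with the `gbk` codec, which is what `open()` falls back to on a Chinese-locale Windows system when no `encoding` is given. The sketch below reproduces the same kind of failure with a made-up UTF-8 sample line (not the real ToJyutping data file) and shows that reading it as UTF-8 succeeds. Forcing `encoding="gbk"` here is an assumption used to simulate that Windows default on any machine.

```python
import os
import tempfile

# Hypothetical sample line; "世" is E4 B8 96 in UTF-8.
sample = "世\tsai3\n"

with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as f:
    f.write(sample.encode("utf-8"))
    path = f.name

try:
    # Simulate the Chinese-locale Windows default by forcing GBK.
    try:
        with open(path, encoding="gbk") as f:
            f.read()
        gbk_error = None
    except UnicodeDecodeError as exc:
        gbk_error = exc

    # Reading with the encoding the file was written in works.
    with open(path, encoding="utf-8") as f:
        utf8_text = f.read()
finally:
    os.remove(path)

print("gbk:", gbk_error)
print("utf-8:", repr(utf8_text))
```

Since the failing `open()` call is inside a third-party package, one possible workaround is to enable Python's UTF-8 mode (set the environment variable `PYTHONUTF8=1`, or run with `python -X utf8`) so that `open()` defaults to UTF-8. Whether that is enough for this particular setup is not confirmed in this thread.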

stale[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.