PaddlePaddle / PaddleSpeech

Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.
https://paddlespeech.readthedocs.io
Apache License 2.0
11.07k stars 1.85k forks

[S2T] Installation reports no errors, but usage fails. I have been struggling with this for 3 days, no exaggeration #3577

Open linglin506 opened 11 months ago

linglin506 commented 11 months ago

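I installed paddle 2.5.0 and paddlespeech 1.4.1 in Docker with no errors during installation, and used the official sample file zh.wav. Running paddlespeech asr --lang zh --input ./zh.wav immediately fails with the error below. It happens no matter whether I reinstall or try anything else. The Docker container runs Ubuntu 20.04 with CUDA 11.7 and cuDNN 8.4.1.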

root@8ea5a783c6ce:/paddle/asr/test# paddlespeech asr --lang zh --input ./zh.wav
W1106 20:15:33.654063 5856 gpu_resources.cc:96] The GPU architecture in your current machine is Pascal, which is not compatible with Paddle installation with arch: 70 75 80 86 , it is recommended to install the corresponding wheel package according to the installation information on the official Paddle website.
W1106 20:15:33.654111 5856 gpu_resources.cc:119] Please NOTE: device: 0, GPU Compute Capability: 6.1, Driver API Version: 11.7, Runtime API Version: 11.7
W1106 20:15:33.660698 5856 gpu_resources.cc:149] device: 0, cuDNN Version: 8.4.
[2023-11-06 20:15:35,755] [ ERROR] - list index out of range
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/paddlespeech/cli/asr/infer.py", line 314, in infer
    result_transcripts = self.model.decode(
  File "/usr/local/lib/python3.8/dist-packages/decorator.py", line 232, in fun
    return caller(func, *(extras + args), *kw)
  File "/usr/local/lib/python3.8/dist-packages/paddle/fluid/dygraph/base.py", line 347, in _decorate_function
    return func(args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/paddlespeech/s2t/models/u2/u2.py", line 818, in decode
    hyp = self.attention_rescoring(
  File "/usr/local/lib/python3.8/dist-packages/paddlespeech/s2t/models/u2/u2.py", line 532, in attention_rescoring
    assert speech.shape[0] == speech_lengths.shape[0]
IndexError: list index out of range

yaleimeng commented 11 months ago

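Over the past year I have installed it at least ten times on different machines, including three to five times recently, never with Docker, and installation itself went fairly smoothly. The problem is that the dependency versions are complicated and the requirements are out of date, so running the code throws all kinds of errors that you have to resolve yourself by digging through issues. I suggest paddlespeech release a new version soon with the dependencies sorted out, rather than making trouble for users. Most beginners are not able to fix installation errors or the errors raised when running the examples.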

zxcd commented 11 months ago

For version dependencies, take a look at this: https://github.com/PaddlePaddle/PaddleSpeech/issues/3528

linglin506 commented 11 months ago

My problem is solved; it was indeed a dependency issue. My GPU is a P40, and the newest combination that works for it is paddle 2.4.2 with speech 1.4.0.
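The first warning in the original log already names both values involved: the wheel's build architectures (70 75 80 86) and the card's compute capability (6.1). Below is a minimal sketch for extracting and comparing them, assuming the message wording shown in this thread. The function name check_gpu_arch is hypothetical. The compatibility rule is a simplification: it only considers binary compatibility within the same major version and ignores PTX JIT.

```python
import re


def check_gpu_arch(log_text):
    """Parse a Paddle startup log and report whether the wheel covers the GPU.

    Returns (capability, wheel_archs, compatible), or None if either value
    is missing. Assumes the message format seen in this thread.
    """
    arch_match = re.search(r"with arch:\s*((?:\d+\s+)+)", log_text)
    cap_match = re.search(r"GPU Compute Capability:\s*(\d+)\.(\d+)", log_text)
    if not arch_match or not cap_match:
        return None

    wheel_archs = [int(a) for a in arch_match.group(1).split()]
    capability = int(cap_match.group(1)) * 10 + int(cap_match.group(2))

    # Simplified rule: code built for X.y runs on a device X.z with z >= y.
    compatible = any(
        arch // 10 == capability // 10 and arch <= capability
        for arch in wheel_archs
    )
    return capability, wheel_archs, compatible


if __name__ == "__main__":
    sample = (
        "is Pascal, which is not compatible with Paddle installation "
        "with arch: 70 75 80 86 , it is recommended to install ... "
        "device: 0, GPU Compute Capability: 6.1, Driver API Version: 11.7"
    )
    print(check_gpu_arch(sample))
```

For the P40 log above, the third element is False, which is consistent with the warning. Whether that mismatch is what triggers the IndexError is not confirmed in this thread.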

sunqinbo commented 10 months ago

Installed versions + troubleshooting

(speech_env) dongao@deepin207:/data/works/html/data/source$ pip list|grep paddle
paddle-bfloat             0.1.7
paddle2onnx               1.1.0
paddleaudio               1.1.0
paddlefsl                 1.1.0
paddlenlp                 2.5.2
paddlepaddle              2.5.2
paddlesde                 0.2.5
paddleslim                2.4.1
paddlespeech              1.4.1
paddlespeech-ctcdecoders  0.2.0
paddlespeech-feat         0.1.0

Troubleshooting:

1 AttributeError: module 'numpy' has no attribute 'complex'. np.complex was a deprecated alias for the builtin complex. To avoid this error in existing code, use complex by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use np.complex128 here. The aliases was originally deprecated in NumPy 1.20;

This happens because the paddlespeech dependencies require librosa 0.8.1, and that librosa version uses an API from before NumPy 1.20.

pip install librosa==0.10.1

2 Error message: ModuleNotFoundError: No module named 'paddle.nn.layer.layers'
Error message: IndexError: list index out of range

This was related to paddlepaddle 2.6.1. Downgrading paddlenlp to 2.5.2 resolved the problem.

pip install paddlenlp==2.5.2
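To compare a local environment with the versions reported above, something like the following sketch may help. It uses only the standard library (importlib.metadata, Python 3.8+). The REPORTED table is copied from this comment and is not an official pin list. The function names are hypothetical, and GPU installs may use a different distribution name such as paddlepaddle-gpu.

```python
from importlib import metadata

# Versions reported as working in this thread (not an official pin list).
REPORTED = {
    "paddlepaddle": "2.5.2",
    "paddlenlp": "2.5.2",
    "paddlespeech": "1.4.1",
    "librosa": "0.10.1",
}


def installed_versions(names):
    """Return {name: version or None} for each distribution name."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None  # package is not installed
    return found


def diff_versions(installed, expected=REPORTED):
    """Return {name: (installed, expected)} for every mismatch."""
    return {
        name: (installed.get(name), want)
        for name, want in expected.items()
        if installed.get(name) != want
    }


if __name__ == "__main__":
    mismatches = diff_versions(installed_versions(REPORTED))
    for name, (have, want) in mismatches.items():
        print(f"{name}: installed {have}, thread reports {want}")
    if not mismatches:
        print("All listed packages match the versions reported in the thread.")
```

An empty result only means the environment matches this one report. It does not guarantee that the examples will run on other hardware.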

Running an example:

paddlespeech text --task punc -v --input "你知道吗大部分时候你都在假装学习你在年初提出了一整面墙的todo list到了九月还没打上几个钩你的书架上陈列着全网最畅销的专业书籍"

[2023-12-20 17:56:18,543] [ INFO] - loading configuration file /home/dongao/.paddlespeech/models/ernie_linear_p7_wudao-punc-zh/1.0/ernie_linear_p7_wudao-punc-zh.tar/ckpt/config.json
[2023-12-20 17:56:18,545] [ INFO] - Model config ErnieConfig {
  "architectures": [ "ErnieForTokenClassification" ],
  "attention_probs_dropout_prob": 0.1,
  "enable_recompute": false,
  "fuse": false,
  "hidden_act": "relu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "id2label": { "0": "LABEL_0", "1": "LABEL_1", "2": "LABEL_2", "3": "LABEL_3", "4": "LABEL_4", "5": "LABEL_5", "6": "LABEL_6", "7": "LABEL_7" },
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "label2id": { "LABEL_0": 0, "LABEL_1": 1, "LABEL_2": 2, "LABEL_3": 3, "LABEL_4": 4, "LABEL_5": 5, "LABEL_6": 6, "LABEL_7": 7 },
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 513,
  "model_type": "ernie",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pad_token_id": 0,
  "paddlenlp_version": null,
  "pool_act": "tanh",
  "task_id": 0,
  "task_type_vocab_size": 3,
  "type_vocab_size": 2,
  "use_task_id": false,
  "vocab_size": 18000
}

[2023-12-20 17:56:33,740] [ INFO] - All model checkpoint weights were used when initializing ErnieForTokenClassification.

[2023-12-20 17:56:33,740] [ INFO] - All the weights of ErnieForTokenClassification were initialized from the model checkpoint at /home/dongao/.paddlespeech/models/ernie_linear_p7_wudao-punc-zh/1.0/ernie_linear_p7_wudao-punc-zh.tar/ckpt. If your task is similar to the task the model of the checkpoint was trained on, you can already use ErnieForTokenClassification for predictions without further training.
[2023-12-20 17:56:33,761] [ INFO] - Already cached /home/dongao/.paddlenlp/models/ernie-1.0/vocab.txt
[2023-12-20 17:56:33,776] [ INFO] - tokenizer config file saved in /home/dongao/.paddlenlp/models/ernie-1.0/tokenizer_config.json
[2023-12-20 17:56:33,777] [ INFO] - Special tokens file saved in /home/dongao/.paddlenlp/models/ernie-1.0/special_tokens_map.json
你知道吗?大部分时候,你都在假装学习。你在年初提出了一整面墙的todolist,到了九月,还没打上几个钩。你的书架上,陈列着全网最畅销的专业书籍。

yaleimeng commented 10 months ago

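@sunqinbo Thanks for sharing. That said, in a normally maintained codebase, pip-installing the latest release should most likely run the examples out of the box at any point in time. From that perspective, this codebase has gone unmaintained for a long time and is more or less abandoned.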

linglin506 commented 9 months ago

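Thank you all for your help; the problem has been resolved. I hope PaddlePaddle keeps getting better, with more complete code and documentation, and does not end up as unreliable as Baidu Search.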

sinopec commented 7 months ago

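Still all kinds of dependency problems. By rights, installing the latest stable version should not cause dependency problems. I hope this gets fixed soon, or at the very least that a requirement.txt file is provided stating exactly which version of each library is required.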

klzhong69 commented 7 months ago

I'm giving up!!! Official environment, official code, two days of trying, and it still won't run.