PaddlePaddle / PaddleSpeech

Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.
https://paddlespeech.readthedocs.io
Apache License 2.0

whisper_executor TypeError: too many positional arguments #3764

Open yanming499 opened 4 months ago

yanming499 commented 4 months ago

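I have tried many versions of paddlespeech and of paddlepaddle (both GPU and CPU builds), and the following problem occurs every time. Any pointers would be appreciated.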

Versions of the Paddle-related packages:

paddle-bfloat 0.1.7
paddle2onnx 1.0.6
paddleaudio 1.1.0
paddlefsl 1.1.0
paddlenlp 2.6.0
paddleocr 2.7.3
paddlepaddle-gpu 2.6.1.post120
paddlesde 0.2.5
paddleslim 2.3.4
paddlespeech 1.4.0
paddlespeech-feat 0.1.0

Code:

import paddle
from paddlespeech.cli.whisper import WhisperExecutor

whisper_executor = WhisperExecutor()
text = whisper_executor(
    model='whisper',
    task='transcribe',
    sample_rate=16000,
    config=None,  # Set config and ckpt_path to None to use pretrained model.
    ckpt_path=None,
    audio_file='./zh.wav',
    device=paddle.get_device())
print('ASR Result: \n{}'.format(text))

Output (the normal part):

C:\Users\willi\miniconda3\envs\py38\lib\site-packages\_distutils_hack\__init__.py:33: UserWarning: Setuptools is replacing distutils.
  warnings.warn("Setuptools is replacing distutils.")
W0515 16:36:08.184197 7104 gpu_resources.cc:119] Please NOTE: device: 0, GPU Compute Capability: 8.9, Driver API Version: 12.3, Runtime API Version: 12.0
W0515 16:36:08.191708 7104 gpu_resources.cc:164] device: 0, cuDNN Version: 8.9.
Detecting language using up to the first 30 seconds. Use --language to specify the language
[2024-05-15 16:36:12,921] [ INFO] - Assigning ['<|startoftranscript|>', '<|en|>', '<|zh|>', '<|de|>', '<|es|>', '<|ru|>', '<|ko|>', '<|fr|>', '<|ja|>', '<|pt|>', '<|tr|>', '<|pl|>', '<|ca|>', '<|nl|>', '<|ar|>', '<|sv|>', '<|it|>', '<|id|>', '<|hi|>', '<|fi|>', '<|vi|>', '<|iw|>', '<|uk|>', '<|el|>', '<|ms|>', '<|cs|>', '<|ro|>', '<|da|>', '<|hu|>', '<|ta|>', '<|no|>', '<|th|>', '<|ur|>', '<|hr|>', '<|bg|>', '<|lt|>', '<|la|>', '<|mi|>', '<|ml|>', '<|cy|>', '<|sk|>', '<|te|>', '<|fa|>', '<|lv|>', '<|bn|>', '<|sr|>', '<|az|>', '<|sl|>', '<|kn|>', '<|et|>', '<|mk|>', '<|br|>', '<|eu|>', '<|is|>', '<|hy|>', '<|ne|>', '<|mn|>', '<|bs|>', '<|kk|>', '<|sq|>', '<|sw|>', '<|gl|>', '<|mr|>', '<|pa|>', '<|si|>', '<|km|>', '<|sn|>', '<|yo|>', '<|so|>', '<|af|>', '<|oc|>', '<|ka|>', '<|be|>', '<|tg|>', '<|sd|>', '<|gu|>', '<|am|>', '<|yi|>', '<|lo|>', '<|uz|>', '<|fo|>', '<|ht|>', '<|ps|>', '<|tk|>', '<|nn|>', '<|mt|>', '<|sa|>', '<|lb|>', '<|my|>', '<|bo|>', '<|tl|>', '<|mg|>', '<|as|>', '<|tt|>', '<|haw|>', '<|ln|>', '<|ha|>', '<|ba|>', '<|jw|>', '<|su|>', '<|translate|>', '<|transcribe|>', '<|startoflm|>', '<|startofprev|>', '<|nospeech|>', '<|notimestamps|>'] to the additional_special_tokens key of the tokenizer
[2024-05-15 16:36:12,921] [ INFO] - Adding <|startoftranscript|> to the vocabulary
[2024-05-15 16:36:12,931] [ INFO] - Adding <|en|> to the vocabulary
[2024-05-15 16:36:12,931] [ INFO] - Adding <|zh|> to the vocabulary
[2024-05-15 16:36:12,931] [ INFO] - Adding <|de|> to the vocabulary
[2024-05-15 16:36:12,931] [ INFO] - Adding <|es|> to the vocabulary
[2024-05-15 16:36:12,931] [ INFO] - Adding <|ru|> to the vocabulary
.... ....

Output (where the problem starts):

Traceback (most recent call last):
  File "c:/Users/willi/Desktop/speechtest/4.py", line 7, in <module>
    text = whisper_executor(
  File "C:\Users\willi\miniconda3\envs\py38\lib\site-packages\paddlespeech\cli\utils.py", line 328, in _warpper
    return executor_func(self, *args, **kwargs)
  File "C:\Users\willi\miniconda3\envs\py38\lib\site-packages\paddlespeech\cli\whisper\infer.py", line 477, in __call__
    self.infer(model)
  File "C:\Users\willi\miniconda3\envs\py38\lib\site-packages\decorator.py", line 232, in fun
    return caller(func, *(extras + args), **kw)
  File "C:\Users\willi\miniconda3\envs\py38\lib\site-packages\paddle\base\dygraph\base.py", line 352, in _decorate_function
    return func(*args, **kwargs)
  File "C:\Users\willi\miniconda3\envs\py38\lib\site-packages\paddlespeech\cli\whisper\infer.py", line 279, in infer
    self.outputs["result"] = self.model.transcribe(
  File "C:\Users\willi\miniconda3\envs\py38\lib\site-packages\paddlespeech\s2t\models\whisper\whipser.py", line 488, in transcribe
    _, probs = model.detect_language(segment, resource_path)
  File "C:\Users\willi\miniconda3\envs\py38\lib\site-packages\decorator.py", line 232, in fun
    return caller(func, *(extras + args), **kw)
  File "C:\Users\willi\miniconda3\envs\py38\lib\site-packages\paddle\base\dygraph\base.py", line 352, in _decorate_function
    return func(*args, **kwargs)
  File "C:\Users\willi\miniconda3\envs\py38\lib\site-packages\paddlespeech\s2t\models\whisper\whipser.py", line 392, in detect_language
    mel = model.encoder(mel)
  File "C:\Users\willi\miniconda3\envs\py38\lib\site-packages\paddle\nn\layer\layers.py", line 1429, in __call__
    return self.forward(*inputs, **kwargs)
  File "C:\Users\willi\miniconda3\envs\py38\lib\site-packages\paddlespeech\s2t\models\whisper\whipser.py", line 207, in forward
    x = block(x)
  File "C:\Users\willi\miniconda3\envs\py38\lib\site-packages\paddle\nn\layer\layers.py", line 1429, in __call__
    return self.forward(*inputs, **kwargs)
  File "C:\Users\willi\miniconda3\envs\py38\lib\site-packages\paddlespeech\s2t\models\whisper\whipser.py", line 148, in forward
    x = x + self.attn(self.attn_ln(x), mask=mask, kv_cache=kv_cache)
  File "C:\Users\willi\miniconda3\envs\py38\lib\site-packages\paddle\nn\layer\layers.py", line 1429, in __call__
    return self.forward(*inputs, **kwargs)
  File "C:\Users\willi\miniconda3\envs\py38\lib\site-packages\paddlespeech\s2t\models\whisper\whipser.py", line 100, in forward
    wv = self.qkv_attention(q, k, v, mask)
  File "C:\Users\willi\miniconda3\envs\py38\lib\site-packages\paddlespeech\s2t\models\whisper\whipser.py", line 111, in qkv_attention
    q.view(*q.shape[:2], self.n_head, -1), (0, 2, 1, 3)) * scale
  File "C:\Users\willi\miniconda3\envs\py38\lib\site-packages\decorator.py", line 231, in fun
    args, kw = fix(args, kw, sig)
  File "C:\Users\willi\miniconda3\envs\py38\lib\site-packages\decorator.py", line 203, in fix
    ba = sig.bind(*args, **kwargs)
  File "C:\Users\willi\miniconda3\envs\py38\lib\inspect.py", line 3037, in bind
    return self._bind(args, kwargs)
  File "C:\Users\willi\miniconda3\envs\py38\lib\inspect.py", line 2958, in _bind
    raise TypeError('too many positional arguments') from None
TypeError: too many positional arguments
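
The last frames of the traceback show the error being raised by `inspect.Signature.bind` while a `q.view(...)` call is dispatched through `decorator.py`. The sketch below reproduces that kind of failure using only the standard library. It is an illustration, not a confirmed diagnosis: `view` here is a hypothetical stand-in, and its parameter list (one shape argument plus an optional name) is an assumption, not Paddle's actual API. The shape values are made up as well.

```python
import inspect


def view(x, shape_or_dtype, name=None):
    """Hypothetical stand-in for a decorated method that takes the shape as ONE argument."""
    return (x, shape_or_dtype)


# decorator.py binds the call arguments against the wrapped function's signature
# before calling it; this mirrors that step.
sig = inspect.signature(view)


def try_bind(*args, **kwargs):
    """Return 'ok' if the arguments fit the signature, else the TypeError message."""
    try:
        sig.bind(*args, **kwargs)
        return "ok"
    except TypeError as exc:
        return str(exc)


shape = [1, 1500]  # illustrative leading dimensions (assumed values)
n_head = 6         # illustrative head count (assumed value)

# Shape unpacked into separate positional arguments: five arguments, three parameters.
unpacked = try_bind("q", *shape, n_head, -1)

# Shape passed as a single list: two arguments, which fits the signature.
as_list = try_bind("q", [*shape, n_head, -1])

print(unpacked)
print(as_list)
```

If the installed `view` does take the shape as a single argument, calls that unpack the shape into separate positional arguments would fail in this way, while passing one list would bind cleanly. Whether that is what happens in this environment needs to be checked against the installed Paddle version.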

Ray961123 commented 4 months ago

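Hello, and thank you for your interest in the PaddleSpeech open-source project. We are sorry for the poor development experience. Maintainer capacity for the project is currently limited, so we suggest referring to: https://github.com/PaddlePaddle/PaddleSpeech/issues/3560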