Hi 夜雨, I ran into the following errors while running this project:
Problem description:
When I choose mfcc to process the audio, I get this error:
Traceback (most recent call last):
  File "infer_path.py", line 35, in <module>
    predictor = Predictor(model_path=args.model_path, vocab_path=args.vocab_path, use_model=args.use_model,
  File "C:\Users\Administrator\PycharmProjects\masr\MASR\masr\predict.py", line 101, in __init__
    self.predict(warmup_audio_path, to_an=False)
  File "C:\Users\Administrator\PycharmProjects\masr\MASR\masr\predict.py", line 173, in predict
    output_data, _, _ = self.predictor(audio_data, audio_len, init_state_h_box, init_state_c_box)
  File "C:\Users\Administrator\PycharmProjects\masr\venv\lib\site-packages\torch\nn\modules\module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "C:\Users\Administrator\PycharmProjects\masr\MASR\masr\model_utils\utils.py", line 33, in forward
    x = self.normalizer(audio)
  File "C:\Users\Administrator\PycharmProjects\masr\venv\lib\site-packages\torch\nn\modules\module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "C:\Users\Administrator\PycharmProjects\masr\MASR\masr\model_utils\utils.py", line 19, in forward
    x = (x - self.mean) / (self.std + self.eps)
RuntimeError: The size of tensor a (39) must match the size of tensor b (161) at non-singleton dimension 1
When I choose fbank to process the audio, I get this error:
Traceback (most recent call last):
  File "infer_path.py", line 35, in <module>
    predictor = Predictor(model_path=args.model_path, vocab_path=args.vocab_path, use_model=args.use_model,
  File "C:\Users\Administrator\PycharmProjects\masr\MASR\masr\predict.py", line 101, in __init__
    self.predict(warmup_audio_path, to_an=False)
  File "C:\Users\Administrator\PycharmProjects\masr\MASR\masr\predict.py", line 173, in predict
    output_data, _, _ = self.predictor(audio_data, audio_len, init_state_h_box, init_state_c_box)
  File "C:\Users\Administrator\PycharmProjects\masr\venv\lib\site-packages\torch\nn\modules\module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "C:\Users\Administrator\PycharmProjects\masr\MASR\masr\model_utils\utils.py", line 33, in forward
    x = self.normalizer(audio)
  File "C:\Users\Administrator\PycharmProjects\masr\venv\lib\site-packages\torch\nn\modules\module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "C:\Users\Administrator\PycharmProjects\masr\MASR\masr\model_utils\utils.py", line 19, in forward
    x = (x - self.mean) / (self.std + self.eps)
RuntimeError: The size of tensor a (120) must match the size of tensor b (161) at non-singleton dimension 1
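Both errors are raised by the same line, `x = (x - self.mean) / (self.std + self.eps)`, and both look like a broadcasting failure: the feature dimension of the input (39 for mfcc, 120 for fbank) does not match the 161 held by the normalizer statistics. To make that concrete, here is a minimal pure-Python sketch of the broadcasting rule. The helper `broadcast_shape` and the example shapes `(1, 39, 100)` and `(1, 161, 1)` are only assumptions for illustration; they are not taken from the MASR code.

```python
from itertools import zip_longest


def broadcast_shape(a, b):
    """Return the broadcast shape of two shapes, or raise ValueError.

    Hypothetical helper that mirrors the usual NumPy/PyTorch rule:
    shapes are aligned from the trailing dimension, and two sizes are
    compatible only if they are equal or one of them is 1.
    """
    ndim = max(len(a), len(b))
    result = []
    pairs = zip_longest(reversed(a), reversed(b), fillvalue=1)
    for i, (da, db) in enumerate(pairs):
        if da == db or db == 1:
            result.append(da)
        elif da == 1:
            result.append(db)
        else:
            dim = ndim - 1 - i  # index counted from the front
            raise ValueError(
                f"size {da} must match size {db} at non-singleton dimension {dim}"
            )
    return tuple(reversed(result))


# Assumed shapes, for illustration only: (batch, features, frames).
stats_shape = (1, 161, 1)      # statistics with 161 feature bins
linear_shape = (1, 161, 100)   # input with 161 features
mfcc_shape = (1, 39, 100)      # input with 39 features

print(broadcast_shape(linear_shape, stats_shape))  # compatible shapes

try:
    broadcast_shape(mfcc_shape, stats_shape)
except ValueError as err:
    print("mismatch:", err)
```

If this reading is right, the feature type used at inference would have to be the same one that the stored mean and std were computed with, but I have not confirmed that from the project code.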
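How can I fix this? I would also like to ask: does using mfcc or fbank always give better results than the linear features? I hope you can help clear this up.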