Closed lightfate closed 1 year ago
Could you pass the option
--debug=true
when you invoke
python offline-decode-files.py
and show the output?
Like this:
(venv) PS D:\004-Workspace\pycharm\sherpa> python offline-decode-files.py --tokens=./sherpa-onnx-zipformer-en-2023-04-01/tokens.txt --encoder=./sherpa-onnx-zipformer-en-2023-04-01/encoder-epoch-99-avg-1.onnx --decoder=./sherpa-onnx-zipformer-en-2023-04-01/decoder-epoch-99-avg-1.onnx --joiner=./sherpa-onnx-zipformer-en-2023-04-01/joiner-epoch-99-avg-1.onnx ./sherpa-onnx-zipformer-en-2023-04-01/test_wavs/0.wav ./sherpa-onnx-zipformer-en-2023-04-01/test_wavs/1.wav ./sherpa-onnx-zipformer-en-2023-04-01/test_wavs/8k.wav sherpa --debug=true
D:\a\sherpa-onnx\sherpa-onnx\sherpa-onnx\csrc\offline-transducer-model.cc:InitEncoder:141 ---encoder---
encoder_dims=384,384,384,384,384
version=1
model_type=zipformer
model_author=k2-fsa
attention_dims=192,192,192,192,192
decode_chunk_len=32
num_encoder_layers=2,4,3,2,4
T=39
cnn_module_kernels=31,31,31,31,31
left_context_len=64,32,16,8,32
D:\a\sherpa-onnx\sherpa-onnx\sherpa-onnx\csrc\offline-transducer-model.cc:InitDecoder:161 ---decoder---
vocab_size=6254
context_size=2
D:\a\sherpa-onnx\sherpa-onnx\sherpa-onnx\csrc\offline-transducer-model.cc:InitJoiner:185 ---joiner---
joiner_dim=512
Started!
D:\a\sherpa-onnx\sherpa-onnx\sherpa-onnx\csrc\offline-stream.cc:AcceptWaveformImpl:108 Creating a resampler:
in_sample_rate: 8000
output_sample_rate: 16000
Traceback (most recent call last):
File "offline-decode-files.py", line 340, in
Here is the output on my side when using --debug=true:
/Users/runner/work/sherpa-onnx/sherpa-onnx/sherpa-onnx/csrc/offline-transducer-model.cc:InitEncoder:141 ---encoder---
model_author=k2-fsa
model_type=zipformer
version=1
comment=stateless7
/Users/runner/work/sherpa-onnx/sherpa-onnx/sherpa-onnx/csrc/offline-transducer-model.cc:InitDecoder:161 ---decoder---
vocab_size=500
context_size=2
/Users/runner/work/sherpa-onnx/sherpa-onnx/sherpa-onnx/csrc/offline-transducer-model.cc:InitJoiner:185 ---joiner---
joiner_dim=512
Started!
/Users/runner/work/sherpa-onnx/sherpa-onnx/sherpa-onnx/csrc/offline-stream.cc:AcceptWaveformImpl:108 Creating a resampler:
in_sample_rate: 8000
output_sample_rate: 16000
Please make sure you are using the correct model.
Please show the sha256sum of your model files.
(py38) fangjuns-MacBook-Pro:sherpa-onnx-zipformer-en-2023-04-01 fangjun$ shasum -a 256 encoder-epoch-99-avg-1.onnx
7d495012dd5b7ba008143f0c9cb52f3fd97ab0f208923d6a03be0e7db0cd4a4d encoder-epoch-99-avg-1.onnx
(py38) fangjuns-MacBook-Pro:sherpa-onnx-zipformer-en-2023-04-01 fangjun$ ls -lh encoder-epoch-99-avg-1.onnx
-rw-r--r-- 1 fangjun staff 337M Apr 2 17:43 encoder-epoch-99-avg-1.onnx
(py38) fangjuns-MacBook-Pro:sherpa-onnx-zipformer-en-2023-04-01 fangjun$ ls -l encoder-epoch-99-avg-1.onnx
-rw-r--r-- 1 fangjun staff 353667745 Apr 2 17:43 encoder-epoch-99-avg-1.onnx
Please make sure your output matches the above output.
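For anyone without shasum or CertUtil at hand, the same check can be sketched in Python with the standard hashlib module. The helper names below are made up for illustration; the expected digest is the one from the shasum output above.

```python
import hashlib
from pathlib import Path

# Digest of encoder-epoch-99-avg-1.onnx as posted above in this thread.
EXPECTED_ENCODER_SHA256 = (
    "7d495012dd5b7ba008143f0c9cb52f3fd97ab0f208923d6a03be0e7db0cd4a4d"
)


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_ENCODER_SHA256):
    """True if the file at `path` has the expected SHA-256 digest."""
    return sha256_of_file(path) == expected.lower()
```

Calling matches_expected("encoder-epoch-99-avg-1.onnx") inside the model folder should return True only if the file is byte-for-byte the one hashed above.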
I suspect that you are using a streaming zipformer model whose files, for some reason, ended up in the folder sherpa-onnx-zipformer-en-2023-04-01/.
(venv) PS D:\004-Workspace\pycharm\sherpa> CertUtil -hashfile ./sherpa-onnx-zipformer-en-2023-04-01/encoder-epoch-99-avg-1.onnx SHA256
SHA256 hash of ./sherpa-onnx-zipformer-en-2023-04-01/encoder-epoch-99-avg-1.onnx:
709f0ed53a734b7942f170127e7547b566cb29c4afc5e67719f314c3d63ccb10
I am using the model csukuangfj/sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20 (Bilingual, Chinese + English).
Can I use this bilingual Chinese-English model in offline-decode-files.py?
No, that model is a streaming model, which can only be used for online decoding.
Please use https://github.com/k2-fsa/sherpa-onnx/blob/master/python-api-examples/online-decode-files.py
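A rough way to spot this kind of mix-up from the --debug=true output is to compare the encoder metadata keys. In this thread, the failing run prints decode_chunk_len, T and left_context_len, while the maintainer's run of the offline model does not. Treating those keys as a sign of a streaming model is only a heuristic drawn from these two logs, and the helper names are hypothetical:

```python
# Keys that appear in the encoder metadata of the failing run in this thread
# but not in the maintainer's run. Using them as a streaming indicator is an
# assumption made for illustration, not a documented rule.
STREAMING_HINT_KEYS = {"decode_chunk_len", "left_context_len", "T"}


def parse_metadata(log_text):
    """Parse whitespace-separated key=value tokens into a dict."""
    meta = {}
    for token in log_text.split():
        key, sep, value = token.partition("=")
        if sep and key:
            meta[key] = value
    return meta


def looks_like_streaming(log_text):
    """Heuristic: True if any streaming-hint key is present in the log."""
    return bool(STREAMING_HINT_KEYS & parse_metadata(log_text).keys())
```

Note that T=39 in the failing log is the same number as the "Expected: 39" in the dimension error reported in this issue.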
ok,thank you so much~
AssertionError: sample-rate=16000 does not exist!
Can the input audio file only be sampled at 16000 Hz?
Please post the complete command you used.
The file I saved has this sample rate: Frame rate: 48000
Running the transcription gives an error:
python offline-decode-files.py --tokens=C:\Users\loong\Downloads\sherpa-onnx-paraformer-trilingual-zh-cantonese-en\sherpa-onnx-paraformer-trilingual-zh-cantonese-en\tokens.txt --paraformer=C:\Users\loong\Downloads\sherpa-onnx-paraformer-trilingual-zh-cantonese-en\sherpa-onnx-paraformer-trilingual-zh-cantonese-en\model.onnx --num-threads=2 --decoding-method=greedy_search --debug=True sample-rate=16000 feature-dim=80 audio.wav
Please post the complete error log.
Do I need to convert the sample rate of the wav file beforehand?
Started!
Traceback (most recent call last):
File "D:\project\MeloTTS\offline-decode-files.py", line 472, in <module>
main()
File "D:\project\MeloTTS\offline-decode-files.py", line 442, in main
assert_file_exists(wave_filename)
File "D:\project\MeloTTS\offline-decode-files.py", line 290, in assert_file_exists
assert Path(filename).is_file(), (
AssertionError: sample-rate=16000 does not exist!
Please refer to https://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html to download it
sample-rate=16000 feature-dim=80 audio.wav
You mistyped the command yourself. You changed --sample-rate to sample-rate; you left out the leading --.
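To see why the missing -- produces that assertion, here is a cut-down, hypothetical parser. It is not the script's real parser; only the option names --sample-rate and --feature-dim and the sound_files positional are taken from this thread. Without the leading dashes, argparse treats each token as one more sound file, which the script then checks for existence.

```python
import argparse


def build_parser():
    # A reduced stand-in for illustration; the real script defines more options.
    parser = argparse.ArgumentParser()
    parser.add_argument("--sample-rate", type=int, default=16000)
    parser.add_argument("--feature-dim", type=int, default=80)
    parser.add_argument("sound_files", nargs="+")
    return parser


def parse(argv):
    return build_parser().parse_args(argv)


wrong = parse(["sample-rate=16000", "feature-dim=80", "audio.wav"])
right = parse(["--sample-rate=16000", "--feature-dim=80", "audio.wav"])

print(wrong.sound_files)  # ['sample-rate=16000', 'feature-dim=80', 'audio.wav']
print(right.sound_files)  # ['audio.wav']
```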
Yes, I did leave it out. But after adding it, why is there no recognition output?
None of these are printed:
results = [s.result.text for s in streams]
end_time = time.time()
print(results)
print("Done!")
for wave_filename, result in zip(args.sound_files, results):
    print(f"{wave_filename}\n{result}")
    print("-" * 10)
There is also this error when I switch to a different wav file:
Started!
Traceback (most recent call last):
File "D:\project\MeloTTS\offline-decode-files.py", line 473, in <module>
main()
File "D:\project\MeloTTS\offline-decode-files.py", line 443, in main
samples, sample_rate = read_wave(wave_filename)
File "D:\project\MeloTTS\offline-decode-files.py", line 312, in read_wave
assert f.getsampwidth() == 2, f.getsampwidth() # it is in bytes
AssertionError: 4
File "D:\project\MeloTTS\offline-decode-files.py", line 473, in <module>
main()
File "D:\project\MeloTTS\offline-decode-files.py", line 443, in main
samples, sample_rate = read_wave(wave_filename)
File "D:\project\MeloTTS\offline-decode-files.py", line 310, in read_wave
with wave.open(wave_filename) as f:
File "C:\Users\loong\.conda\envs\nlp\lib\wave.py", line 509, in open
return Wave_read(f)
File "C:\Users\loong\.conda\envs\nlp\lib\wave.py", line 163, in __init__
self.initfp(f)
File "C:\Users\loong\.conda\envs\nlp\lib\wave.py", line 130, in initfp
raise Error('file does not start with RIFF id')
wave.Error: file does not start with RIFF id
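Both tracebacks involve Python's standard wave module: the first is the script's check that samples are 2 bytes wide, the second is wave.open rejecting a file without a RIFF header. A minimal sketch for inspecting a file before passing it to the script, using only the standard library; the helper names are illustrative and not part of sherpa-onnx:

```python
import wave


def starts_with_riff(path):
    """True if the first four bytes are the RIFF id that wave.open expects."""
    with open(path, "rb") as f:
        return f.read(4) == b"RIFF"


def describe_wav(path):
    """Return channel count, sample width in bytes, sample rate and frame count."""
    with wave.open(str(path), "rb") as f:
        return {
            "channels": f.getnchannels(),
            "sample_width_bytes": f.getsampwidth(),
            "sample_rate": f.getframerate(),
            "num_frames": f.getnframes(),
        }


def is_16bit_pcm(path):
    """Mirror of the sample-width assertion in the traceback (2 bytes per sample)."""
    return starts_with_riff(path) and describe_wav(path)["sample_width_bytes"] == 2
```

A file renamed to .wav but actually in another container would fail starts_with_riff, and a 32-bit recording would report sample_width_bytes of 4, matching "AssertionError: 4".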
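We only support the wave format. Please read the Python code and debug it yourself.
If you are using our test audio and get no output, we can take a look.
If the problem is in how you invoke the script, or you are using the wrong audio format, you need to solve it yourself.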
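If a file fails only the 2-byte sample-width check, one possible workaround is to rewrite it as 16-bit PCM first. This is a sketch under the assumption that the 4-byte samples are signed integer PCM (not floating point); the function name is hypothetical, and it leaves the sample rate and channel count unchanged.

```python
import array
import sys
import wave


def convert_32bit_to_16bit(src_path, dst_path):
    """Rewrite a 32-bit integer PCM WAV as 16-bit PCM, keeping rate and channels."""
    with wave.open(str(src_path), "rb") as src:
        if src.getsampwidth() != 4:
            raise ValueError(f"expected 4-byte samples, got {src.getsampwidth()}")
        channels = src.getnchannels()
        rate = src.getframerate()
        raw = src.readframes(src.getnframes())

    samples = array.array("i")  # assumed to be 4 bytes wide on this platform
    if samples.itemsize != 4:
        raise RuntimeError("array type 'i' is not 32-bit on this platform")
    samples.frombytes(raw)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian

    # Keep the 16 most significant bits of every sample.
    narrowed = array.array("h", (s >> 16 for s in samples))
    if sys.byteorder == "big":
        narrowed.byteswap()

    with wave.open(str(dst_path), "wb") as dst:
        dst.setnchannels(channels)
        dst.setsampwidth(2)
        dst.setframerate(rate)
        dst.writeframes(narrowed.tobytes())
```

Dropping the low 16 bits is the simplest choice; it does no dithering, which is usually acceptable for speech recognition input but is a design shortcut rather than a recommendation.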
OK, got it.
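But this normal audio file produces no error message at all. Could you help look into the cause? How can I send you the wav file?
Here is the audio, compressed as a zip.
Please post the command you ran.
This model: comment=speech_seaco_paraformer_large_asr_nat-zh-cantonese-en-16k-common-vocab11666-pytorch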
python offline-decode-files.py --tokens=C:\Users\loong\Downloads\sherpa-onnx-paraformer-trilingual-zh-cantonese-en\sherpa-onnx-paraformer-trilingual-zh-cantonese-en\tokens.txt --paraformer=C:\Users\loong\Downloads\sherpa-onnx-paraformer-trilingual-zh-cantonese-en\sherpa-onnx-paraformer-trilingual-zh-cantonese-en\model.onnx --num-threads=2 --decoding-method=greedy_search --debug=True --sample-rate=16000 --feature-dim=80 audio123.wav
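@csukuangfj Also, even after converting these two wav files to 1 channel and a 16000 Hz sample rate, they still cannot be recognized. There is no error; after "Started!" nothing follows and the program simply exits. I tested other tools on my side and they can recognize the audio.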
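It works on my side. If it cannot be recognized on your side, I don't know why.
This Hugging Face space, https://huggingface.co/spaces/k2-fsa/automatic-speech-recognition, can also recognize it. Please look for the cause on your side.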
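Hmm, that's strange. I'll take another look, thanks.
From debugging, the failure seems to happen at roughly this function, but it produces no error message, and it is hard to step into it to see the implementation details. Upgrading to the latest version does not help either.
I could not find any _sherpa_onnx files in the installed package: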
from _sherpa_onnx import (
OfflineRecognizerConfig,
OfflineStream,
Running it on Linux gives a segmentation fault directly.
How did you obtain your offline_file.py?
Do you still have the problem if you use the code we provide, without any modification?
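I am using the official code exactly as is: https://github.com/k2-fsa/sherpa-onnx/blob/master/python-api-examples/offline-decode-files.py
I don't know yet whether it is the installed package or an environment conflict; I have not found the specific cause so far.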
Could you try the sherpa-onnx-offline binary built from the C++ code?
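For the silent exit and the Linux segmentation fault described above, one further thing that might help narrow it down from the Python side is the standard faulthandler module, which dumps the Python-level traceback when the process receives a fatal signal. This is only a debugging sketch and says nothing about the actual cause; the wrapper name is hypothetical.

```python
import faulthandler
import sys


def enable_crash_traces(stream=sys.stderr):
    """Ask the interpreter to dump Python tracebacks on fatal signals such as SIGSEGV."""
    if not faulthandler.is_enabled():
        # `stream` must be a real file object with a file descriptor.
        faulthandler.enable(file=stream, all_threads=True)
    return faulthandler.is_enabled()
```

Calling enable_crash_traces() at the top of main(), or running the unmodified script with python -X faulthandler, should at least show which Python line was executing when the native code crashed.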
Hello,
I'm trying to use the sherpa-onnx Python API to transcribe audio files with the zipformer model. However, I'm encountering an error indicating a dimension mismatch between the input data and the model's expectations.
Here is the command I'm running:
python offline-decode-files.py \
  --tokens=./sherpa-onnx-zipformer-en-2023-04-01/tokens.txt \
  --encoder=./sherpa-onnx-zipformer-en-2023-04-01/encoder-epoch-99-avg-1.onnx \
  --decoder=./sherpa-onnx-zipformer-en-2023-04-01/decoder-epoch-99-avg-1.onnx \
  --joiner=./sherpa-onnx-zipformer-en-2023-04-01/joiner-epoch-99-avg-1.onnx \
  ./sherpa-onnx-zipformer-en-2023-04-01/test_wavs/0.wav \
  ./sherpa-onnx-zipformer-en-2023-04-01/test_wavs/1.wav \
  ./sherpa-onnx-zipformer-en-2023-04-01/test_wavs/8k.wav
And here is the error message I'm getting:
Started!
D:\a\sherpa-onnx\sherpa-onnx\sherpa-onnx\csrc\offline-stream.cc:AcceptWaveformImpl:108 Creating a resampler:
in_sample_rate: 8000
output_sample_rate: 16000
Traceback (most recent call last):
File "offline-decode-files.py", line 340, in <module>
main()
File "offline-decode-files.py", line 319, in main
recognizer.decode_streams(streams)
File "D:\004-Workspace\pycharm\sherpa\venv\lib\site-packages\sherpa_onnx\offline_recognizer.py", line 242, in decode_streams
self.recognizer.decode_streams(ss)
RuntimeError: Got invalid dimensions for input: x for the following indices
index: 1 Got: 1764 Expected: 39
Please fix either the inputs or the model.
From the error message, it seems that the input data's dimensions don't match what the model expects. However, I'm not sure why this is the case, as I'm using the provided offline-decode-files.py script together with your test WAV files and your models.
I would greatly appreciate any insights or advice on how to resolve this issue. Thank you in advance for your help!
Here is my environment: python == 3.8
(venv) PS D:\004-Workspace\pycharm\sherpa> pip list
Package Version
numpy 1.24.4
pip 22.3.1
sentencepiece 0.1.96
setuptools 65.5.1
sherpa-onnx 1.5.5
wheel 0.38.4