k2-fsa / sherpa-onnx

Speech-to-text, text-to-speech, speaker diarization, and VAD using next-gen Kaldi with onnxruntime without Internet connection. Support embedded systems, Android, iOS, Raspberry Pi, RISC-V, x86_64 servers, websocket server/client, C/C++, Python, Kotlin, C#, Go, NodeJS, Java, Swift, Dart, JavaScript, Flutter, Object Pascal, Lazarus, Rust
https://k2-fsa.github.io/sherpa/onnx/index.html
Apache License 2.0

Speech from a USB sound card is not recognized in the Python script. How can I fix this? #1035

Closed: chenqy2018 closed this issue 4 months ago

chenqy2018 commented 4 months ago
  1. Recording from the USB sound card works fine in my own Python test. Part of the code:

    with sd.InputStream(samplerate=sample_rate, channels=channels, device=deviceindex) as stream:
        print("Recording started...")
        for _ in range(int(sample_rate * duration / samples_per_read)):
            data, overflowed = stream.read(samples_per_read)
            if overflowed:
                print("Buffer overflow!")
            # Convert the float32 data to int16, then to bytes, and write it to the file
            int_data = (np.array(data) * 32767).astype(np.int16)
            wf.writeframes(int_data.tobytes())

    print("Recording finished")

  2. Feeding the test.wav generated by the Python code above to python-api-examples/online-decode-files.py also works: the text is recognized correctly.

  3. When I test with speech-recognition-from-microphone.py, it prints "Started! Please speak" and then nothing happens.

  4. I modified speech-recognition-from-microphone.py as follows:

    sample_rate = 16000
    samples_per_read = int(0.1 * sample_rate)  # 0.1 second = 100 ms
    last_result = ""
    stream = recognizer.create_stream()
    with sd.InputStream(channels=1, dtype="float32", samplerate=sample_rate, device=1) as s:
        while True:
            samples, overflowed = s.read(samples_per_read)  # a blocking read
            if overflowed:
                print("Buffer overflow!")
            samples = samples.reshape(-1)

Running it produces:

    Started! Please speak
    Buffer overflow!
    Buffer overflow!
    Buffer overflow!
    Buffer overflow!
    Buffer overflow!
    ......
    ^C
    Caught Ctrl + C. Exiting

How should I handle this?
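The float32-to-int16 conversion used in step 1 can be checked without any audio hardware. The following is only a sketch using the Python standard library: the helper name `float_to_pcm16` and the synthetic sine wave are illustrative assumptions, not part of sherpa-onnx or of the original script. Unlike the snippet in step 1, it clips samples to [-1.0, 1.0] before scaling so that out-of-range values cannot wrap around.

```python
import io
import math
import struct
import wave


def float_to_pcm16(samples):
    """Convert floats in [-1.0, 1.0] to little-endian 16-bit PCM bytes.

    Hypothetical helper: values outside the range are clipped first.
    """
    ints = [int(max(-1.0, min(1.0, s)) * 32767) for s in samples]
    return struct.pack(f"<{len(ints)}h", *ints)


sample_rate = 16000
# One second of a 440 Hz sine wave stands in for microphone data.
tone = [0.5 * math.sin(2 * math.pi * 440 * n / sample_rate)
        for n in range(sample_rate)]

buffer = io.BytesIO()  # in-memory file, so nothing is written to disk
with wave.open(buffer, "wb") as wf:
    wf.setnchannels(1)
    wf.setsampwidth(2)  # 2 bytes per sample = int16
    wf.setframerate(sample_rate)
    wf.writeframes(float_to_pcm16(tone))

# Read the WAV back to confirm the header matches what was written.
buffer.seek(0)
with wave.open(buffer, "rb") as rf:
    print(rf.getframerate(), rf.getnframes())
```

If the frame rate printed here differs from the rate the recognizer expects, the file would need resampling before decoding.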

csukuangfj commented 4 months ago

Feeding the test.wav generated by the Python code above to python-api-examples/online-decode-files.py also works: the text is recognized correctly.

What is the RTF?

If you don't know what RTF is, run ./build/bin/sherpa-onnx

chenqy2018 commented 4 months ago

Real time factor (RTF): 12.949/10.053 = 1.288. I'm using an rk3568. Is the CPU not powerful enough?

csukuangfj commented 4 months ago

You need to pick a model that gives RTF < 1.

Otherwise decoding cannot keep up with the incoming audio, and it will definitely overflow.
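To see why RTF > 1 leads to overflow, here is a rough sketch of the arithmetic. The function names and the backlog model are assumptions made for illustration (each 100 ms chunk is assumed to take RTF × 100 ms to decode while new chunks keep arriving in real time); this is not how sherpa-onnx or sounddevice measure anything internally.

```python
def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """RTF = time spent decoding / duration of the audio that was decoded."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def backlog_after(rtf: float, chunk_seconds: float, num_chunks: int) -> float:
    """Seconds of not-yet-decoded audio after num_chunks chunks have arrived.

    Simplified model (an assumption): every chunk takes rtf * chunk_seconds
    to decode, and a new chunk arrives every chunk_seconds.
    """
    backlog = 0.0
    for _ in range(num_chunks):
        # The backlog grows when rtf > 1 and can never go below zero.
        backlog = max(0.0, backlog + (rtf - 1.0) * chunk_seconds)
    return backlog


# Numbers reported above: 12.949 s of processing for 10.053 s of audio.
rtf = real_time_factor(12.949, 10.053)
print(f"RTF = {rtf:.3f}")

# 600 chunks of 0.1 s each = one minute of microphone input.
print(f"backlog after 60 s: {backlog_after(rtf, 0.1, 600):.1f} s")
```

Under this model the backlog grows without bound for any RTF above 1, so a finite input buffer overflows sooner or later; with RTF below 1 the backlog stays at zero.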

csukuangfj commented 4 months ago

https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-transducer/zipformer-transducer-models.html#sherpa-onnx-streaming-zipformer-small-bilingual-zh-en-2023-02-16-bilingual-chinese-english You can try this model. Remember to use int8; you can use the models under the folder 96.

chenqy2018 commented 4 months ago

OK, thanks. Is there any guide on converting ONNX to RKNN? Which operators need attention?

csukuangfj commented 4 months ago

OK, thanks. Is there any guide on converting ONNX to RKNN? Which operators need attention?

We don't have one.

Gooddz1 commented 2 months ago

OK, thanks. Is there any guide on converting ONNX to RKNN? Which operators need attention?

Did you get it working? Which operators need attention?

csukuangfj commented 2 months ago

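OK, thanks. Is there any guide on converting ONNX to RKNN? Which operators need attention?

Did you get it working? Which operators need attention?

Someone in the QQ group has implemented it; you can ask there. It is not open source.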