snakers4 / silero-models

Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple

Off-line STT ONNX example ❓ Questions / Help / Support #181

Closed andy8025 closed 2 years ago

andy8025 commented 2 years ago

❓ Questions and Help

Hi! First, thanks for making this awesome software available.

I am trying to modify the example shown here https://github.com/snakers4/silero-models#onnx to work without omegaconf and torch.hub.load(). I just want to load the model directly from a .onnx file, but I'm stuck at the second-to-last line, which uses the decoder function or class. How do I properly use the decoder?


import onnx
import torch
import onnxruntime
import utils

language = 'en'

_, decoder, utils = torch.hub.load(repo_or_dir='snakers4/silero-models', model='silero_stt', language=language)

onnx_model = onnx.load('en_v6_xlarge.onnx')
onnx.checker.check_model(onnx_model)
ort_session = onnxruntime.InferenceSession('en_v6_xlarge.onnx', providers=['CUDAExecutionProvider'])

test_files = ['speech_orig.wav']

batches = utils.split_into_batches(test_files, batch_size=10)
input = utils.prepare_model_input(utils.read_batch(batches[0]))

onnx_input = input.detach().cpu().numpy()
ort_inputs = {'input': onnx_input}
ort_outs = ort_session.run(None, ort_inputs)

decoded = decoder(torch.Tensor(ort_outs[0])[0])  # need to replace this decoder = utils.Decoder() # with ???
print(decoded)
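
For reference, here is a minimal sketch of what a stand-alone decoder might look like if the model output is treated as ordinary CTC scores. This is generic greedy CTC decoding (take the argmax of each frame, merge repeats, drop blanks), not the decoder that ships with silero-models, which may handle extra special tokens. The label list `LABELS`, the blank symbol `_`, and the function name `greedy_ctc_decode` are all assumptions made up for illustration; the real label set has to come from the model's own files.

```python
from typing import List, Sequence

# Hypothetical label set, for illustration only. The real model has its own
# labels, which are not shown in this post.
LABELS: List[str] = ["_", "a", "c", "t", " "]
BLANK = "_"  # assumed CTC blank symbol


def greedy_ctc_decode(frames: Sequence[Sequence[float]],
                      labels: Sequence[str] = LABELS,
                      blank: str = BLANK) -> str:
    """Greedy CTC decoding: argmax per frame, merge repeats, drop blanks."""
    blank_idx = list(labels).index(blank)
    chars = []
    prev = None
    for row in frames:
        # Index of the highest-scoring label in this time frame.
        best = max(range(len(row)), key=lambda i: row[i])
        # Emit only when the label changes and is not the blank.
        if best != prev and best != blank_idx:
            chars.append(labels[best])
        prev = best
    return "".join(chars)


def one_hot(index: int, size: int = len(LABELS)) -> List[float]:
    """Build a toy score row where one label clearly wins."""
    return [1.0 if i == index else 0.0 for i in range(size)]


# Toy frames: c, c, blank, a, t, t
toy_frames = [one_hot(i) for i in (2, 2, 0, 1, 3, 3)]
print(greedy_ctc_decode(toy_frames))
```

If the ONNX output really has shape (batch, time, labels), something like `greedy_ctc_decode(ort_outs[0][0], labels)` would remove the torch dependency from the decoding step, but that shape is an assumption and should be checked against the actual model.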