snakers4 / silero-models

Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple

Bug report - running on ARM / RPI #70

Closed Salim-alileche closed 3 years ago

Salim-alileche commented 3 years ago

🐛 Bug

I tried to use the model on a Raspberry Pi 3B and I get the following error: `fft: ATen not compiled with MKL support`. So I tried to modify the `stft` function in `torch/functional.py` to use the librosa stft instead, but it seems that the model uses another copy of the torch stft, not the one in my package.

The function I used instead of the torch stft

    # requires: import librosa; import numpy as np; import torch
    def stft(input: Tensor, n_fft: int, hop_length: Optional[int] = None,
             win_length: Optional[int] = None, window: Optional[Tensor] = None,
             center: bool = True, pad_mode: str = 'reflect',
             normalized: bool = False, onesided: Optional[bool] = None,
             return_complex: Optional[bool] = None):
        # librosa expects numpy inputs; pass the arguments by keyword so that
        # pad_mode is not mistaken for librosa's dtype parameter
        win = window.numpy() if window is not None else 'hann'
        S = librosa.stft(np.asarray(input), n_fft=n_fft, hop_length=hop_length,
                         win_length=win_length, window=win, center=center,
                         pad_mode=pad_mode)
        # torch.stft with return_complex=False stacks the real and imaginary
        # parts in a trailing dimension of size 2
        S = np.stack((np.real(S), np.imag(S)), axis=2)
        return torch.tensor(S)

stack traces

    File "/home/Salim/.local/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
      result = self.forward(*input, **kwargs)
    RuntimeError: The following operation failed in the TorchScript interpreter.
    Traceback of TorchScript, serialized code (most recent call last):
      File "code/torch/stt_pretrained/models/model.py", line 27, in forward
        _2 = self.win_length
        _3 = torch.hann_window(self.n_fft, dtype=ops.prim.dtype(x), layout=None, device=ops.prim.device(x), pin_memory=None)
        x0 = torch.torch.functional.stft(x, _0, _1, _2, _3, True, "reflect", False, True, )
        _4 = torch.slice(x0, 0, 0, 9223372036854775807, 1)
        _5 = torch.slice(_4, 1, 0, 9223372036854775807, 1)
      File "code/__torch__/torch/functional.py", line 21, in stft
        input0 = input
        print("test ok")
        _2 = torch.stft(input0, n_fft, hop_length, win_length, window, normalized, onesided)
             ~~~~~~~~~~ <--- HERE
        return _2
    Traceback of TorchScript, original code (most recent call last):
      File "/opt/conda/lib/python3.7/site-packages/torch/functional.py", line 465, in stft
            input = F.pad(input.view(extended_shape), (pad, pad), pad_mode)
            input = input.view(input.shape[-signal_dim:])
        return _VF.stft(input, n_fft, hop_length, win_length, window, normalized, onesided)
               ~~~~~~~~ <--- HERE
    RuntimeError: fft: ATen not compiled with MKL support
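Note the "serialized code" header in the trace: a saved TorchScript model carries its own frozen copy of `torch.functional.stft` inside the archive, which is why patching `torch/functional.py` in site-packages has no visible effect. A minimal sketch of the same behaviour (assuming only that PyTorch is installed):

```python
import torch

def double(x):
    return x * 2

# Scripting freezes the function's code at this point, much like the
# serialized code embedded in a saved TorchScript model archive.
scripted = torch.jit.script(double)

def double(x):  # later edits to the Python source are invisible to it
    return x * 3

print(scripted(torch.tensor(2.0)).item())  # still computes x * 2, i.e. 4.0
```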

## Expected behavior

Is it possible to modify the forward function so that it uses the librosa stft for Raspberry Pi users?

## Environment

PyTorch version: 1.7.0a0+e85d494
Is debug build: True
CUDA used to build PyTorch: None
ROCM used to build PyTorch: N/A

OS: Raspbian GNU/Linux 10 (buster) (armv7l)
GCC version: (Raspbian 8.3.0-6+rpi1) 8.3.0
Clang version: Could not collect
CMake version: version 3.13.4

Python version: 3.7 (32-bit runtime)
Is CUDA available: False
CUDA runtime version: No CUDA
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
HIP runtime version: N/A
MIOpen runtime version: N/A

Versions of relevant libraries:
[pip3] numpy==1.20.2
[pip3] numpydoc==0.7.0
[pip3] torch==1.7.0a0
[pip3] torchaudio==0.7.0a0+ac17b64
[pip3] torchvision==0.8.0a0+291f7e2
[conda] Could not collect
snakers4 commented 3 years ago

Hi,

Many thanks for reporting this.

> fft: ATen not compiled with MKL support

I have heard reports of people actually building their PyTorch with cblas or something instead of MKL and it running via PyTorch on the RPI, but I have not tried it myself:


Alas, no build instructions have been published yet - https://github.com/snakers4/silero-vad/issues/37. As usual, the community is encouraged to make its dockerized builds public (so they are reproducible).

Also, have you tried the ONNX version? It should be easier to run, because there the stft is replaced out of the box by a hand-made stft function.
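For orientation, running an ONNX model on the Pi could look roughly like the sketch below. The model path `model.onnx` and the single float32 audio input are assumptions for illustration (inspect the real model with `sess.get_inputs()`); the `wave` helper shows one way to feed it PCM audio using only the standard library and numpy:

```python
import os
import wave
import numpy as np

def read_wav_as_float32(path):
    """Read a 16-bit mono PCM wav into a float32 array in [-1, 1]."""
    with wave.open(path, 'rb') as w:
        frames = w.readframes(w.getnframes())
    pcm = np.frombuffer(frames, dtype=np.int16)
    return pcm.astype(np.float32) / 32768.0

# Hypothetical inference call, guarded so the sketch degrades gracefully
# when onnxruntime or the model file is absent.
if os.path.exists('model.onnx'):
    import onnxruntime as ort
    sess = ort.InferenceSession('model.onnx')
    input_name = sess.get_inputs()[0].name
    audio = read_wav_as_float32('speech.wav')[None, :]  # add a batch dim
    logits = sess.run(None, {input_name: audio})[0]
```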

> Is it possible to modify the forward function so that it uses the librosa stft for Raspberry Pi users?

Technically, there is no problem. But I would not like to have two versions of the same model for torch / onnx - with and without the frontend. I like to keep the models nice and fully packaged. Also, there may then be a problem building librosa ...
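For reference, the kind of hand-made stft frontend mentioned above can be written with numpy alone, avoiding both MKL and a librosa build on the Pi. This is an illustrative sketch, not the code actually shipped with the ONNX model:

```python
import numpy as np

def stft_numpy(signal, n_fft=512, hop_length=128):
    """Naive STFT: centered frames, Hann window, rfft per frame.

    Returns an array of shape (n_fft // 2 + 1, n_frames, 2) holding the
    real and imaginary parts, mirroring torch.stft with onesided=True
    and return_complex=False.
    """
    window = np.hanning(n_fft)
    # center the frames via reflect padding, as torch.stft does by default
    padded = np.pad(signal, n_fft // 2, mode='reflect')
    n_frames = 1 + (len(padded) - n_fft) // hop_length
    spec = np.empty((n_fft // 2 + 1, n_frames), dtype=np.complex128)
    for t in range(n_frames):
        frame = padded[t * hop_length:t * hop_length + n_fft] * window
        spec[:, t] = np.fft.rfft(frame)
    return np.stack((spec.real, spec.imag), axis=2)
```

(One subtlety left out of the sketch: `torch.hann_window` is periodic while `np.hanning` is symmetric, so the two differ by one sample of window length.)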

snakers4 commented 3 years ago

> [pip3] numpy==1.20.2
> [pip3] numpydoc==0.7.0
> [pip3] torch==1.7.0a0
> [pip3] torchaudio==0.7.0a0+ac17b64
> [pip3] torchvision==0.8.0a0+291f7e2

Also, technically, when you are doing edge builds you can omit torchaudio, torchvision and (I guess) numpy. Some of these are present just for illustration purposes, I believe. You just need to fiddle a bit with the model initialization code and utils.

snakers4 commented 3 years ago

Btw, you can ask @leoplusplus on Telegram here https://t.me/silero_speech - those are his comments; maybe he will publish his builds after all =)

Salim-alileche commented 3 years ago

Hi,

Thank you for your advice. I will try to build PyTorch with cblas; if that doesn't work, I will try the ONNX version.

snakers4 commented 3 years ago

Hi,

Any luck with these builds?

Salim-alileche commented 3 years ago

Hi,

Sorry for the late reply - I failed to build torch, so I switched to the ONNX model and it works perfectly.

snakers4 commented 3 years ago

Did you have to do any special builds for onnx-runtime or did it just work out of the box from pre-built binaries for ARM? If you did, could you please share your dockerized build?

Salim-alileche commented 3 years ago

For onnx-runtime there are pre-built wheels for the Raspberry Pi 3 here; the wheels are built with this procedure.

snakers4 commented 3 years ago

I see - nice to have this link, and to have someone verifying that it works.