Open lawlict opened 4 years ago
Thanks for opening an issue for this :) This is an issue upstream with pytorch directly, see pytorch/pytorch#31317. @houseroad, do you know the status of STFT support with ONNX?
We can keep this issue open since this does affect torchaudio.
Hi @lawlict
We are hearing an increasing number of support requests for stft/istft in ONNX. The core of the problem is that stft/istft are not defined in ONNX, so we have passed the users' feedback along to the proper channel to see what can be done. At this moment, we do not have a concrete action plan. Meanwhile, one workaround is to mock the stft function in a way similar to https://github.com/pytorch/pytorch/issues/31317#issuecomment-670624730 and patch the call to torch.stft.
Let us know if you have a proposal for a way to resolve this.
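For readers who want to see what "mock and patch" could look like in practice, here is a minimal sketch. It is not the code from the linked comment: `matmul_stft` and `Featurizer` are hypothetical names, and the replacement only covers the simple case (center=False, one-sided, win_length equal to n_fft) and returns real and imaginary parts stacked in a trailing dimension of size 2. Whether the resulting ops export cleanly depends on your PyTorch version.

```python
import math
from unittest import mock

import torch
import torch.nn as nn


def matmul_stft(input, n_fft, hop_length=None, win_length=None, window=None,
                center=False, pad_mode="reflect", normalized=False,
                onesided=True, return_complex=None):
    """Hypothetical stand-in for torch.stft built from unfold + matmul only."""
    if center or normalized or not onesided or win_length not in (None, n_fft):
        raise NotImplementedError("this sketch covers only the simple case")
    hop_length = n_fft // 4 if hop_length is None else hop_length
    window = torch.ones(n_fft) if window is None else window
    # Slice the signal into overlapping frames: (..., n_frames, n_fft)
    frames = input.unfold(-1, n_fft, hop_length) * window.to(input)
    # Real DFT basis, computed in float64 and cast to the input dtype.
    n = torch.arange(n_fft, dtype=torch.float64)
    k = torch.arange(n_fft // 2 + 1, dtype=torch.float64)
    angle = 2 * math.pi * n[:, None] * k[None, :] / n_fft  # (n_fft, n_freq)
    real = frames @ torch.cos(angle).to(input)
    imag = frames @ (-torch.sin(angle)).to(input)
    # (..., n_frames, n_freq, 2) -> (..., n_freq, n_frames, 2)
    return torch.stack([real, imag], dim=-1).transpose(-3, -2)


class Featurizer(nn.Module):
    def forward(self, x):
        return torch.stft(x, n_fft=320, hop_length=160, win_length=320,
                          window=torch.ones(320), center=False, onesided=True)


torch.manual_seed(0)
x = torch.randn(2, 1600)
reference = torch.view_as_real(
    torch.stft(x, n_fft=320, hop_length=160, win_length=320,
               window=torch.ones(320), center=False, onesided=True,
               return_complex=True)
)

# While the patch is active, any torch.stft call resolves to matmul_stft.
# An export call placed inside this block would trace the replacement instead.
with mock.patch.object(torch, "stft", matmul_stft):
    patched = Featurizer()(x)
```

The patch is undone when the `with` block exits, so the rest of the program keeps using the real torch.stft.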
Hi @mthrok, thank you for your help. I will try pytorch/pytorch#31317 (comment) later. So should I close the issue now?
@lawlict
Let's keep the issue. This is a very unique issue and torchaudio does not have an official stance on this one, so until we hear the update and decide the stance we can keep this issue open and use it as a point of contact.
@lawlict just as a quick fix, you can try nnAudio, torch-stft, or Asteroid's STFTFB.
The frustrating news is that it becomes much slower when the FFT is replaced with a Fourier matrix...
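To make the "Fourier matrix" approach concrete, here is a rough sketch of an STFT expressed as a strided 1D convolution with fixed cosine and sine kernels. `ConvSTFT` is a hypothetical name and not the API of any of the libraries mentioned above. It assumes no centering or padding and a one-sided output. Each frame costs O(n_fft²) multiply-adds instead of the FFT's O(n_fft log n_fft), which is one plausible reason for the slowdown.

```python
import math

import torch
import torch.nn as nn
import torch.nn.functional as F


class ConvSTFT(nn.Module):
    """Hypothetical conv1d-based STFT (center=False, one-sided)."""

    def __init__(self, n_fft=320, hop_length=160, window=None):
        super().__init__()
        self.hop_length = hop_length
        window = torch.ones(n_fft) if window is None else window
        n = torch.arange(n_fft, dtype=torch.float64)
        k = torch.arange(n_fft // 2 + 1, dtype=torch.float64)
        angle = 2 * math.pi * k[:, None] * n[None, :] / n_fft  # (n_freq, n_fft)
        # First n_freq rows give the real part, the rest the imaginary part.
        kernel = torch.cat([torch.cos(angle), -torch.sin(angle)])
        kernel = kernel * window.to(torch.float64)
        # conv1d weight layout: (out_channels, in_channels=1, kernel_size)
        self.register_buffer("kernel", kernel.float().unsqueeze(1))

    def forward(self, x):
        # x: (batch, time) -> (batch, 1, time); the stride plays the role of the hop.
        y = F.conv1d(x.unsqueeze(1), self.kernel, stride=self.hop_length)
        real, imag = y.chunk(2, dim=1)  # each (batch, n_freq, n_frames)
        return torch.stack([real, imag], dim=-1)


torch.manual_seed(0)
signal = torch.randn(2, 1600)
conv_out = ConvSTFT()(signal)
fft_out = torch.view_as_real(
    torch.stft(signal, n_fft=320, hop_length=160, win_length=320,
               window=torch.ones(320), center=False, return_complex=True)
)
```

Because the module uses only conv1d, chunk, and stack, it avoids the stft operator entirely, at the cost of the extra arithmetic.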
Sorry to hear that. I'll follow up with the team.
So I was told that there is no plan to add stft/istft functions to ONNX at the moment. The best way would be to write a proposal following their guide. This seems to require a significant amount of effort. Due to my workload, I cannot take the initiative on this one, though I think this is very valuable work. We can do some research on fulfilling some of the requirements for the spec (but it might be more appropriate to do so in their issue thread for better visibility).
Any update for now?
@chenjiasheng There is actually some good news around : pytorch/pytorch#65666 and onnx/onnx#3741
PR is finally merged. https://github.com/onnx/onnx/pull/3741
These signal processing operators (STFT/FFT/IFFT) are supported in opset 17.
I was trying to export the torch module to ONNX, but I got an error. Does torch still need to add support for this even after the PR has been merged in ONNX?
import torch
import torch.nn as nn


class FeaturizeTorchFFT(nn.Module):
    def __init__(self):
        super().__init__()

    def forward(self, x):
        return torch.stft(
            input=x,
            n_fft=320,
            hop_length=160,
            win_length=320,
            window=torch.ones(320),
            center=False,
            pad_mode="reflect",
            normalized=False,
            onesided=True,
        )


self.featurizer_model = FeaturizeTorchFFT()
torch.onnx.export(
    self.featurizer_model,
    self.sample_input,
    # 're_onnx_model.onnx',
    str(self.output_path),
    export_params=True,
    opset_version=17,
    do_constant_folding=True,
    input_names=["x"],
    output_names=["output"],
    dynamic_axes={
        "x": {0: "batch_size", 1: "audio_length"},
    },
)
  File "/home/kp/miniconda3/envs/gamd6-kp2/lib/python3.8/site-packages/torch/onnx/symbolic_helper.py", line 853, in _set_opset_version
    raise ValueError("Unsupported ONNX opset version: " + str(opset_version))
ValueError: Unsupported ONNX opset version: 17
Operators are supported in opset 17. But opset 17 itself is not supported by PyTorch yet.
Thanks for the quick help. If possible, could you point me to how I can test these operators standalone (without torch)? I am just trying to ensure the output of torch.fft is the same as the one from ONNX.
Hi, is there any news on this issue? This feature is going to be important for the SpeechBrain project. Some of our contributors (@fpaissan) are working on exporting our speech processing pipelines to a microcontroller. This requires the model to be ONNX exportable/importable. The main issue is that all the feature extraction relies on fft and stft.
This is a PyTorch (MSFT) issue rather than a TorchAudio one. Please check and voice your support at https://github.com/pytorch/pytorch/issues/81075. It seems that support for STFT is in motion.
While we wait for native PyTorch integration, folks can check out https://github.com/adobe-research/convmelspec, which implements (Mel)spectrograms via 1D conv layers for both ONNX and CoreML (and, as a second option, CoreML MIL ops).
very cool solution, you guys should publish it to pypi!
May I ask if PSD and SoudenMVDR in torchaudio support conversion to onnx or libtorch?
https://github.com/adobe-research/convmelspec
Would this be of some help?
Seems to work when using spec_mode='DFT'; the default torchaudio mode still does not.
Hi, I tried exporting the feature extraction process to ONNX:
and got:
So will torchaudio add support for the operators used in the torchaudio.transforms module in the future? Exporting the feature extraction and the neural network together would be very convenient. Thanks!