pytorch / audio

Data manipulation and transformation for audio signal processing, powered by PyTorch
https://pytorch.org/audio
BSD 2-Clause "Simplified" License

Exporting the operator stft to ONNX opset version 9 is not supported. #982

Open lawlict opened 4 years ago

lawlict commented 4 years ago

Hi, I am trying to export the feature extraction step to ONNX:

import torch
import torchaudio

model = torchaudio.transforms.MelSpectrogram()
x = torch.randn(1, 16000)
torch.onnx.export(model, x, 'tmp.onnx', input_names=['input'], output_names=['output'])

and get:

RuntimeError: Exporting the operator stft to ONNX opset version 9 is not supported. Please open a bug to request ONNX export support for the missing operator.

So will torchaudio add support for the operators used in the torchaudio.transforms module in the future? Being able to export feature extraction together with the neural network would be very convenient.

Thanks!

vincentqb commented 4 years ago

Thanks for opening an issue for this :) This is an issue upstream with pytorch directly, see pytorch/pytorch#31317. @houseroad, do you know the status of STFT support with ONNX?

We can keep this issue open since this does affect torchaudio.

mthrok commented 4 years ago

Hi @lawlict

We are hearing an increasing number of support requests for stft/istft in ONNX. The core of the problem is that stft/istft are not defined in ONNX, so we have passed users' feedback on to the proper channel to see what we can do. At the moment we do not have a concrete action plan. Meanwhile, one workaround is to mock the stft function in a similar way to https://github.com/pytorch/pytorch/issues/31317#issuecomment-670624730 and patch the call to torch.stft.
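A minimal sketch of what such a stand-in could look like, assuming the goal is an STFT built only from operators that ONNX already defines. It is not the code from the linked comment: the class name ConvSTFT is hypothetical, and it covers only the center=False, onesided=True case by folding the window and the Fourier basis into a fixed conv1d kernel.

```python
import math

import torch
import torch.nn as nn
import torch.nn.functional as F


class ConvSTFT(nn.Module):
    """Hypothetical stand-in for torch.stft (center=False, onesided=True)
    that uses a fixed conv1d kernel instead of an FFT."""

    def __init__(self, n_fft=400, hop_length=200, window=None):
        super().__init__()
        self.hop_length = hop_length
        self.n_bins = n_fft // 2 + 1
        if window is None:
            window = torch.ones(n_fft)  # rectangular window
        # Fourier basis: angle[k, n] = 2*pi*k*n / n_fft
        n = torch.arange(n_fft, dtype=torch.float64)
        k = torch.arange(self.n_bins, dtype=torch.float64).unsqueeze(1)
        angle = 2 * math.pi * k * n / n_fft
        w = window.to(torch.float64)
        real = torch.cos(angle) * w    # real part of exp(-j*angle), windowed
        imag = -torch.sin(angle) * w   # imaginary part of exp(-j*angle), windowed
        # Kernel shape: (2 * n_bins, 1, n_fft), real rows first, then imaginary rows
        kernel = torch.cat([real, imag], dim=0).unsqueeze(1).float()
        self.register_buffer("kernel", kernel)

    def forward(self, x):
        # x: (batch, time). conv1d slides the basis over the signal with stride = hop.
        out = F.conv1d(x.unsqueeze(1), self.kernel, stride=self.hop_length)
        real, imag = out[:, : self.n_bins], out[:, self.n_bins :]
        # Result: (batch, n_bins, frames, 2), last axis holding (real, imag)
        return torch.stack([real, imag], dim=-1)


# Compare against torch.stft on random input.
torch.manual_seed(0)
x = torch.randn(2, 1000)
window = torch.hann_window(64)
spec = ConvSTFT(n_fft=64, hop_length=16, window=window)(x)
ref = torch.view_as_real(
    torch.stft(x, n_fft=64, hop_length=16, window=window,
               center=False, return_complex=True)
)
print(spec.shape, torch.allclose(spec, ref, atol=1e-3))
```

Because the module only uses conv1d, slicing and stack, it could in principle be passed to torch.onnx.export the same way as the MelSpectrogram transform in the original post. The cost is that every frame is a dense product with the whole basis rather than an FFT.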

Let us know if you have a proposal for a way to resolve this.

lawlict commented 4 years ago

Hi @mthrok, thank you for your help. I will try pytorch/pytorch#31317 (comment) later. Should I close the issue now?

mthrok commented 4 years ago

@lawlict

Let's keep the issue open. This is a fairly unique issue and torchaudio does not have an official stance on it, so until we hear an update and decide on a stance we can keep it open and use it as a point of contact.

faroit commented 4 years ago

@lawlict just as a quick fix you can try nnAudio, torch-stft or Asteroid's STFTFB

lawlict commented 4 years ago

The frustrating news is that it becomes much slower when the FFT is replaced with a Fourier matrix...
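A rough illustration of why that happens, as a pure-Python sketch rather than a benchmark of any of the libraries mentioned above. It counts complex multiplications for a length-512 transform computed once as a full Fourier-matrix product and once with a textbook radix-2 FFT; the function names are made up for this example.

```python
import cmath


def dft_matrix_multiply(x):
    """Naive DFT: each output bin is a dot product with one row of the Fourier matrix."""
    n = len(x)
    mults = 0
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(-2j * cmath.pi * k * t / n)
            mults += 1
        out.append(acc)
    return out, mults


def fft_radix2(x):
    """Recursive radix-2 FFT for power-of-two lengths. Returns (spectrum, multiplications)."""
    n = len(x)
    if n == 1:
        return list(x), 0
    even, m_even = fft_radix2(x[0::2])
    odd, m_odd = fft_radix2(x[1::2])
    out = [0j] * n
    mults = m_even + m_odd
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        mults += 1
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out, mults


signal = [complex(i % 7 - 3) for i in range(512)]
dft_out, dft_mults = dft_matrix_multiply(signal)
fft_out, fft_mults = fft_radix2(signal)
max_err = max(abs(a - b) for a, b in zip(dft_out, fft_out))
print(dft_mults, fft_mults, max_err)
```

Both routines produce the same spectrum, but the matrix product needs 512 × 512 = 262,144 multiplications against 256 × 9 = 2,304 for the FFT. The real slowdown depends on hardware and on how well the matrix product is optimized, so the operation count is only a proxy.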

mthrok commented 4 years ago

Sorry to hear that. I'll follow up with the team.

mthrok commented 4 years ago

So I was told that there is no plan to add stft/istft functions to ONNX at the moment. The best way forward would be to write a proposal following their guide. This seems to require a significant amount of effort. Due to my workload, I cannot take the initiative on this one, though I think it is very valuable work. We can do some research on fulfilling some of the requirements for the spec (but it might be more appropriate to do so in their issue thread for better visibility).

chenjiasheng commented 2 years ago

Any updates so far?

NiziL commented 2 years ago

@chenjiasheng There is actually some good news: pytorch/pytorch#65666 and onnx/onnx#3741

averkij commented 2 years ago

The PR is finally merged: https://github.com/onnx/onnx/pull/3741

stonelazy commented 2 years ago

These signal processing operators (STFT/FFT/IFFT) are supported in opset 17.
I was trying to export a torch module to ONNX, but I got an error. Does torch still need to add support for this even after the PR has been merged in ONNX?

import torch
import torch.nn as nn

class FeaturizeTorchFFT(nn.Module):

    def __init__(self):
        super().__init__()

    def forward(self, x):
        return torch.stft(
            input=x,
            n_fft=320,
            hop_length=160,
            win_length=320,
            window=torch.ones(320),
            center=False,
            pad_mode="reflect",
            normalized=False,
            onesided=True,
        )

# Excerpt: self.sample_input and self.output_path are defined elsewhere in the surrounding class.
self.featurizer_model = FeaturizeTorchFFT()
torch.onnx.export(
    self.featurizer_model,
    self.sample_input,
    # 're_onnx_model.onnx',
    str(self.output_path),
    export_params=True,
    opset_version=17,
    do_constant_folding=True,
    input_names=["x"],
    output_names=["output"],
    dynamic_axes={
        "x": {0: "batch_size", 1: "audio_length"},
    },
)

  File "/home/kp/miniconda3/envs/gamd6-kp2/lib/python3.8/site-packages/torch/onnx/symbolic_helper.py", line 853, in _set_opset_version
    raise ValueError("Unsupported ONNX opset version: " + str(opset_version))
ValueError: Unsupported ONNX opset version: 17

averkij commented 2 years ago

Operators are supported in opset 17. But opset 17 itself is not supported by PyTorch yet.

stonelazy commented 2 years ago

> Operators are supported in opset 17. But opset 17 itself is not supported by PyTorch yet.

Thanks for the quick help. If possible, could you point me to how I can test these operators standalone (without torch)? I am just trying to make sure the output of torch.fft is the same as the one from ONNX.

mravanelli commented 2 years ago

Hi, is there any news on this issue? This feature is going to be important for the SpeechBrain project. Some of our contributors are working on exporting our speech processing pipelines to a microcontroller. This requires the model to be ONNX exportable/importable (@fpaissan). The main issue is that all the feature extraction relies on fft and stft.

mthrok commented 2 years ago

This is a PyTorch (MSFT) issue rather than a TorchAudio one. Please check and voice your support at https://github.com/pytorch/pytorch/issues/81075. It seems support for STFT is in motion.

njb commented 1 year ago

While we wait for native PyTorch integration, folks can check out https://github.com/adobe-research/convmelspec, which implements (Mel)spectrograms via 1D conv layers for export to both ONNX and CoreML (and, as a second option, CoreML MIL ops).

Xavier-i commented 1 year ago

> While we wait for native PyTorch integration, folks can check out https://github.com/adobe-research/convmelspec, which implements (Mel)spectrograms via 1D conv layers for export to both ONNX and CoreML (and, as a second option, CoreML MIL ops).

Very cool solution, you guys should publish it to PyPI!

wrz1999 commented 10 months ago

May I ask whether PSD and SoudenMVDR in torchaudio support conversion to ONNX or libtorch?

WangHHY19931001 commented 3 weeks ago

https://github.com/adobe-research/convmelspec

Would this help?

Archie3d commented 2 weeks ago

> https://github.com/adobe-research/convmelspec
>
> Would this help?

It seems to work when using spec_mode='DFT'; the default torchaudio mode still does not.