ml-explore / mlx

MLX: An array framework for Apple silicon
https://ml-explore.github.io/mlx/
MIT License
15.91k stars 905 forks source link

[Feature] Support Short-Time Fourier Transforms #1004

Open Rifur13 opened 3 months ago

Rifur13 commented 3 months ago

Short-Time Fourier Transforms (STFT) and its inverse are used extensively in signal processing.

Function definition will match scipy: https://docs.scipy.org/doc/scipy/reference/generated/scipy.signal.stft.html https://docs.scipy.org/doc/scipy/reference/generated/scipy.signal.istft.html

Rifur13 commented 3 months ago

You can assign this to me.

awni commented 3 months ago

Cool.. if we include this, I wonder where it would go? I'm not opposed to including it in core or a sub-package but I think there is a decision to be made there.

I see jax implemented it in a scipy.signal subpackage https://jax.readthedocs.io/en/latest/_autosummary/jax.scipy.signal.stft.html, and PyTorch implemented it at the top level https://pytorch.org/docs/stable/generated/torch.stft.html

Also our Whisper implementation already has an STFT in MLX you might find useful: https://github.com/ml-explore/mlx-examples/blob/main/whisper/whisper/audio.py#L104-L127

Rifur13 commented 3 months ago

I think adding them in core is best - they’re fundamental algorithms and used frequently. A separate signal sub-package makes sense if you anticipate adding more functions from scipy.signal in he near future. (https://jax.readthedocs.io/en/latest/jax.scipy.html#module-jax.scipy.signal)

Thanks for pointing me to the Whisper implementation, I haven't seen in yet

tqtifnypmb commented 2 months ago

One tiny issue about stft in mlx-example/whisper, the noverlap parameter does not actually represent the overlap of frames; instead, it is semantically equivalent to hop_length in torch.stft.

I found it a little bit misleading while implementing istft.

awni commented 2 months ago

Is it not the same as noverlap in the Scipy implementation? https://docs.scipy.org/doc/scipy/reference/generated/scipy.signal.stft.html

I think that's a good reference to follow.

Rifur13 commented 2 months ago

Still working on this CL, shouldn't be much longer.

t = (x.size - nperseg + noverlap) // noverlap does look wrong though, maybe that's where the confusion is. The denominator should be the number of steps.

tqtifnypmb commented 2 months ago

Is it not the same as noverlap in the Scipy implementation? https://docs.scipy.org/doc/scipy/reference/generated/scipy.signal.stft.html

I think that's a good reference to follow.

No.

According to this hop_length = nperseg - noverlap

In stft in mlx-example/whisper noverlap is being used as hop_length. It appears to be a naming mistake, the logic looks good to me.

tqtifnypmb commented 2 months ago

Still working on this CL, shouldn't be much longer.

Cool.

I was implementing a istft specific to my problem, not a general one.