tensorflow / io

Dataset, streaming, and file system extensions maintained by TensorFlow SIG-IO
Apache License 2.0

Add Griffin Lim reconstruction to tfio.audio #1585

Open douglas125 opened 2 years ago

douglas125 commented 2 years ago

Hello,

I was wondering why there is no option to perform Griffin-Lim reconstruction with tfio, given that the library already wraps the computation of spectrograms and mel-spectrograms. This would be extremely useful for speech and sound synthesis in general.

The feature is probably not too difficult to implement by adapting this notebook: https://colab.research.google.com/github/timsainb/tensorflow2-generative-models/blob/master/7.0-Tensorflow-spectrograms-and-inversion.ipynb

The code is a bit rough and has a few mistakes, but I managed to get a working version of it, which of course needs to be rewritten properly: https://colab.research.google.com/drive/1I_rBcG4Ic0gHqpV16Ikjh1__NT-2wecr?usp=sharing

I might be able to contribute this if it is relevant and tfio is the appropriate place for this op (no promises on the timeframe, though).
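For readers unfamiliar with the algorithm, here is a minimal NumPy sketch of Griffin-Lim phase reconstruction. It is not the code from either notebook and not a tfio API. The function names (`stft`, `istft`, `griffin_lim`) and parameters (`n_fft`, `hop`, `n_iter`, `seed`) are hypothetical and chosen for illustration only. The sketch assumes a periodic Hann window and reflect-padded ("centered") frames. It starts from random phases, then alternates between inverting the spectrogram and re-analysing the result, keeping the known magnitudes and only the newly estimated phases.

```python
import numpy as np


def stft(x, n_fft=256, hop=64):
    """Centered STFT with a periodic Hann window (illustrative helper)."""
    window = np.hanning(n_fft + 1)[:-1]
    padded = np.pad(x, n_fft // 2, mode="reflect")
    n_frames = 1 + (len(padded) - n_fft) // hop
    frames = np.stack(
        [padded[i * hop : i * hop + n_fft] for i in range(n_frames)]
    )
    return np.fft.rfft(frames * window, axis=-1)


def istft(spec, n_fft=256, hop=64):
    """Inverse of `stft` by windowed overlap-add (illustrative helper)."""
    window = np.hanning(n_fft + 1)[:-1]
    frames = np.fft.irfft(spec, n=n_fft, axis=-1) * window
    length = n_fft + hop * (len(spec) - 1)
    signal = np.zeros(length)
    norm = np.zeros(length)
    for i, frame in enumerate(frames):
        start = i * hop
        signal[start : start + n_fft] += frame
        norm[start : start + n_fft] += window ** 2
    signal = signal / np.maximum(norm, 1e-8)
    # Drop the padding that `stft` added on both sides.
    return signal[n_fft // 2 : length - n_fft // 2]


def griffin_lim(magnitude, n_iter=50, n_fft=256, hop=64, seed=0):
    """Estimate a waveform whose STFT magnitude approximates `magnitude`."""
    rng = np.random.default_rng(seed)
    # Start from uniformly random phases on the unit circle.
    phase = np.exp(2j * np.pi * rng.random(magnitude.shape))
    for _ in range(n_iter):
        signal = istft(magnitude * phase, n_fft, hop)
        rebuilt = stft(signal, n_fft, hop)
        # Keep the known magnitudes; take only the phase of the re-analysis.
        phase = np.exp(1j * np.angle(rebuilt))
    return istft(magnitude * phase, n_fft, hop)
```

Given a waveform `x`, `griffin_lim(np.abs(stft(x)))` returns a signal with a similar magnitude spectrogram. The phases generally differ from the original, so the waveforms will not match sample for sample. A TensorFlow version would presumably swap these helpers for `tf.signal.stft` and `tf.signal.inverse_stft`.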

yongtang commented 2 years ago

@douglas125 Contributions are definitely welcome!

douglas125 commented 2 years ago

How can I turn this into a feature request to comply with the new feature flow?

yongtang commented 2 years ago

Updated the label of the issue as requested.

douglas125 commented 2 years ago

For what it's worth, I have a working implementation that I'm testing in https://github.com/douglas125/io/blob/feature/griffin-lim/tensorflow_io/python/ops/audio_ops.py

I needed some extra changes so that the result could be written to a wav/mp3 without saturation due to high reconstructed amplitude. I'm looking to write some tests and probably a few examples now (I have them locally).
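The thread does not say which changes the branch makes to avoid saturation. One common approach, shown here purely as an assumption, is to peak-normalize the reconstructed waveform before converting it to 16-bit PCM. The helper names `peak_normalize` and `to_int16` and the default `peak=0.95` are hypothetical.

```python
import numpy as np


def peak_normalize(signal, peak=0.95):
    """Scale a waveform so its largest absolute sample equals `peak`."""
    signal = np.asarray(signal, dtype=np.float64)
    max_abs = np.max(np.abs(signal), initial=0.0)
    if max_abs == 0.0:
        return signal  # silent or empty input: nothing to scale
    return signal * (peak / max_abs)


def to_int16(signal):
    """Convert a float waveform to 16-bit PCM, clipping anything outside [-1, 1]."""
    clipped = np.clip(signal, -1.0, 1.0)
    return np.round(clipped * 32767.0).astype(np.int16)
```

Normalizing before the integer conversion rescales the whole waveform uniformly, whereas clipping alone would flatten only the loudest samples and add distortion.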