Further development | Advice on audio processing

Is any further development planned? I think a solid, JAX-accelerated library for audio augmentation and feature generation would be valuable.

The most common library, audiomentations, runs purely on the CPU; its sister library for PyTorch, torch-audiomentations, runs on the GPU. Torchaudio also has good CPU/GPU support and includes many ops.

For TensorFlow, I could only find tfio.audio, which is of limited use: it covers melspectrogram generation, SpecAugment, and trimming, but few waveform augmentations.

In my use case, I would like to augment the waveforms of my dataset offline and dump the result into TFRecords as melspectrograms. I would then load them into TensorFlow, which I currently use in my project, and apply melspectrogram augmentations on the fly. At the moment, I could use audiomentations with multiprocessing to do all waveform augmentations on the CPU, and then use TFRecords and tf.io while training with TensorFlow.
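
To make the offline waveform step concrete, here is a rough sketch of what I have in mind in JAX. It is only an illustration under my own assumptions: the function names (`add_noise_at_snr`, `augment_one`, `augment_batch`), the SNR and gain ranges, and the sine-wave test input are all made up for the example and are not part of dm_aux or any other library. It only relies on `jax.random`, `jax.vmap`, and `jax.jit`.

```python
import jax
import jax.numpy as jnp


def add_noise_at_snr(key, wave, snr_db):
    """Add white Gaussian noise to a mono waveform at the requested SNR (dB)."""
    noise = jax.random.normal(key, wave.shape, dtype=wave.dtype)
    signal_power = jnp.mean(wave ** 2)
    noise_power = jnp.mean(noise ** 2)
    # Scale the noise so that signal_power / scaled_noise_power == 10 ** (snr_db / 10).
    scale = jnp.sqrt(signal_power / (noise_power * 10.0 ** (snr_db / 10.0) + 1e-12))
    return wave + scale * noise


def augment_one(key, wave, min_snr_db=5.0, max_snr_db=30.0, max_gain_db=6.0):
    """Hypothetical augmentation chain: noise at a random SNR, then a random gain."""
    snr_key, gain_key, noise_key = jax.random.split(key, 3)
    snr_db = jax.random.uniform(snr_key, (), minval=min_snr_db, maxval=max_snr_db)
    gain_db = jax.random.uniform(gain_key, (), minval=-max_gain_db, maxval=max_gain_db)
    noisy = add_noise_at_snr(noise_key, wave, snr_db)
    # Convert the gain from dB to a linear amplitude factor.
    return noisy * 10.0 ** (gain_db / 20.0)


@jax.jit
def augment_batch(key, waves):
    """Apply augment_one to every row of a [batch, samples] array with its own key."""
    keys = jax.random.split(key, waves.shape[0])
    return jax.vmap(augment_one)(keys, waves)


# Example input: two one-second sine waves at an assumed 16 kHz sample rate.
key = jax.random.PRNGKey(0)
t = jnp.linspace(0.0, 1.0, 16000)
waves = jnp.stack([jnp.sin(2 * jnp.pi * 440.0 * t), jnp.sin(2 * jnp.pi * 880.0 * t)])
augmented = augment_batch(key, waves)
```

The result is a regular JAX array, so it could be converted with `numpy.asarray` before computing melspectrograms and writing TFRecords; that part is left out here.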

Could you share how you approach such problems? Is there a widely supported audio-processing library for the JAX/TensorFlow ecosystem?

google-deepmind / dm_aux
