keunwoochoi / kapre

kapre: Keras Audio Preprocessors
MIT License

request for requests #104

Open keunwoochoi opened 4 years ago

keunwoochoi commented 4 years ago

Please leave me any feature request you'd like! It doesn't need to be realistic; literally anything goes.

Path-A commented 4 years ago

Unrealistic requests:

1) A convenient data generator for large audio datasets. This could be a tf.keras.utils.Sequence similar to tf.keras.preprocessing.image.ImageDataGenerator, or something compatible with the newer tf.data.Dataset.from_generator. Perhaps this could have access to some basic augmentations and mixup training (see the sketch after this list).

2) More GPU-accelerated augmentations. Currently I use audiomentations in a custom Keras Sequence (particularly the AddBackgroundNoise augmentation), but it has a decent CPU bottleneck. I know those authors are working on a PyTorch implementation, though.

3) Basic TFLite compatibility. This might become unnecessary once TFLite gets the RFFT op. There is an implementation that gets around that here. Also, the audio_microfrontend function in the experimental section of the TensorFlow repo seems to be working. In any case, it would be nice to be able to tell the model to compile in a TFLite-friendly manner if the model's ops are supported.
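
Not part of kapre today, just a minimal sketch of what the generator in 1) could look like, assuming clips are stored as fixed-length .npy files; the paths, shapes, and mixup handling are all illustrative:

```python
import numpy as np
import tensorflow as tf

class AudioSequence(tf.keras.utils.Sequence):
    """Batches fixed-length audio clips from disk, with optional mixup."""

    def __init__(self, file_paths, labels, batch_size=32, mixup_alpha=0.0):
        self.file_paths = file_paths    # list of .npy paths, one clip each (hypothetical layout)
        self.labels = labels            # one-hot labels, shape (N, n_classes)
        self.batch_size = batch_size
        self.mixup_alpha = mixup_alpha  # 0.0 disables mixup

    def __len__(self):
        return int(np.ceil(len(self.file_paths) / self.batch_size))

    def __getitem__(self, idx):
        sl = slice(idx * self.batch_size, (idx + 1) * self.batch_size)
        x = np.stack([np.load(p) for p in self.file_paths[sl]])  # (batch, time, ch)
        y = self.labels[sl]
        if self.mixup_alpha > 0:
            # mixup: blend the batch with a shuffled copy of itself
            lam = np.random.beta(self.mixup_alpha, self.mixup_alpha)
            perm = np.random.permutation(len(x))
            x = lam * x + (1 - lam) * x[perm]
            y = lam * y + (1 - lam) * y[perm]
        return x, y
```

The same logic could be wrapped in a plain generator function and handed to tf.data.Dataset.from_generator instead.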

Path-A commented 4 years ago

I think this one is already in progress, but here are some possible additional features:

  1. A SpecAugment layer (rough sketch below). Allow masking with a minimum value or the spectrogram mean. Allow a min/max number of masks to apply, and a min/max number of sequential bins per mask. Add TimeWarping (potential TensorFlow function here, PyTorch notebook example here).
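
Not the in-progress implementation, just a rough sketch of what the masking half could look like as a Keras layer (mask counts and widths are illustrative, and TimeWarping is omitted); it assumes the usual kapre output shape (batch, time, freq, channels) and that both masked dimensions exceed max_mask_size:

```python
import tensorflow as tf

class SpecAugmentMask(tf.keras.layers.Layer):
    """Randomly masks frequency and time bands of a (batch, time, freq, ch) spectrogram."""

    def __init__(self, n_freq_masks=2, n_time_masks=2, max_mask_size=8, **kwargs):
        super().__init__(**kwargs)
        self.n_freq_masks = n_freq_masks
        self.n_time_masks = n_time_masks
        self.max_mask_size = max_mask_size

    def call(self, x, training=None):
        if not training:
            return x
        fill = tf.reduce_mean(x)  # mask with the spectrogram mean (could also be a min value)
        time_dim, freq_dim = tf.shape(x)[1], tf.shape(x)[2]
        for axis, dim, n_masks in ((2, freq_dim, self.n_freq_masks),
                                   (1, time_dim, self.n_time_masks)):
            for _ in range(n_masks):
                width = tf.random.uniform([], 1, self.max_mask_size + 1, dtype=tf.int32)
                start = tf.random.uniform([], 0, dim - width, dtype=tf.int32)
                rng = tf.range(dim)
                band = (rng >= start) & (rng < start + width)  # 1-D mask along one axis
                shape = [1, 1, 1, 1]
                shape[axis] = -1
                x = tf.where(tf.reshape(band, shape), fill, x)  # broadcast over the rest
        return x
```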

vincenzodentamaro commented 4 years ago

I would like to have a Continuous Wavelet Transform on GPU. Any ideas? I just found this repo, but it is for TensorFlow: https://github.com/nicolasigor/cmorlet-tensorflow
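
For reference, a bare-bones sketch (not from that repo) of how a CWT can be expressed with plain TensorFlow ops so it runs on GPU: build a bank of complex Morlet kernels and apply them with conv1d. Wavelet parameters here are illustrative:

```python
import numpy as np
import tensorflow as tf

def morlet_bank(scales, width=512, w0=6.0):
    """Builds a (width, 1, 2 * n_scales) conv kernel of real/imag Morlet wavelets."""
    t = np.arange(-width // 2, width // 2)
    kernels = []
    for s in scales:
        psi = np.pi ** -0.25 * np.exp(1j * w0 * t / s) * np.exp(-0.5 * (t / s) ** 2)
        kernels += [psi.real / np.sqrt(s), psi.imag / np.sqrt(s)]  # per-scale normalization
    bank = np.stack(kernels, axis=-1)[:, None, :]  # (width, in_ch=1, out_ch=2*n_scales)
    return tf.constant(bank, dtype=tf.float32)

def cwt_magnitude(signal, kernel):
    """signal: (batch, time, 1) float32 -> (batch, time, n_scales) scalogram magnitudes."""
    resp = tf.nn.conv1d(signal, kernel, stride=1, padding='SAME')
    real, imag = resp[..., 0::2], resp[..., 1::2]  # de-interleave real/imag responses
    return tf.sqrt(real ** 2 + imag ** 2)

# e.g. scalogram = cwt_magnitude(audio, morlet_bank(scales=2.0 ** np.arange(1, 8)))
```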

Yannik1337 commented 4 years ago

I'd like to have mixed-precision compatibility. Currently this fails on the kapre layers, as they need float32 inputs, but mixed precision produces float16 tensors. (At least that is my experience.)
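
Until that's supported properly, one workaround (a sketch, not part of kapre; it assumes the kapre layers forward the standard dtype kwarg to keras.layers.Layer) is to pin the audio layers to float32 and cast back down afterwards:

```python
import tensorflow as tf
from kapre import STFT, Magnitude

tf.keras.mixed_precision.set_global_policy('mixed_float16')

inp = tf.keras.Input(shape=(44100, 1))
# Force the kapre layers to compute in float32 despite the global fp16 policy
x = STFT(n_fft=512, hop_length=256, dtype='float32')(inp)
x = Magnitude(dtype='float32')(x)
# Cast back to float16 so the rest of the network stays in mixed precision
x = tf.keras.layers.Activation('linear', dtype='float16')(x)
```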

DankMinhKhoa commented 4 years ago

Please add a power exponent parameter for the log-mel spectrogram. librosa has this; it's basically just an exponent applied to the amplitude, so it would be nice to have it baked in already :D Thanks for the great work!
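
In the meantime, a sketch of the workaround being described: raise the magnitudes to a power p before the mel/log stages, mirroring librosa's power argument (layer arguments here are illustrative):

```python
import tensorflow as tf
from kapre import STFT, Magnitude

power = 2.0  # librosa's melspectrogram default (power spectrogram)

inp = tf.keras.Input(shape=(44100, 1))
x = Magnitude()(STFT(n_fft=2048, hop_length=512)(inp))
x = tf.keras.layers.Lambda(lambda m: tf.pow(m, power))(x)  # the requested exponent
# ... mel filterbank and log/decibel layers would follow here
```

Baked in, this would presumably just be one more constructor argument on the magnitude layer.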

kenders2000 commented 3 years ago

I am currently working on STFT TFLite compatibility. I have a branch in a fork that is working for this; I am just tidying up and adding unit tests, then I can share.

keunwoochoi commented 3 years ago

Thanks all, and please keep commenting.

But let me confess: I started to use PyTorch as my main DL library for work, and I have less time to invest in Kapre at the moment. That said, I'm still watching Kapre and would love to do code review or any quick fix :)

Yannik1337 commented 3 years ago

> Thanks all, and please keep commenting.
>
> But let me confess: I started to use PyTorch as my main DL library for work, and I have less time to invest in Kapre at the moment. That said, I'm still watching Kapre and would love to do code review or any quick fix :)

If you don't mind, could you elaborate on why you decided to use PyTorch? I've heard several reports from people switching that PyTorch is more intuitive, but I'm looking forward to your answer.

keunwoochoi commented 3 years ago

@Yannik1337 Sure. TF doesn't tell me where the error is exactly, especially when it comes to a real experiment that involves tfrecords, tf.data.Dataset, customized metrics and loss functions, data preprocessing, etc. As a result, quite often I have to go through a clueless debugging process not so different from the first programming class (with C) in my life. With PyTorch, when there's a problem, it tells me exactly what it is and I can go check it out and fix it accordingly, which feels like working with Python. I didn't like the lack of a Keras-equivalent library, but these days pytorch-lightning is mature enough.

Path-A commented 3 years ago

@keunwoochoi Same story for me, although I'm still somewhat stuck with TensorFlow for production, since TFLite / TF Micro are still the best option for edge devices.

I didn't know about pytorch lightning! Thanks for exposing me to that awesome wrapper!

kenders2000 commented 3 years ago

I’m the same: I’m stuck with TF due to TFLite and edge platforms. I have not used PyTorch, but from what I have read and heard from my colleagues, it seems much more intuitive.

I agree about TensorFlow’s debug information, it’s not very helpful!

Yannik1337 commented 3 years ago

@kenders2000 , do you have any available code for this compatible version? I am currently in the process of converting a model that uses kapre layers to the tflite format, but this fails due to unsupported operations.

kenders2000 commented 3 years ago

@Yannik1337 I added these tflite compatible layers to Kapre a while back:

```python
from kapre import STFTTflite, MagnitudeTflite
```

The PR https://github.com/keunwoochoi/kapre/pull/131 includes some additional documentation on how to use them. In summary:

You need to train a model with the normal layers first, then create a new model with the TFLite alternatives and load the trained weights into it. This is because the TFLite layers only support a batch size of 1, which is fine for on-device inference in most use cases but doesn't let you use them for training.
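
Summarizing that pattern in code (a rough sketch; the model body and layer arguments are illustrative, and PR #131 is the authoritative reference):

```python
import tensorflow as tf
from kapre import STFT, Magnitude, STFTTflite, MagnitudeTflite

def build(stft_layer, mag_layer, batch_size=None):
    inp = tf.keras.Input(batch_size=batch_size, shape=(44100, 1))
    x = mag_layer(stft_layer(inp))
    x = tf.keras.layers.GlobalAveragePooling2D()(x)
    out = tf.keras.layers.Dense(10, activation='softmax')(x)
    return tf.keras.Model(inp, out)

# 1) train with the normal layers at any batch size
train_model = build(STFT(n_fft=512, hop_length=256), Magnitude())
# train_model.fit(...)

# 2) rebuild with the TFLite variants at batch size 1 and copy the weights over
lite_model = build(STFTTflite(n_fft=512, hop_length=256), MagnitudeTflite(), batch_size=1)
lite_model.set_weights(train_model.get_weights())

# 3) convert as usual
tflite_bytes = tf.lite.TFLiteConverter.from_keras_model(lite_model).convert()
```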