keunwoochoi opened 4 years ago
Unrealistic requests:
1) A convenient data generator for large audio datasets. This could be a tf.keras.utils.Sequence similar to tf.keras.preprocessing.image.ImageDataGenerator or something compatible with the newer tf.data.Dataset.from_generator. Perhaps this could have access to some basic augmentations and mixup training.
2) More GPU-accelerated augmentations. Currently I use audiomentations in a custom Keras Sequence (particularly the AddBackgroundNoise augmentation), but it has a significant CPU bottleneck. I know those authors are working on a PyTorch implementation, though.
3) Basic TFLite compatibility. This might become unnecessary once TFLite gets the RFFT op. There is an implementation that gets around that here. Also, the audio_microfrontend function in the experimental section of the tensorflow repo seems to be working. In any case, it would be nice to be able to tell the model to compile in a TFLite-friendly manner if the model's ops are supported.
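The data-generator request in 1) can be sketched without TensorFlow at all: below is a minimal numpy-only stand-in for the `tf.keras.utils.Sequence` protocol (`__len__` / `__getitem__`) with a mixup step. The class name, the random "file loader", and the two-class labels are all hypothetical; a real version would subclass `tf.keras.utils.Sequence` and decode actual audio files.

```python
import numpy as np

class AudioSequence:
    """Sketch of the tf.keras.utils.Sequence batching protocol
    (__len__ / __getitem__) for audio, with a simple mixup step.
    Random arrays stand in for decoded audio files."""

    def __init__(self, n_items, n_samples=16000, batch_size=8,
                 mixup_alpha=0.2, seed=0):
        self.n_items = n_items
        self.n_samples = n_samples
        self.batch_size = batch_size
        self.mixup_alpha = mixup_alpha
        self.rng = np.random.default_rng(seed)

    def __len__(self):
        # Number of batches per epoch.
        return int(np.ceil(self.n_items / self.batch_size))

    def _load(self, idx):
        # Placeholder for "read and decode one audio file + label".
        return (self.rng.standard_normal(self.n_samples).astype("float32"),
                idx % 2)

    def __getitem__(self, batch_idx):
        start = batch_idx * self.batch_size
        idxs = range(start, min(start + self.batch_size, self.n_items))
        xs, ys = zip(*(self._load(i) for i in idxs))
        x = np.stack(xs)
        y = np.eye(2, dtype="float32")[list(ys)]  # one-hot labels
        # Mixup: blend each example with a shuffled partner.
        lam = self.rng.beta(self.mixup_alpha, self.mixup_alpha)
        perm = self.rng.permutation(len(x))
        return lam * x + (1 - lam) * x[perm], lam * y + (1 - lam) * y[perm]

seq = AudioSequence(n_items=20)
xb, yb = seq[0]  # first batch: (8, 16000) waveforms, (8, 2) soft labels
```

Keras can consume such an object directly in `model.fit`, and the same `__getitem__` logic drops into `tf.data.Dataset.from_generator` with little change.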
I think this one is already in progress, but here are some possible additional features:
I would like to have a Continuous Wavelet Transform on GPU. Any ideas? I just found this repo, but it is for tensorflow: https://github.com/nicolasigor/cmorlet-tensorflow
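For reference, the transform being requested can be written as a bank of per-scale convolutions with a complex Morlet wavelet; below is a naive CPU sketch in numpy (the normalisation and the 4-sigma support are my own choices, not Kapre's). A GPU/Keras version would express the same thing as a conv1d over the wavelet kernel bank.

```python
import numpy as np

def cwt_morlet(x, scales, w0=6.0):
    """Naive CPU reference of a Continuous Wavelet Transform with a
    complex Morlet wavelet: one convolution per scale."""
    out = np.empty((len(scales), len(x)), dtype=np.complex128)
    for i, s in enumerate(scales):
        # Wavelet support: about +/- 4 standard deviations at this scale.
        t = np.arange(-4 * s, 4 * s + 1)
        psi = np.exp(1j * w0 * t / s) * np.exp(-0.5 * (t / s) ** 2)
        psi /= np.sqrt(s)  # keep response comparable across scales
        # CWT is correlation with the conjugate wavelet.
        out[i] = np.convolve(x, np.conj(psi)[::-1], mode="same")
    return out

# A 25 Hz sinusoid at fs = 1 kHz; its energy should peak near the
# scale matching that frequency (scale ~ w0 * fs / (2 * pi * f) ~ 38).
fs, f = 1000.0, 25.0
t = np.arange(0, 1, 1 / fs)
x = np.sin(2 * np.pi * f * t)
scales = np.arange(5, 80)
W = np.abs(cwt_morlet(x, scales))
best_scale = scales[W.mean(axis=1).argmax()]
```

The per-scale loop is exactly the part that a GPU layer would batch into a single convolution, which is why this fits naturally as a Keras layer.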
I'd like to have mixed-precision compatibility. Currently this fails on the layers, as they need float32 inputs while mixed precision feeds them FP16 (at least that is my experience).
Please add a power exponent parameter for the log-melspectrogram. librosa has this; it's basically just an exponent applied to the amplitude, so it would be nice to have it baked in already :D Thanks for the great work!
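In numpy terms the request above is essentially the following (librosa calls the parameter `power`: 1.0 for an amplitude spectrogram, 2.0 for an energy spectrogram); the function name here is hypothetical, and a Kapre version would apply the same exponent before its dB-scaling layer.

```python
import numpy as np

def log_power_spec(magnitude, power=2.0, amin=1e-10):
    """Raise a magnitude spectrogram to `power`, then convert to dB.
    power=1.0 -> amplitude spectrogram, power=2.0 -> energy."""
    spec = np.abs(magnitude) ** power
    return 10.0 * np.log10(np.maximum(spec, amin))  # amin avoids log(0)

mag = np.array([[1.0, 10.0],
                [0.1, 100.0]])
db_energy = log_power_spec(mag, power=2.0)  # 20*log10(mag)
db_amp = log_power_spec(mag, power=1.0)     # 10*log10(mag)
```

Since the exponent commutes with the log up to a constant factor, baking it in is cheap: the dB output just scales linearly with `power`.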
I am currently working on STFT tflite compatibility. I have a branch in a fork that is working for this; I am just tidying up and adding unit tests, then I can share.
Thanks all, and please keep commenting.
But let me confess: I started to use PyTorch as my main DL library for my work, and there is less time I can invest in Kapre at the moment. That said, I'm still watching Kapre and would love to do code review or any quick fix :)
If you don't mind, could you elaborate on why you decided to use PyTorch? I've heard several reports from people switching that PyTorch is more intuitive, but I'm looking forward to your answer.
@Yannik1337 Sure. TF doesn't tell me where the error is exactly, especially when it comes to a real experiment that involves tfrecords, tf.data.Dataset, customized metrics and loss functions, data preprocessing, etc. As a result, quite often I have to go through a clueless debugging process that is not so different from the first programming class (with C) in my life. With PyTorch, when there's a problem, it tells me exactly what it is, and I can go check and fix it accordingly, which feels like working with Python. I didn't like the lack of a Keras-equivalent library, but these days pytorch-lightning is mature enough.
@keunwoochoi Same story for me, although I'm still somewhat stuck with tensorflow for production due to tflite / tfmicro still being the best for edge devices.
I didn't know about pytorch lightning! Thanks for exposing me to that awesome wrapper!
I'm the same: I'm stuck with TF due to TFLite and edge platforms. I have not used PyTorch, but from what I have read and from my colleagues it seems much more intuitive.
I agree about tensorflow’s debug information, it’s not very helpful!
@kenders2000, do you have any available code for this compatible version? I am currently in the process of converting a model that uses kapre layers to the tflite format, but this fails due to unsupported operations.
@Yannik1337 I added these tflite compatible layers to Kapre a while back:
from kapre import STFTTflite, MagnitudeTflite
The PR https://github.com/keunwoochoi/kapre/pull/131 includes some additional documentation as to how to use them, in summary:
You need to train a model with the normal layers first, then create a new model with the tflite alternatives and load the trained weights into it. This is because the tflite layers only support a batch size of 1, which is fine for inference on devices in most use cases but doesn't let you use them for training.
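The train-then-swap step above can be sketched schematically. The classes below are a hypothetical numpy stand-in for the two Keras models (a real run would build one model with STFT/Magnitude, a second with STFTTflite/MagnitudeTflite, and move the weights with the same get_weights/set_weights calls shown here):

```python
import numpy as np

class DenseModel:
    """Stand-in for a Keras model: one dense layer plus
    get_weights()/set_weights() mirroring the Keras API. In the real
    workflow the two models differ only in their front-end layers and
    in the fixed batch size of 1 for the tflite variant."""

    def __init__(self, in_dim, out_dim, batch_size=None, seed=0):
        rng = np.random.default_rng(seed)
        self.w = rng.standard_normal((in_dim, out_dim))
        self.b = np.zeros(out_dim)
        self.batch_size = batch_size  # None = any; 1 for the tflite model

    def get_weights(self):
        return [self.w.copy(), self.b.copy()]

    def set_weights(self, weights):
        self.w, self.b = weights

    def predict(self, x):
        if self.batch_size is not None:
            assert x.shape[0] == self.batch_size, "tflite layers: batch of 1"
        return x @ self.w + self.b

# 1) "Train" the normal model (its init stands in for trained weights).
trained = DenseModel(4, 3, batch_size=None, seed=1)

# 2) Build the tflite-friendly model with batch size fixed to 1 and
#    load the trained weights into it.
lite = DenseModel(4, 3, batch_size=1, seed=2)
lite.set_weights(trained.get_weights())

# 3) Per-example predictions now match the training-time model.
x = np.random.default_rng(3).standard_normal((5, 4))
batch_out = trained.predict(x)
lite_out = np.vstack([lite.predict(x[i:i + 1]) for i in range(5)])
```

The point of the pattern is that the batch-size restriction lives only in the graph shape, not in the weights, so the transfer is lossless.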
Please leave me any feature request you'd like! It doesn't need to be realistic; literally anything is welcome.