-
Could you please tell us how you generated mel spectrograms for training from .wav files? What were the parameters used?
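A common recipe for this (shown here from scratch in NumPy, with hypothetical parameter values — the repo's actual settings may differ) is to frame the waveform, apply a Hann window, take the power spectrum, and project it through a triangular mel filterbank:

```python
import numpy as np

# Illustrative parameters only; not taken from the repo in question.
SR = 22050        # sample rate (Hz)
N_FFT = 1024      # FFT window size
HOP = 256         # hop between frames
N_MELS = 80       # mel filterbank channels

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(sr=SR, n_fft=N_FFT, n_mels=N_MELS):
    # Triangular filters spaced evenly on the mel scale from 0 to Nyquist.
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(n_mels):
        l, c, r = bins[i], bins[i + 1], bins[i + 2]
        if c > l:
            fb[i, l:c] = (np.arange(l, c) - l) / (c - l)
        if r > c:
            fb[i, c:r] = (r - np.arange(c, r)) / (r - c)
    return fb

def mel_spectrogram(wav, sr=SR, n_fft=N_FFT, hop=HOP, n_mels=N_MELS):
    # Frame, window, and FFT the signal, then apply the mel filterbank.
    window = np.hanning(n_fft)
    n_frames = 1 + max(0, (len(wav) - n_fft) // hop)
    frames = np.stack([wav[i * hop:i * hop + n_fft] * window
                       for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, n=n_fft)) ** 2       # (frames, bins)
    mel = power @ mel_filterbank(sr, n_fft, n_mels).T       # (frames, mels)
    return np.log(np.clip(mel, 1e-5, None)).T               # (mels, frames)
```

In practice most TTS repos compute the same thing via `librosa.feature.melspectrogram`; the parameters that matter for reproducibility are the sample rate, `n_fft`, hop length, number of mel channels, and the log/clipping convention.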
-
@Rayhane-mamah I know you have replaced the CBHG post-net with a convolutional one. But the evaluation results sound no better than the first generation. There is an analysis from @rafaelvalle https://github…
-
Could you please elaborate on why you have not used Learnable_window in STFT, Mel Spectrograms, and MFCC, but did use it in their inverse counterparts?
-
Thank you for open sourcing this great work!
One of the great advantages I see in vocoders operating in the time domain is how easy it is to combine the vocoding task with superresolution. You just…
-
**Package**
lib-classifier
**Feature or Issue Description**
We need an audio and image subject viewer, e.g. for spectrograms. This should be composable from other subject viewers.
~~There…
-
Can we provide our own speaker embeddings to produce mel spectrograms using flowtron rather than use the speaker embeddings generated by flowtron? If yes, how should we normalize those embeddings?
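Whether Flowtron expects a particular normalization is not documented here, but a common convention when swapping in externally computed speaker embeddings is to scale each vector to unit L2 norm so its magnitude matches embeddings the model saw during training. A minimal sketch under that assumption:

```python
import numpy as np

def l2_normalize(emb, eps=1e-8):
    """Scale an embedding vector to unit L2 norm.

    This is a common convention for speaker embeddings; whether
    Flowtron's own embeddings follow it is an assumption here, and
    should be checked against the norms of the model's embedding table.
    """
    emb = np.asarray(emb, dtype=np.float64)
    return emb / max(np.linalg.norm(emb), eps)
```

A quick sanity check is to compare `np.linalg.norm` of your external embeddings against those Flowtron generates, and match the statistics before conditioning on them.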
-
Frequency (kHz) axes need fixing. Some recordings show pipistrelle calls at 300 kHz, others at 450 kHz, etc. The spectrogram scales are inconsistent and incorrect across recordings.
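This symptom is consistent with the axis being derived from each file's sample rate: the top of a spectrogram's frequency axis is the Nyquist frequency sr/2, so recordings captured at different rates plot on different scales unless the axis is computed per file (or the audio is resampled first). A minimal sketch of the per-file axis:

```python
import numpy as np

def fft_bin_freqs(sr, n_fft):
    """Frequencies (Hz) of the rfft bins for a given sample rate.

    The highest bin sits at the Nyquist frequency sr/2, so a detector
    recording at 600 kHz tops out at 300 kHz while one at 900 kHz tops
    out at 450 kHz -- which would explain the mismatched axes.
    """
    return np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
```

Reading `sr` from each file's header and labelling the axis with these values (divided by 1000 for kHz) should keep the scales consistent.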
-
Based on the work of Palanisamy et al. (https://arxiv.org/pdf/2007.11154.pdf), use an ImageNet pre-trained DenseNet for classification on the UrbanSound8K dataset. Based on the empirical finding th…
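One preprocessing step this approach needs is turning a single-channel spectrogram into the three-channel image an ImageNet-pretrained backbone expects. The min-max normalization below is an assumption for illustration; Palanisamy et al. describe their own preprocessing:

```python
import numpy as np

def spec_to_image(spec):
    """Normalize a spectrogram to [0, 1] and tile it to three channels
    so an ImageNet-pretrained CNN (e.g. torchvision's DenseNet) can
    consume it. The min-max scaling here is a simple stand-in, not the
    paper's exact pipeline.
    """
    spec = np.asarray(spec, dtype=np.float32)
    lo, hi = spec.min(), spec.max()
    scaled = (spec - lo) / (hi - lo + 1e-8)
    return np.repeat(scaled[None, :, :], 3, axis=0)  # (3, H, W)
```

The resulting array can be resized and fed to a pretrained model whose final classifier layer has been replaced with a 10-way head for UrbanSound8K.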
-
from __future__ import print_function

import numpy as np
import tensorflow as tf

from hyperparams import Hyperparams as hp
from data_load import load_data
from train import Graph
from utils im…