error in mel-scale computation that results in different outcomes in python and python3 & conversion to the decibel scale

domerin0 / rnn-speech

Character level speech recognizer using ctc loss with deep rnns in TensorFlow.

MIT License

77 stars, 31 forks

Hi,

While reviewing your project code I came across a couple of things (see the subject line) that may benefit from correction.
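To make the two points concrete, here is a minimal sketch. It is not the project's code, and the issue text does not show the offending lines. It assumes that the Python 2 / Python 3 discrepancy comes from `/` acting as floor division on integers in Python 2, and that the decibel question is about the 10·log10 (power) versus 20·log10 (amplitude) convention. The function names, the common 2595·log10(1 + f/700) mel formula, and the `eps` clamp are all illustrative choices.

```python
import math


def hz_to_mel(freq_hz):
    """Common mel formula, written with floats so the division is true division."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def hz_to_mel_floor_division(freq_hz):
    """Same formula with floor division, mimicking Python 2's `/` on two ints.

    Assumption: this is only a guess at how the results could diverge.
    """
    return 2595.0 * math.log10(1.0 + freq_hz // 700)


def power_to_db(power, ref=1.0, eps=1e-10):
    """Decibels for a power quantity: 10 * log10(P / ref), clamped to avoid log(0)."""
    return 10.0 * math.log10(max(power, eps) / ref)


def amplitude_to_db(amplitude, ref=1.0, eps=1e-10):
    """Decibels for an amplitude quantity: 20 * log10(A / ref), clamped to avoid log(0)."""
    return 20.0 * math.log10(max(amplitude, eps) / ref)


if __name__ == "__main__":
    # With an integer frequency the two variants disagree noticeably.
    print(hz_to_mel(1000), hz_to_mel_floor_division(1000))
    # A power ratio of 100 and an amplitude ratio of 10 are the same level in dB.
    print(power_to_db(100.0), amplitude_to_db(10.0))
```

If this assumption holds, writing the constants as floats (or adding `from __future__ import division` under Python 2) would make both interpreters agree.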

By the way, is there a specific reason for up-sampling the audio to 22050 Hz when it is read in? I can think of a few possible reasons (e.g. bilinear filters distort high frequencies; to avoid imaging one needs to cut the band with a low-pass filter, which either leaves a gap or must be very sharp, i.e. high-order, etc.). However, that practice is rather unusual.

I'm new to GitHub. If you see that I'm doing things the wrong way, please do not hesitate to point it out.

Best! AI


error in mel-scale computation that results in different outcomes in python and python3 & conversion to the decibel scale #45