llSourcell / tensorflow_speech_recognition_demo

This is the code for 'How to Make a Simple Tensorflow Speech Recognizer' by @Sirajology on Youtube

missing predict.py #29

Open linksyncjameshwartlopez opened 6 years ago

linksyncjameshwartlopez commented 6 years ago

I wonder where the predict.py file is that was mentioned in https://www.youtube.com/watch?v=u9FPqkuoEJ8 at 6:41 of the video.

hcchengithub commented 6 years ago

There is no such file, but it can be done manually:

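import librosa
import numpy as np

model.load(pathname)  # restore the weights of an already trained model
y, sr = librosa.load("PathName.wav", mono=True)
mfcc = librosa.feature.mfcc(y=y, sr=sr)  # shape: (20, n_frames)
# Zero-pad the time axis to 80 frames (assumes the clip has at most 80)
MFCC = np.pad(mfcc, ((0, 0), (0, 80 - len(mfcc[0]))), mode='constant', constant_values=0)
model.predict([MFCC])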

https://github.com/hcchengithub/tensorflow_speech_recognition_demo
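
The `np.pad` call in the snippet only works when the clip has at most 80 frames; for a longer clip the pad width goes negative and NumPy raises a `ValueError`. A minimal sketch of a pad-or-truncate helper follows. The name `fit_to_length` and its parameters are hypothetical and not part of the demo, and it works on plain nested lists so that it runs without NumPy:

```python
def fit_to_length(features, target_len=80, fill=0.0):
    """Return a copy of `features` with every row exactly `target_len` long.

    `features` is a sequence of rows (one row per MFCC coefficient), each
    row holding one value per audio frame. Short rows are padded on the
    right with `fill`; long rows are truncated.
    """
    fitted = []
    for row in features:
        clipped = list(row[:target_len])  # drop frames beyond target_len
        clipped.extend([fill] * (target_len - len(clipped)))  # pad the rest
        fitted.append(clipped)
    return fitted


# Two coefficients with three frames each, padded out to five frames.
short_clip = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
print(fit_to_length(short_clip, target_len=5))

# A clip that is too long is cut to 80 frames instead of raising an error.
long_clip = [list(range(100))]
print(len(fit_to_length(long_clip)[0]))
```

With an MFCC array from librosa, one option would be to pass `mfcc.tolist()` into the helper and hand the result to `model.predict([...])`. Truncating simply discards the tail of the utterance, which may or may not be acceptable for a given recording.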

linksyncjameshwartlopez commented 6 years ago

Should it be something like this? I'm only guessing because I don't know how to build the model variable. Kindly correct me if I'm wrong.

from __future__ import division, print_function, absolute_import

import librosa
import numpy as np
import tensorflow as tf
import tflearn

import speech_data

learning_rate = 0.0001
training_iters = 300000  # steps
batch_size = 64

width = 20  # mfcc features
height = 80  # (max) length of utterance
classes = 10  # digits

batch = word_batch = speech_data.mfcc_batch_generator(batch_size)
X, Y = next(batch)
trainX, trainY = X, Y
testX, testY = X, Y  # overfit for now

net = tflearn.input_data([None, width, height])
net = tflearn.lstm(net, 128, dropout=0.8)
net = tflearn.fully_connected(net, classes, activation='softmax')
net = tflearn.regression(net, optimizer='adam', learning_rate=learning_rate, loss='categorical_crossentropy')

col = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES)
for x in col:
    tf.add_to_collection(tf.GraphKeys.VARIABLES, x)

model = tflearn.DNN(net, tensorboard_verbose=0)
model.load(pathname)
y, sr = librosa.load("PathName.wav", mono=True)
mfcc = librosa.feature.mfcc(y=y, sr=sr)
MFCC = np.pad(mfcc, ((0, 0), (0, 80 - len(mfcc[0]))), mode='constant', constant_values=0)
model.predict([MFCC])

hcchengithub commented 6 years ago

No, you need to know how to use tflearn's model.save() and model.load() to save and restore your trained network. Finish training the model first and save the network, so that you don't have to retrain every time you only want to try a few predictions. Neither step is difficult; do it correctly and it will work. My snippet assumes you already have the saved network and shows how to do the prediction from there.
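
The save-then-restore workflow described here could be sketched roughly as follows. The network layout is the one from the demo script; the function names, the checkpoint file name and the epoch count are assumptions made for illustration, not something the repository provides. `tflearn` is imported inside `build_model` only so that the file can be loaded on a machine without TensorFlow installed:

```python
WIDTH = 20      # MFCC coefficients per frame, as in the demo
HEIGHT = 80     # maximum utterance length in frames
CLASSES = 10    # spoken digits 0-9
CHECKPOINT = "speech_digits.tflearn"  # hypothetical file name


def build_model(learning_rate=0.0001):
    """Build the demo's LSTM network and wrap it in a tflearn.DNN."""
    import tflearn  # imported lazily; requires tflearn and TensorFlow 1.x

    net = tflearn.input_data([None, WIDTH, HEIGHT])
    net = tflearn.lstm(net, 128, dropout=0.8)
    net = tflearn.fully_connected(net, CLASSES, activation="softmax")
    net = tflearn.regression(net, optimizer="adam",
                             learning_rate=learning_rate,
                             loss="categorical_crossentropy")
    return tflearn.DNN(net, tensorboard_verbose=0)


def train_and_save(train_x, train_y, epochs=10):
    """Run once: fit the network, then write its weights to CHECKPOINT."""
    model = build_model()
    model.fit(train_x, train_y, n_epoch=epochs, show_metric=True,
              batch_size=64)
    model.save(CHECKPOINT)
    return model


def load_for_prediction():
    """Run later: rebuild the same network, then restore the saved weights."""
    model = build_model()
    model.load(CHECKPOINT)
    return model
```

The point of the sketch is that `model` is always created the same way, by rebuilding the identical network, and `model.load()` only fills in the weights. The two functions are meant for separate runs (a training script and a prediction script); building the network twice in one process may give the second copy different variable names, so the saved weights would no longer match.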

linksyncjameshwartlopez commented 6 years ago

Thanks @hcchengithub, one last question: how do you instantiate the model variable in your code snippet? I tried to look it up and arrived at this: https://stackoverflow.com/questions/45099608/tflearn-issue-over-model-load?answertab=active#tab-top.

Or am I doing it wrong? I'm so new to this that I don't know how to load a previously saved tflearn model. I hope you can help me with this :). I'll keep digging to see if I can find the answer myself. It seems obvious to someone who has already loaded a saved model, but to someone who hasn't done it yet it is really vague.

hcchengithub commented 6 years ago

This demo is not really suitable for complete newcomers. It's incomplete, incorrect and useless, really. Don't waste your time here. For general AI learning, I suggest http://www.hvass-labs.org/ or Google developer Josh Gordon's {Machine Learning} Recipes. For speech recognition, jump to Mozilla DeepSpeech.

mohsin671 commented 6 years ago

Hi @hcchengithub, were you able to run the code? I am getting this error: ValueError: At least two variables have the same name: FullyConnected/W