rolczynski / Automatic-Speech-Recognition

🎧 Automatic Speech Recognition: DeepSpeech & Seq2Seq (TensorFlow)
GNU Affero General Public License v3.0

How to test the model which I have generated with new dataset. #35

Open animeshkalita82 opened 3 years ago

animeshkalita82 commented 3 years ago

Hi Mr. Rolczynski, I have tried to train a new model on a new training dataset. The code is below; it is the same as the example you give on GitHub:

import numpy as np
import tensorflow as tf
import automatic_speech_recognition as asr

dataset = asr.dataset.Audio.from_csv('C:/Users/XXXXX/Automatic-Speech-Recognition-master/84-121123-dev.csv', batch_size=25)
dev_dataset = asr.dataset.Audio.from_csv('C:/Users/XXXXX/Automatic-Speech-Recognition-master/84-121550-dev.csv', batch_size=25)

alphabet = asr.text.Alphabet(lang='en')
features_extractor = asr.features.FilterBanks(
    features_num=160,
    winlen=0.02,
    winstep=0.01,
    winfunc=np.hanning
)
model = asr.model.get_deepspeech2(
    input_dim=160,
    output_dim=29,
    rnn_units=800,
    is_mixed_precision=False
)
optimizer = tf.optimizers.Adam(
    lr=1e-4,
    beta_1=0.9,
    beta_2=0.999,
    epsilon=1e-8
)
decoder = asr.decoder.GreedyDecoder()
pipeline = asr.pipeline.CTCPipeline(
    alphabet, features_extractor, model, optimizer, decoder
)
pipeline.fit(dataset, dev_dataset, epochs=25)
pipeline.save('C:/Users/XXXX/Automatic-Speech-Recognition-master/Automatic-Speech-Recognition-master/automatic_speech_recognition/checkpoint/')

The training produced the following files in the checkpoint directory: 'alphabet.bin', 'decoder.bin', 'feature_extractor.bin', and 'model.h5'.

My question is how to load the model I have just created. I believe the code you provided for testing a pretrained model (below) works only with the DeepSpeech model and not with my own model.

file = 'to/test/sample.wav'  # sample rate 16 kHz, and 16 bit depth
sample = asr.utils.read_audio(file)
pipeline = asr.load('deepspeech2', lang='en')
pipeline.model.summary()  # TensorFlow model
sentences = pipeline.predict([sample])

Can you please help me resolve this? I really appreciate your effort in helping a larger audience learn how speech-to-text recognition works.
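The prediction snippet in this post notes that the test sample should be 16 kHz with 16 bit depth. A minimal sketch for checking a file against that format before passing it to the pipeline, using only Python's standard `wave` module, might look like this. The helper names `describe_wav` and `matches_expected_format` are hypothetical and not part of the library.

```python
import wave


def describe_wav(path):
    """Return (sample_rate_hz, bit_depth, channels) for a PCM WAV file."""
    with wave.open(path, 'rb') as wav:
        # getsampwidth() is in bytes per sample, so multiply by 8 for bits.
        return wav.getframerate(), wav.getsampwidth() * 8, wav.getnchannels()


def matches_expected_format(path, sample_rate=16000, bit_depth=16):
    """Check a WAV file against the 16 kHz / 16 bit format noted in the post."""
    rate, depth, _channels = describe_wav(path)
    return rate == sample_rate and depth == bit_depth
```

A file that fails this check would probably need resampling or conversion first; how the library itself reacts to other formats is not stated in this thread.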

askinucuncu commented 3 years ago

Is there an explanatory document on how to use the produced model file?
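One possible approach, sketched here under stated assumptions rather than confirmed by the maintainer, is to rebuild the pipeline exactly as it was built for training and load the saved weights into the model with Keras's `load_weights`. The sketch assumes that `model.h5` matches the architecture returned by `asr.model.get_deepspeech2` with the same arguments, and that the default alphabet, feature extractor, and decoder are the ones used in training. The helper names `missing_checkpoint_files` and `load_trained_pipeline` are hypothetical. The library may also provide its own loader for a saved pipeline; that is not confirmed in this thread.

```python
import os

# File names reported in this thread as written by pipeline.save().
EXPECTED_FILES = ('alphabet.bin', 'decoder.bin',
                  'feature_extractor.bin', 'model.h5')


def missing_checkpoint_files(checkpoint_dir):
    """Return the expected checkpoint files that are absent from the directory."""
    return [name for name in EXPECTED_FILES
            if not os.path.isfile(os.path.join(checkpoint_dir, name))]


def load_trained_pipeline(checkpoint_dir):
    """Rebuild the training-time pipeline and load the saved model weights."""
    missing = missing_checkpoint_files(checkpoint_dir)
    if missing:
        raise FileNotFoundError(f'Missing checkpoint files: {missing}')

    # Imported lazily so the file check works without the heavy dependencies.
    import numpy as np
    import tensorflow as tf
    import automatic_speech_recognition as asr

    # Same components and arguments as in the training code from the post.
    alphabet = asr.text.Alphabet(lang='en')
    features_extractor = asr.features.FilterBanks(
        features_num=160, winlen=0.02, winstep=0.01, winfunc=np.hanning
    )
    model = asr.model.get_deepspeech2(
        input_dim=160, output_dim=29, rnn_units=800, is_mixed_precision=False
    )
    # Assumption: model.h5 holds weights for exactly this architecture.
    model.load_weights(os.path.join(checkpoint_dir, 'model.h5'))
    optimizer = tf.optimizers.Adam(
        lr=1e-4, beta_1=0.9, beta_2=0.999, epsilon=1e-8
    )
    decoder = asr.decoder.GreedyDecoder()
    return asr.pipeline.CTCPipeline(
        alphabet, features_extractor, model, optimizer, decoder
    )


# Usage (the checkpoint path is a placeholder):
# pipeline = load_trained_pipeline('path/to/checkpoint')
# sample = asr.utils.read_audio('to/test/sample.wav')
# sentences = pipeline.predict([sample])
```

If the model was trained with a different `rnn_units` value or feature count, the same values would have to be passed here, because loading the weights depends on the layer shapes matching.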