Hi Mr. Rolczynski,
I have tried to generate a new model with a new training dataset(code is as below which is same as you have mentioned in github)
dataset = asr.dataset.Audio.from_csv('C:/Users/XXXXX/Automatic-Speech-Recognition-master/84-121123-dev.csv', batch_size=25)
dev_dataset = asr.dataset.Audio.from_csv('C:/Users/XXXXX/Automatic-Speech-Recognition-master/84-121550-dev.csv', batch_size=25)
The training resulted in some files, which are as below in the checkpoint directory
'alphabet.bin, , 'decoder.bin', 'feature_extractor.bin', and 'model.h5'
But my question is how to load the model which i have just created. I believe the code which you have provided to test a pre -trained model (below) works only with deep speech model and not my own model.
file = 'to/test/sample.wav' # sample rate 16 kHz, and 16 bit depth
sample = asr.utils.read_audio(file)
pipeline = asr.load('deepspeech2', lang='en')
pipeline.model.summary() # TensorFlow model
sentences = pipeline.predict([sample])
Can you please help me to resolve this. I really appreciate your effort in helping the larger audience to get the knowledge of how speech to text recognition works.
Hi Mr. Rolczynski, I have tried to generate a new model with a new training dataset(code is as below which is same as you have mentioned in github) dataset = asr.dataset.Audio.from_csv('C:/Users/XXXXX/Automatic-Speech-Recognition-master/84-121123-dev.csv', batch_size=25) dev_dataset = asr.dataset.Audio.from_csv('C:/Users/XXXXX/Automatic-Speech-Recognition-master/84-121550-dev.csv', batch_size=25)
alphabet = asr.text.Alphabet(lang='en') features_extractor = asr.features.FilterBanks( features_num=160, winlen=0.02, winstep=0.01, winfunc=np.hanning ) model = asr.model.get_deepspeech2( input_dim=160, output_dim=29, rnn_units=800, is_mixed_precision=False ) optimizer = tf.optimizers.Adam( lr=1e-4, beta_1=0.9, beta_2=0.999, epsilon=1e-8 ) decoder = asr.decoder.GreedyDecoder() pipeline = asr.pipeline.CTCPipeline( alphabet, features_extractor, model, optimizer, decoder ) pipeline.fit(dataset, dev_dataset, epochs=25) pipeline.save('C:/Users/XXXX/Automatic-Speech-Recognition-master/Automatic-Speech-Recognition-master/automatic_speech_recognition/checkpoint/')
The training resulted in some files, which are as below in the checkpoint directory 'alphabet.bin, , 'decoder.bin', 'feature_extractor.bin', and 'model.h5'
But my question is how to load the model which i have just created. I believe the code which you have provided to test a pre -trained model (below) works only with deep speech model and not my own model.
file = 'to/test/sample.wav' # sample rate 16 kHz, and 16 bit depth sample = asr.utils.read_audio(file) pipeline = asr.load('deepspeech2', lang='en') pipeline.model.summary() # TensorFlow model sentences = pipeline.predict([sample])
Can you please help me to resolve this. I really appreciate your effort in helping the larger audience to get the knowledge of how speech to text recognition works.