zzw922cn / Automatic_Speech_Recognition

End-to-end Automatic Speech Recognition for Mandarin and English in TensorFlow
MIT License
2.84k stars, 538 forks

Provides PER calculation based on numpy arrays and has the ability to merge phonemes from 61 to 39 #15

Closed brianlan closed 7 years ago

brianlan commented 7 years ago

Sample usage:

from utils.PER_merge_phn import SparseTensor, calc_PER

_, batch_loss, batch_pred = sess.run([optimizer, loss_train, pred_train], feed_dict={...})

per = calc_PER(SparseTensor(batch_pred.indices, batch_pred.values, batch_pred.dense_shape),
               SparseTensor(batch_truth_indices, batch_truth_vals, batch_truth_shape))
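
For context, the (indices, values, dense_shape) triple used above describes a batch of variable-length label sequences. Below is a minimal sketch of how such a triple could be unpacked into one list per utterance with numpy. The helper name `sparse_to_sequences` is hypothetical and is not taken from utils.PER_merge_phn, and the sketch assumes a 2-D sparse layout of (batch row, time step).

```python
import numpy as np

def sparse_to_sequences(indices, values, dense_shape):
    """Group the values of a 2-D sparse tensor into one list per batch row.

    Hypothetical helper for illustration; assumes indices are (row, step) pairs.
    """
    indices = np.asarray(indices, dtype=np.int64).reshape(-1, 2)
    values = np.asarray(values)
    sequences = [[] for _ in range(int(dense_shape[0]))]
    # lexsort uses the last key as the primary key: sort by row, then by step,
    # so each sequence comes out in time order even if the input is shuffled.
    order = np.lexsort((indices[:, 1], indices[:, 0]))
    for (row, _step), val in zip(indices[order], values[order]):
        sequences[int(row)].append(int(val))
    return sequences
```

Applied to both the prediction and the ground truth, this gives two lists of label sequences that can then be compared pairwise with an edit distance.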
zzw922cn commented 7 years ago

@brianlan Hi, sorry for only seeing this now. Have you tried integrating this code into the main program?

brianlan commented 7 years ago

@zzw922cn Yes, I've tried it in the main program and it works fine. Taking timit_train.py as an example, we can simply import it:

from utils.PER_merge_phn import SparseTensor, calc_PER

and then change

er = get_edit_distance([pre.values], [y.values], True, level)

to

er = calc_PER(SparseTensor(pre.indices, pre.values, pre.dense_shape),
              SparseTensor(y.indices, y.values, y.dense_shape))
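
For readers who want to see what a PER computation with phoneme merging involves, here is a rough sketch. It is not the code in utils.PER_merge_phn: the names `FOLD`, `fold`, `edit_distance`, and `phoneme_error_rate` are hypothetical, the mapping is only a small illustrative subset of the usual 61-to-39 TIMIT folding, and it works on phoneme strings rather than integer label IDs. Aggregating errors over the whole batch before dividing is also an assumption.

```python
import numpy as np

# Illustrative subset of the 61-to-39 folding (assumption: not the full table).
FOLD = {"ao": "aa", "ax": "ah", "ix": "ih", "zh": "sh",
        "el": "l", "en": "n", "hv": "hh", "ux": "uw"}

def fold(phonemes):
    """Map each phoneme to its merged class; unknown phonemes pass through."""
    return [FOLD.get(p, p) for p in phonemes]

def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (dynamic programming)."""
    d = np.zeros((len(ref) + 1, len(hyp) + 1), dtype=np.int64)
    d[:, 0] = np.arange(len(ref) + 1)  # delete every reference symbol
    d[0, :] = np.arange(len(hyp) + 1)  # insert every hypothesis symbol
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i, j] = min(d[i - 1, j] + 1,         # deletion
                          d[i, j - 1] + 1,         # insertion
                          d[i - 1, j - 1] + cost)  # substitution or match
    return int(d[-1, -1])

def phoneme_error_rate(refs, hyps):
    """Total edit distance after folding, divided by total reference length."""
    total_err = sum(edit_distance(fold(r), fold(h)) for r, h in zip(refs, hyps))
    total_len = sum(len(r) for r in refs)
    return total_err / total_len if total_len else 0.0
```

With this sketch, a prediction of "aa" against a reference of "ao" counts as correct, because both fold to the same class before the distance is computed.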