jiangj-dc opened 2 years ago
Does the CPU decoding use Intel MKL? Is there an option to do parallel decoding?
For the neural network part, the computation is done by PyTorch, so it depends on whether your PyTorch build uses MKL. There are also options in PyTorch to control parallel computation on the CPU.
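A quick way to check whether your PyTorch build links MKL and to control CPU parallelism (a sketch; the thread count of 4 is just an example, tune it to your machine):

```python
import torch

# True if this PyTorch build was compiled against Intel MKL.
print(torch.backends.mkl.is_available())

# Intra-op parallelism: number of threads used inside individual
# ops such as matrix multiplication in the neural network forward pass.
torch.set_num_threads(4)
print(torch.get_num_threads())  # → 4
```

Increasing the thread count only speeds up the neural-network part; the FSA decoding part is unaffected.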
For the FSA decoding part, there are no linear algebra operations, so MKL does not play a role there. Also, on CPU it processes each utterance in a batch sequentially.
Here is a speed comparison using CPU. Any suggestions to improve decoding speed? Thanks.
The time taken in k2 would most likely be affected most strongly by the beam and max_active_states; those would be the first things to tune.
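One way to tune them is a small grid sweep from tight to loose pruning, stopping once WER plateaus. A minimal sketch, where `decode_and_score` is a hypothetical stand-in for your actual decoding run:

```python
import itertools
import time

def decode_and_score(search_beam, max_active_states):
    """Hypothetical stand-in for a real decoding run.

    Replace the body with a call that decodes a held-out set with the
    given pruning parameters and returns (elapsed_seconds, wer_percent).
    """
    start = time.perf_counter()
    # ... run your k2-based decoding here ...
    elapsed = time.perf_counter() - start
    wer = 0.0  # placeholder
    return elapsed, wer

# Sweep from aggressive to mild pruning; smaller values = faster decoding,
# at some point at the cost of accuracy.
for beam, max_active in itertools.product([10, 15, 20], [1000, 5000, 10000]):
    seconds, wer = decode_and_score(beam, max_active)
    print(f"search_beam={beam} max_active_states={max_active} "
          f"time={seconds:.2f}s WER={wer:.2f}%")
```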
The speed is now comparable if search_beam = 10 and max_active_states = 1000, without WER degradation. Alternatively, decreasing the BPE vocab_size from 1000 to 500 also gives good speed. Thanks!
Great! But the speed is now comparable to what?
The speed was compared to the LibriSpeech (960 hours) example experiment, where the k2 RTF was about 0.07 (the first row in the table above). For a different dataset, the k2 RTF was about 0.38 (the second row in the table above) and is now 0.07.
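For reference, RTF (real-time factor) is decoding time divided by audio duration, so RTF 0.07 means one hour of audio decodes in about 4.2 minutes. A sketch of the computation (the 252-second figure is just an illustrative example):

```python
def rtf(decode_seconds, audio_seconds):
    """Real-time factor: processing time per second of audio."""
    return decode_seconds / audio_seconds

# E.g. decoding 1 hour (3600 s) of audio in 252 s gives RTF = 0.07:
print(round(rtf(252.0, 3600.0), 2))  # → 0.07
```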