Closed: 12345data closed this issue 3 years ago
There is obviously some overhead when running this for each sequence independently, so doing the whole TF initialization only once and then running it for multiple sequences in a row should speed things up. Other than that, I guess the main bottleneck is running DeepSpeech. I don't know whether there is a DeepSpeech variant that is real-time capable; if there is, replacing the current one and retraining the model could be an option.
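The "initialize once, run many sequences" idea can be sketched roughly as below. This is only an illustration of the pattern, not the repository's code: `ExpensiveModel`, `predict`, `run_independently`, and `run_with_shared_model` are hypothetical names, and the constructor merely stands in for whatever TF and DeepSpeech setup is being repeated.

```python
class ExpensiveModel:
    """Hypothetical stand-in for the costly TF / DeepSpeech initialization."""

    load_count = 0  # counts how many times the setup ran

    def __init__(self):
        # Assumption: in the real pipeline this is where the graph and
        # weights would be loaded, which is the part worth doing only once.
        ExpensiveModel.load_count += 1

    def predict(self, audio):
        # Placeholder computation; the real model would output meshes.
        return [2 * sample for sample in audio]


def run_independently(sequences):
    # Pays the initialization cost once per sequence.
    return [ExpensiveModel().predict(seq) for seq in sequences]


def run_with_shared_model(sequences):
    # Pays the initialization cost once, then reuses the same model.
    model = ExpensiveModel()
    return [model.predict(seq) for seq in sequences]
```

Both functions return the same predictions; the only difference is how often the setup runs, so the saving grows with the number of sequences processed in one go.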
I have tried it with a single TF initialization. However, one of the main functions that takes time is 'render_mesh_helper' in the file rendering.py. Can the work done in that function be sped up using any other method? The time taken to complete the 'render_sequence_meshes' function is really high. Do you have any suggestions for reducing it?
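For anyone who wants to check where the time goes before optimizing, a minimal profiling sketch using the standard library's `cProfile` might look like this. The helper `profile_call` and the two `fake_render_*` functions are hypothetical stand-ins; they are not the functions from rendering.py, which would be passed to `profile_call` in their place.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, **kwargs):
    """Run func under cProfile; return its result and a text report."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Sort by cumulative time so the slowest call chains appear first.
    stats.sort_stats("cumulative").print_stats(10)
    return result, stream.getvalue()


def fake_render_frame(index):
    # Hypothetical stand-in for a per-frame rendering helper.
    return sum(i * i for i in range(1000)) + index


def fake_render_sequence(num_frames):
    # Hypothetical stand-in for a function that renders every frame.
    return [fake_render_frame(i) for i in range(num_frames)]


if __name__ == "__main__":
    frames, report = profile_call(fake_render_sequence, 5)
    print(report)
```

The report lists cumulative time per function, which should show whether the per-frame rendering helper really dominates the total.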
You are right; this is not even part of the method, just the rendering of the meshes to a video sequence. I don't have any publicly available fast rendering method in mind right now.
The time taken to produce a response from the input audio is high. Can you provide some suggestions for making it closer to real time? Thanks in advance!