zyjcsf closed this issue 1 year ago
Hi @zyjcsf, it takes around 20 minutes to run the whole evaluation on the LRS3 test set (0.9 hours) on a GPU, so the real-time factor (RTF) of our model should not be high. Because our lip-reading models are trained in a non-streaming setting, achieving real-time or near-real-time prediction requires modifying either the model (training it in a streaming fashion) or the evaluation process. Please check the following steps to use an offline lip-reading model for streaming prediction.
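As a rough check on the figures quoted in this reply, RTF is usually taken as processing time divided by the duration of the media processed. The sketch below only does that arithmetic with the two numbers given here (about 20 minutes of compute for 0.9 hours of test data); the function name `real_time_factor` is made up for illustration and is not part of the repository.

```python
def real_time_factor(processing_seconds: float, media_seconds: float) -> float:
    """Return processing time divided by media duration (hypothetical helper)."""
    if media_seconds <= 0:
        raise ValueError("media_seconds must be positive")
    return processing_seconds / media_seconds


# Figures quoted in the reply: ~20 minutes of GPU time for 0.9 hours of video.
processing = 20 * 60    # 1200 seconds
media = 0.9 * 3600      # 3240 seconds

rtf = real_time_factor(processing, media)
print(f"RTF = {rtf:.2f}")  # prints "RTF = 0.37"
```

An RTF below 1 means the model processes video faster than it plays. That is a throughput figure for offline evaluation, not a latency figure: a non-streaming model still needs the full utterance before it can emit a prediction.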
Hi, I'm a beginner in lip reading. I'm curious how low the latency of lip reading can be. Is there any way to reduce the delay? Thank you very much.