-
Dear Author,
Thank you for your amazing work.
I am really interested in running inference with your model on my custom dataset, but currently I am unable to understand the role of the MFA and how to do inf…
-
Hi.
Did you publish the code for the Knowledge Distillation loss? I couldn't find it in the code.
If it is not there, could you please publish the code?
Thanks
-
E:\AudioText\CycleLip-Project-main\LipVoicer>python inference_real_video.py
No module named 'ctcdecode'
melgen:
_name_: melgen
in_channels: 80
out_channels: 80
diffusion_step_embed_dim_i…
-
For the creation of a lip-reading dataset in SignWriting, we need to map IPA symbols to SignWriting.
The project will go like this:
1. Collect sign language videos with a known language (e.g. Engl…
-
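In the paper, ExpNet is trained by feeding the deep3d rendered frames and the audio as separate inputs to the lipreading model, which outputs predicted logits for each, and then computing a cross-entropy loss between them. When reproducing this, however, I found the cross-entropy loss to be absurdly large. In my tests, roughly 1 s of video (i.e. 25 frames) is needed before the logits have any chance of aligning with those predicted from the audio.
Is the lipreading loss really computed from only 5 rendered frames? Or is a sampling strategy used to select the 5 frames?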
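To make the question concrete, here is a minimal sketch of the loss as I understand it. The function names are hypothetical, and treating the audio branch as a hard target (its argmax class) is my assumption, not something I could confirm from the paper or the released code. Each entry in the inputs stands for one logit vector from the lipreading model.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over one list of class scores."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]

def lipreading_cross_entropy(video_logits, audio_logits):
    """Mean cross-entropy of the video-branch logits against audio-branch targets.

    Assumption: the audio branch supplies a hard pseudo-label (argmax class)
    for each logit vector. This is an illustration, not the paper's recipe.
    """
    if len(video_logits) != len(audio_logits):
        raise ValueError("both branches must yield the same number of logit vectors")
    total = 0.0
    for v, a in zip(video_logits, audio_logits):
        target = max(range(len(a)), key=a.__getitem__)  # pseudo-label from audio
        total -= log_softmax(v)[target]
    return total / len(video_logits)

# Toy data: 5 logit vectors over 3 classes.
audio = [[4.0, 0.0, 0.0]] * 5
agree = [[4.0, 0.0, 0.0]] * 5      # video branch agrees with audio
disagree = [[0.0, 4.0, 0.0]] * 5   # video branch prefers another class

loss_agree = lipreading_cross_entropy(agree, audio)
loss_disagree = lipreading_cross_entropy(disagree, audio)
```

In this toy setup the loss grows quickly once the two branches disagree, which matches what I see when the rendered-frame window is too short for the lipreading model to produce meaningful logits.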
-
### Question
As the title says, I am having trouble managing and sending frames over a socket connection to a server that processes them in a Flask backend. I tried going throug…
-
## Request for Mozilla Position on an Emerging Web Specification
* Specification Title: Web Speech API - SpeechRecognition API
* Specification or proposal URL: https://w3c.github.io/speech-api/
*…
-
Is there still any way to obtain the LRW-1000 dataset?
-
Hi,
My name is David and, first of all, congratulations on your good work. It is really interesting.
I am also working on lipreading and I would like to do a similar case study for my PhD thesi…
-
Hi, thanks for providing such extensive code and models for avhubert. Setting up the finetuning worked like a charm! 🙏
However, I have a few questions about some of the hyperparameters from the pr…