-
In the paper, it is shown that MuseTalk training was performed on 2 NVIDIA H20 GPUs, and the UNet model was initially trained with L1 and perceptual losses for 200,000 steps. However, the paper doesn't …
-
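Hello, we found a problem in our deployment: in the VAE model, copying data from the GPU to the CPU takes a huge amount of time. Put simply, without this step we can reach real-time performance of 60+ fps, but with it we only get around 30 fps. Is there any way to optimize this? Is it caused by the GPU's memory bus width? Our test environment is a 4090.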
-
Hey, it would be great to see a video showing where this library is at, so that new users and contributors can see what they can make with it.
-
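(1) I have also been reproducing the training code for this paper recently. At first I chose the reference image to be a frame some distance away from the target/gt image, e.g. 5-25 frames away. The reasoning is that in many videos the face moves constantly and with fairly large amplitude, for example hosts selling products in Douyin short videos. Today I saw the train_codes branch for the first time, and its code always picks an arbitrary frame more than 5 frames away as the reference image. Why was this chosen? Does the head motion in this kind of video…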
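The two sampling strategies compared in this question can be sketched as follows. This is a minimal illustration only, not the code from the train_codes branch: `sample_reference_index`, `min_gap`, and `max_gap` are hypothetical names, and treating "5 frames away" as an inclusive bound is an assumption.

```python
import random

def sample_reference_index(target_idx, num_frames, min_gap=5, max_gap=None, rng=random):
    """Pick a reference frame index at least `min_gap` frames from the target.

    Hypothetical helper. With max_gap=None any sufficiently distant frame may
    be chosen; with max_gap=25 the choice is limited to a 5-25 frame window.
    """
    candidates = [
        i for i in range(num_frames)
        if abs(i - target_idx) >= min_gap
        and (max_gap is None or abs(i - target_idx) <= max_gap)
    ]
    if not candidates:
        raise ValueError("no frame satisfies the gap constraints")
    return rng.choice(candidates)
```

Calling `sample_reference_index(t, n)` corresponds to the unbounded choice, while `sample_reference_index(t, n, max_gap=25)` corresponds to the windowed one, so the two can be swapped in a data loader to compare their effect.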
-
CentOS 8, Python 3.9.6
pip install --no-cache-dir -U openmim
mim install mmengine
mim install "mmcv>=2.0.1"
mim install "mmdet>=3.1.0"
mim install "mmpose>=1.1.0"
-
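Video is smooth, but audio stutters:
The sound plays smoothly for a stretch, stutters, plays smoothly again, stutters again, and so on. The fps is 40+.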
-
(musetalk) H:\MuseTalk\MuseTalk> python -m scripts.inference --inference_config configs/inference/test.yaml
please download ffmpeg-static and export to FFMPEG_PATH.
For example: export FFMPEG_PATH=/…
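The prompt above looks like a Windows command prompt, where `export` is not available; `set FFMPEG_PATH=<directory>` is the cmd equivalent. A small check like the one below can confirm whether an ffmpeg executable is visible before running inference. It is only a sketch: `find_ffmpeg` is a hypothetical helper, not part of MuseTalk, and it assumes `FFMPEG_PATH` names the directory that contains the executable.

```python
import os
import shutil

def find_ffmpeg(env_var="FFMPEG_PATH"):
    """Return the path to an ffmpeg executable, or None if none is found."""
    directory = os.environ.get(env_var)
    if directory:
        # Look in the directory named by the environment variable first.
        found = shutil.which("ffmpeg", path=directory)
        if found:
            return found
    # Fall back to the regular PATH lookup.
    return shutil.which("ffmpeg")

if __name__ == "__main__":
    print(find_ffmpeg() or "ffmpeg not found; set FFMPEG_PATH or add ffmpeg to PATH")
```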
-
*********************************
creating avator: 12345
*********************************
preparing data materials ... ...
extracting landmarks...
reading images...
100%|█████████████████████…
-
How should this be configured?
-
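The audio provided for inference has silent segments in the middle and at the end, but the mouth keeps moving throughout. How can this be improved?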
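One possible approach is to detect silent stretches in the audio and treat the matching video frames differently, for example by keeping the original frame instead of a generated one. The sketch below only covers the detection step and is not a MuseTalk feature: `silent_video_frames` is a hypothetical helper, and the 25 fps frame rate and -40 dB threshold are assumed values that would need tuning.

```python
import math

def silent_video_frames(samples, sample_rate, fps=25, threshold_db=-40.0):
    """Return one boolean per video frame: True where the audio is silent.

    `samples` is a sequence of mono PCM floats in [-1.0, 1.0].
    """
    samples_per_frame = sample_rate // fps
    flags = []
    for start in range(0, len(samples), samples_per_frame):
        chunk = samples[start:start + samples_per_frame]
        # Root-mean-square level of the audio that belongs to this video frame.
        rms = math.sqrt(sum(s * s for s in chunk) / len(chunk))
        level_db = 20.0 * math.log10(rms) if rms > 0 else float("-inf")
        flags.append(level_db < threshold_db)
    return flags
```

Frames flagged `True` could then be passed through unchanged or blended toward a closed-mouth pose. Smoothing the flags over a few neighboring frames may help avoid flicker at the boundaries.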