TMElyralab / MuseTalk

MuseTalk: Real-Time High Quality Lip Synchronization with Latent Space Inpainting

about bbox shift #96

Closed TigerHH6866 closed 4 months ago

TigerHH6866 commented 4 months ago

When the source video shows a person talking normally and the script reports "Manually adjust range -x~y", what bbox_shift value should I use? How do I find the right value? Any suggestions?

gobigrassland commented 4 months ago

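@czk32611 On this question: for the released model, was the lower half of the face masked in a fixed way during training, i.e. mask[128:256, :] = 0? Or was the upper boundary of the mask chosen at random during training, within the reasonable range given by the computed adjustable range?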

czk32611 commented 4 months ago

Currently, the model is trained with a fixed mask over the lower half of the face, i.e. mask[128:256, :] = 0.
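
For readers who want to see what that fixed mask does to a face crop, here is a minimal NumPy sketch. Only the expression mask[128:256, :] = 0 comes from the reply above. The 256x256 crop size is inferred from that slice, and the function name mask_lower_half is hypothetical, not part of the MuseTalk code.

```python
import numpy as np


def mask_lower_half(face: np.ndarray) -> np.ndarray:
    """Return a copy of a 256x256 face crop with rows 128..255 set to zero.

    Hypothetical helper for illustration; accepts (256, 256) or (256, 256, C).
    """
    if face.shape[:2] != (256, 256):
        raise ValueError("expected a 256x256 face crop")
    mask = np.ones((256, 256), dtype=face.dtype)
    mask[128:256, :] = 0  # fixed lower-half mask, as stated in the reply
    if face.ndim == 3:
        mask = mask[..., None]  # broadcast the mask over the channel axis
    return face * mask


if __name__ == "__main__":
    crop = np.full((256, 256, 3), 7, dtype=np.uint8)
    masked = mask_lower_half(crop)
    print(masked[:128].min(), masked[128:].max())
```

Because the upper boundary of the mask is fixed at row 128 in training, moving the face box at inference time changes which part of the face falls under the masked region.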

czk32611 commented 4 months ago

When the source video shows a person talking normally and the script reports "Manually adjust range -x~y", what bbox_shift value should I use? How do I find the right value? Any suggestions?

You can find more explanation of bbox_shift here
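
One practical way to pick a value, sketched here as an assumption rather than the maintainers' procedure, is to try a few values inside the reported range and compare the results by eye. The helper below only enumerates candidates. The name candidate_shifts is hypothetical, the bounds are whatever -x and y the script printed for your video, and the sketch assumes they are integers.

```python
def candidate_shifts(lower: int, upper: int, step: int = 1) -> list[int]:
    """List bbox_shift values to try inside [lower, upper].

    Values are multiples of `step`, ordered by distance from 0 so that
    the default (0) and small adjustments are tried first.
    """
    if lower > upper:
        raise ValueError("lower must not exceed upper")
    if step < 1:
        raise ValueError("step must be a positive integer")
    values = [v for v in range(lower, upper + 1) if v % step == 0]
    return sorted(values, key=lambda v: (abs(v), v))


if __name__ == "__main__":
    # Example with a made-up reported range of -9~9, sampled every 3 units.
    print(candidate_shifts(-9, 9, step=3))
```

Each candidate would then be used as the bbox_shift setting for a separate inference run on the same video, keeping the value whose mouth motion looks most natural.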