yangdaowu opened 1 year ago
Issue #14 may help you run inference on the test set.
I changed the path to the test-set video path, but it still doesn't work.
https://github.com/dc3ea9f/vico_challenge_baseline/issues/14#issuecomment-1282043625 You can generate a fake video by duplicating the image to len(mfcc) frames, and then use that video to create vox_lmdb for visualization. For more details, please refer to PIRender.
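For reference, here is a minimal sketch of the frame-duplication idea, assuming the MFCC features are saved as a .npy array of shape (T, n_mfcc); the file paths are hypothetical placeholders, so adjust them to the actual test-set layout.

```python
import imageio
import numpy as np

# Hypothetical paths -- replace with the actual test-set files.
mfcc = np.load("test_set/mfcc/sample_001.npy")            # assumed shape: (T, n_mfcc)
frame = imageio.imread("test_set/first_frames/sample_001.png")

# Duplicate the single reference frame T times so the fake video has
# exactly as many frames as the MFCC feature sequence.
with imageio.get_writer("fake_videos/sample_001.mp4", fps=30) as writer:
    for _ in range(len(mfcc)):
        writer.append_data(frame)
```

The resulting fake video can then be fed to the vox_lmdb creation step in place of a real test-set video.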
First, you should generate a fake video; then make predictions and render them.
Sorry to bother you again, but how do I generate the fake videos?
Our videos are 30 fps, and there is a mapping between the audio length and its MFCC feature length. Get the MFCC feature length first; then you can simply duplicate the first frame to the length of the MFCC sequence to generate the fake video.
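As a rough illustration of that mapping, here is a sketch that computes the MFCC feature length with librosa, assuming 16 kHz audio and a hop length of one video frame (sr // 30); the repo's own feature-extraction script defines the real parameters, so the exact mapping may differ.

```python
import librosa

# Assumed settings: 16 kHz audio, one MFCC frame per 30 fps video frame.
# The repo's own extraction script defines the actual window/hop values.
audio, sr = librosa.load("test_set/audio/sample_001.wav", sr=16000)
mfcc = librosa.feature.mfcc(y=audio, sr=sr, n_mfcc=13, hop_length=sr // 30)

num_frames = mfcc.shape[1]  # number of MFCC frames == frames the fake video needs
print(f"duplicate the first frame {num_frames} times at 30 fps")
```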
If possible, could you provide the corresponding steps? Thank you.
Hello, I don't know how to do this step. How should I generate the test-set videos from the speaker video, the first frame, and the reference image?
I used the following code, but it only generates videos for the training set.