We convert the video to 25 fps, so 64 frames are roughly 2.55 seconds.
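For anyone checking the arithmetic, here is a minimal sketch of the frame-count to duration conversion. The helper names `clip_duration_seconds` and `frames_for_duration` are made up for illustration and are not part of the repository; only the 25 fps rate and the 64-frame count come from the reply above.

```python
def clip_duration_seconds(num_frames: int, fps: float) -> float:
    """Duration in seconds covered by num_frames at a fixed frame rate."""
    return num_frames / fps


def frames_for_duration(duration_s: float, fps: float) -> int:
    """Number of frames (rounded to nearest) that fit in duration_s at fps."""
    return round(duration_s * fps)


# Values from the thread: video resampled to 25 fps, 64 frames per clip.
duration = clip_duration_seconds(64, 25)
frames = frames_for_duration(duration, 25)
print(f"{duration:.2f} s, {frames} frames")
```

The exact value is 64 / 25 = 2.56 s, which is the "roughly 2.55 seconds" quoted above.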
Thanks for your quick reply!
So "2.55s" is chosen to give 64 frames, which matches the face embedding dimension.
Am I right?
Thanks for your great work. I was curious about the num_frames parameter. Why do we only get 64 frames of mouth ROI for 2.55 seconds? Are the last 10 frames discarded? I can't figure it out. Thanks again.