aishoot opened this issue 2 years ago
You can try processing the dataset as a sequence of .png frames (use --format .png when running load_videos.py), so that during training you only need to read two PNG images instead of decoding the entire video, which reduces memory usage.
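To illustrate the idea (not the repo's actual dataloader code), here is a minimal sketch: once each video is stored as a folder of per-frame PNGs, a training sample only needs the paths of two randomly chosen frames, so nothing but those two images ever has to be loaded. The function name sample_frame_pair and the flat-folder layout are assumptions for the example.

```python
import os
import random
import tempfile

def sample_frame_pair(video_dir):
    """Pick two distinct frames from a folder of extracted .png frames.

    Hypothetical helper: with the dataset stored as per-frame PNGs,
    a source/driving pair costs two small file reads instead of
    decoding a whole video.
    """
    frames = sorted(f for f in os.listdir(video_dir) if f.endswith(".png"))
    i, j = random.sample(range(len(frames)), 2)  # two distinct indices
    return os.path.join(video_dir, frames[i]), os.path.join(video_dir, frames[j])

# Demo with dummy (empty) frame files standing in for real PNGs.
d = tempfile.mkdtemp()
for i in range(8):
    open(os.path.join(d, f"{i:04d}.png"), "wb").close()

src, drv = sample_frame_pair(d)
```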
Thanks, it really works. Another question: if I want to imitate the facial expression, head movement, and body movement at the same time, any good ideas?
According to experiments on TED-talks, facial expression transfer is not very good: the face occupies too small a region of the image, and facial motions are too small compared to body motions, which makes them difficult for the model to learn.
It may be possible to handle this with multiple models working together. For example: apply the body motion transfer model to the whole image; at the same time, run face detection on the image and apply a facial motion transfer model to the detected face region.
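The two-model idea above can be sketched as follows. This is only an illustrative outline, not code from the repo: detect_face, body_model, and face_model are placeholder stubs standing in for a real face detector and the two trained motion-transfer models; images are plain 2D lists to keep the example self-contained.

```python
# Hypothetical sketch: combine a whole-image body-motion model with a
# face-region model. All three "models" below are placeholders.

def detect_face(img):
    # Placeholder face detector: returns a fixed box (top, left, h, w).
    return (1, 1, 2, 2)

def body_model(src, drv):
    # Placeholder body-motion transfer: just copies the driving frame.
    return [row[:] for row in drv]

def face_model(src_face, drv_face):
    # Placeholder face-motion transfer: averages the two crops.
    return [[(a + b) / 2 for a, b in zip(r1, r2)]
            for r1, r2 in zip(src_face, drv_face)]

def crop(img, box):
    t, l, h, w = box
    return [row[l:l + w] for row in img[t:t + h]]

def transfer(src, drv):
    out = body_model(src, drv)          # 1. body transfer on the whole image
    box = detect_face(drv)              # 2. locate the face region
    t, l, h, w = box
    face = face_model(crop(src, box), crop(drv, box))
    for i in range(h):                  # 3. paste the refined face back in
        out[t + i][l:l + w] = face[i]
    return out
```

In practice the paste step would also need blending at the box boundary to avoid a visible seam, but the overall structure is the same.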
OK, thanks a lot. Got it.
Could you add this? A script to combine the vox and ted datasets.
Thanks for your nice work! I ran into a problem while training on the TED dataset (two 32 GB GPUs).
Thanks for your reply!