shrubb / latent-pose-reenactment

The authors' implementation of the "Neural Head Reenactment with Latent Pose Descriptors" (CVPR 2020) paper.
https://shrubb.github.io/research/latent-pose-reenactment/
Apache License 2.0

only one image for finetune #12

Closed visonpon closed 3 years ago

visonpon commented 3 years ago

Hi, @shrubb, thanks for your great work. I trained it successfully, but I have some questions:

First, as mentioned in the title, when fine-tuning on only one image, the identity and pose extractors use the same image, and the results look normal. Since fine-tuning on one image takes very little time, does this mean the meta-learning process has effectively achieved many-to-many pose reenactment?

Second, I also tried adding more images for fine-tuning, since a single image doesn't seem to guarantee the identity and resolution, but more images result in less accurate expressions, so there seems to be a trade-off between identity and pose. I wonder if there are other methods to solve this problem?

Hope you can give some advice, thanks~

shrubb commented 3 years ago

Hi, thanks. I couldn't understand the first question, could you please correct the typos and reformulate it? What do you mean by many-to-many?

To the second question: there's a trade-off indeed, but it's slightly different. I believe you're getting worse expressions not because you used more images but because you took more training iterations. So be extra careful with those train.py parameters: for example, with batch size 1, --num_epochs=100 will give you 100 training iterations with a dataset of one image. However, if your dataset is 30 images, you'll accidentally get 3000 iterations, which will of course overfit and destroy the facial expressions (mimics). Also check out the main README.md; it touches on this matter too.
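To make that arithmetic explicit, here is a small sketch; the `total_iterations` helper is illustrative and not part of `train.py`, but the formula follows directly from the behaviour described above:

```python
# Iteration-count arithmetic from the comment above: one epoch is
# dataset_size / batch_size iterations, so the total grows linearly
# with dataset size if num_epochs is left unchanged.

def total_iterations(num_epochs: int, dataset_size: int, batch_size: int = 1) -> int:
    """Illustrative helper (not part of train.py): total training iterations."""
    return num_epochs * (dataset_size // batch_size)

print(total_iterations(num_epochs=100, dataset_size=1))   # 100  -> as intended
print(total_iterations(num_epochs=100, dataset_size=30))  # 3000 -> overfits, ruins expressions

# To keep roughly the same iteration count with 30 images,
# scale num_epochs down, e.g. --num_epochs=3:
print(total_iterations(num_epochs=3, dataset_size=30))    # 90
```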

visonpon commented 3 years ago

Sorry for the unclear wording. What I mean is: when the model is fine-tuned on a certain person x (e.g. A), the pose can come from anyone (y = A, B, C, D, ...), which would be many-to-one reenactment. But since fine-tuning on only one image also works and takes very little time, then, thanks to the meta-learning process, x can also be changed by saving each x's fine-tuned model weights; that would be many-to-many reenactment.

shrubb commented 3 years ago

Looks correct. Yes, you can reenact (drive) anyone by anyone. But first you'll have to fine-tune the meta-model for each new x.
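For concreteness, a minimal sketch of that workflow; the function names here (`fine_tune`, `reenact`) are hypothetical stand-ins, not this repository's actual API:

```python
# Hypothetical sketch of the many-to-many workflow confirmed above:
# fine-tune the meta-model once per identity x, keep each x's weights,
# then drive any saved x with poses from any y.

from typing import Dict, List

def fine_tune(meta_ckpt: str, identity_images: List[str]) -> str:
    """Stand-in for a short train.py fine-tuning run; returns a checkpoint path."""
    return f"checkpoints/finetuned_{identity_images[0]}.pth"

def reenact(identity_ckpt: str, pose_source: str) -> None:
    """Stand-in for the driving/inference step."""
    print(f"driving {identity_ckpt} with poses from {pose_source}")

identities = {"A": ["A.jpg"], "B": ["B.jpg"]}  # one image per identity is enough
drivers = ["C.mp4", "D.mp4"]

# One-time cost per new identity: a short fine-tuning run.
ckpts: Dict[str, str] = {x: fine_tune("meta_model.pth", imgs)
                         for x, imgs in identities.items()}

# After that, anyone can be reenacted (driven) by anyone.
for x, ckpt in ckpts.items():
    for y in drivers:
        reenact(ckpt, pose_source=y)
```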