Question about inference speed

YuDeng / Portrait-4D

Portrait4D: Learning One-Shot 4D Head Avatar Synthesis using Synthetic Data (CVPR 24); Portrait4D-v2: Pseudo Multi-View Data Creates Better 4D Head Synthesizer (ECCV 2024)

MIT License

218 stars 9 forks

Question about inference speed #12

Open Pixie8888 opened 2 weeks ago

Pixie8888 commented 2 weeks ago

Dear authors,

You mentioned in the paper that the inference speed is 10 fps. Does that figure include the time for the full BFM-to-FLAME transformation process? On my machine, it takes more than 1 minute to generate a result (including the time for landmark detection, 3D face reconstruction, cropping, and BFM-to-FLAME parameter transformation).
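To see which stage dominates the one-minute total, the stages can be timed separately. The following is a minimal sketch, not part of the Portrait-4D code: `StageTimer` and the stage names are hypothetical, and the `time.sleep` calls are placeholders for the real preprocessing and synthesis calls.

```python
import time
from contextlib import contextmanager


class StageTimer:
    """Accumulates wall-clock time per named stage (hypothetical helper)."""

    def __init__(self):
        self.totals = {}

    @contextmanager
    def stage(self, name):
        # Time the body of the `with` block and add it to the stage total.
        start = time.perf_counter()
        try:
            yield
        finally:
            elapsed = time.perf_counter() - start
            self.totals[name] = self.totals.get(name, 0.0) + elapsed

    def fps(self, name, n_frames):
        """Frames per second for one stage, given the frames it processed."""
        return n_frames / self.totals[name]

    def report(self):
        return [f"{name}: {seconds:.3f} s" for name, seconds in self.totals.items()]


if __name__ == "__main__":
    timer = StageTimer()
    # Placeholder for landmark detection, 3D reconstruction, cropping,
    # and BFM-to-FLAME transformation.
    with timer.stage("preprocessing"):
        time.sleep(0.05)
    # Placeholder for the forward pass of the synthesizer on one frame.
    with timer.stage("synthesis"):
        time.sleep(0.01)
    print("\n".join(timer.report()))
```

When timing GPU code, the device should be synchronized before reading the clock; otherwise asynchronous kernel launches make the synthesis stage look faster than it is.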

I also have a question about training Portrait4D-v2. Does the 3D synthesizer generate the multi-view driving images on the fly during training, as shown in Fig. 2? Could you please point me to the relevant code?

YuDeng commented 1 week ago

Hi, the reported inference time does not include the FLAME optimization.

For training Portrait4D-v2, the multi-view images are generated online by the 3D synthesizer. The relevant code is here: https://github.com/YuDeng/Portrait-4D/blob/da5eec6fa3dfca4d7c8f08daa46326f2de8244db/portrait4d/training/loss/loss_recon_v2.py#L233; https://github.com/YuDeng/Portrait-4D/blob/da5eec6fa3dfca4d7c8f08daa46326f2de8244db/portrait4d/training/loss/loss_recon_v2.py#L336.

Pixie8888 commented 1 week ago

Thank you for your reply! I want to try training Portrait4D-v2 on the toy dataset on a 24 GB GPU. I set batch = 1 in portrait4d-v2-vfhq512-toy.yaml, but training fails with an error.

How can I train the model with a batch size of 1?

---------------------------- update -------------------------------

I changed mbstd_group = 1 in default.yaml and batch = 1 in portrait4d_v2_vfhq512_toy.yaml to train on a 24 GB GPU. Will changing the batch size to 1 affect the model's performance? In particular, what is mbstd_group used for?

YuDeng commented 1 week ago

mbstd_group is a standard operation inherited from StyleGAN's discriminator that computes statistics over real and fake data. A small mbstd_group may slightly affect the performance of the GAN loss.
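To illustrate why the group size matters, here is a simplified sketch in plain Python. It assumes that mbstd_group is the group size of a StyleGAN-style minibatch standard deviation layer; the function `minibatch_std`, the flat per-sample feature vectors, and the strided grouping are illustrative choices, not the repository's implementation.

```python
import math


def minibatch_std(features, group_size, eps=1e-8):
    """Return one scalar per sample: the mean std-dev of its group's features.

    features: list of N per-sample feature vectors of equal length.
    group_size: number of samples pooled together; must divide N.
    """
    n = len(features)
    g = min(group_size, n)
    if n % g != 0:
        raise ValueError("batch size must be divisible by group size")
    m = n // g  # number of groups
    dim = len(features[0])
    stats = [0.0] * n
    for group in range(m):
        # Strided grouping: samples group, group + m, group + 2m, ...
        members = [group + k * m for k in range(g)]
        total = 0.0
        for d in range(dim):
            values = [features[i][d] for i in members]
            mean = sum(values) / g
            var = sum((v - mean) ** 2 for v in values) / g
            total += math.sqrt(var + eps)
        # Every member of the group receives the same pooled statistic.
        for i in members:
            stats[i] = total / dim
    return stats


if __name__ == "__main__":
    batch = [[0.0, 0.0], [2.0, 4.0], [1.0, 1.0], [1.0, 3.0]]
    print(minibatch_std(batch, group_size=2))  # varies with the group's diversity
    print(minibatch_std(batch, group_size=1))  # the same constant for every sample
```

Under this assumption, a group of one sample has zero variance, so the extra statistic collapses to a constant (the square root of eps) and gives the discriminator no information about the diversity of the batch. That is consistent with the reply above that a small mbstd_group may slightly affect the GAN loss.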

Pixie8888 commented 1 week ago

Thank you for your reply. I have another question regarding evaluation. Could you please share the code used to compute the evaluation metrics in Table 1, and the IDs of the videos used for evaluation?

YuDeng commented 6 days ago

Sorry, the evaluation code is not available. I've switched jobs recently and am unable to access the original code.