Closed: luchaoqi closed this issue 1 year ago
We followed the PTI code and saved our tuned generator params with torch.save(G.state_dict(), PATH), but we had a mistake in the code at single_id_coach.py line 117 (saving the generator checkpoint: it saved a pickle (x) where it should save a state_dict (o)). Sorry for the confusion about the code. If you want to use gen_videos.py directly, I recommend saving the network params with torch.save(G.state_dict(), PATH) and loading them with G.load_state_dict(torch.load(PATH)). Alternatively, you can generate the video right after the optimization finishes with this code.
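A minimal sketch of that save/load round trip, using a tiny stand-in module (`TinyG` and the path are illustrative, not the actual EG3D generator):

```python
import torch

# Tiny stand-in for the tuned EG3D generator (illustrative only).
class TinyG(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = torch.nn.Linear(4, 4)

PATH = "tuned_G.pt"

G = TinyG()
torch.save(G.state_dict(), PATH)  # save only the parameters

# Reloading: torch.load() returns the state dict, which is then
# passed to the module's load_state_dict() method.
G2 = TinyG()
G2.load_state_dict(torch.load(PATH))
```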
Thanks! I find that the results for the demo images are stable and good, but that is not the case for custom in-the-wild images. I tested with my own dataset following the EG3D preprocessing code, but the results are not even deterministic across runs. Is there any way to fix the results, e.g. with seed=0? Sorry for this basic question, as I am new to the PTI area.
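For reference, fixing all RNG sources at the start of a run makes repeated inversions start from the same random state. This is a generic PyTorch seeding recipe, not code from this repo:

```python
import random
import numpy as np
import torch

def set_seed(seed: int = 0) -> None:
    # Fix every RNG source used during optimization.
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)  # no-op on CPU-only machines
    # Trade some speed for reproducible cuDNN kernels.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False

set_seed(0)
```

Note that seeding removes run-to-run randomness but will not fix results that vary because of bad camera estimates, as discussed below in the thread.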
Can I take a look at your results?
Attached (see especially images 3 and 5). I suspect it might have something to do with the PTI settings in hyperparameters.py?
first run:
results.zip
second run:
results_2.zip
I think both cases were caused by an incorrect camera viewpoint, not by the generator hyperparameters at the PTI stage. In some cases the extrinsics collapse (especially when the input image is blurry), and the network tries to fit the image rendered from an incorrect viewpoint during PTI.
You can try regulating the camera learning-rate hyperparameter, or setting visualize_opt_process=True in global_config.py to monitor the optimization process.
I have a follow-up question regarding your implementation here: https://github.com/KU-CVLAB/3DGAN-Inversion/blob/3cfebf9abc0733aae5c5e512f33ce18d016e3e48/gen_videos.py#L84-L128
It seems that you feed `ws` directly into `G` without going through the mapping network during inversion. I tried the same approach but noticed some artifacts like those shown here. This problem has also been discussed in the original EG3D repo here.
Did you run into similar problems, and how did you solve them?
Our implementation optimizes `ws` directly, so we did not pay much attention to the mapping network. We tried to 'initialize' `ws` by feeding the ground-truth camera parameters of the input image to the mapping network, but I think the result was similar.
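That initialization can be sketched as follows; `G`, its `mapping(z, c)` signature, and the tensor dimensions are assumptions based on an EG3D-style API, not code from this repo:

```python
import torch

def init_ws(G, cam, z_dim=512, device="cpu"):
    # Sample a random latent and push it through the mapping network,
    # conditioned on the ground-truth camera params of the input image.
    z = torch.randn(1, z_dim, device=device)
    with torch.no_grad():
        ws = G.mapping(z, cam)  # expected shape: (1, num_ws, w_dim)
    # Detach and make ws a leaf tensor so it can be optimized directly.
    return ws.detach().clone().requires_grad_(True)
```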
Hi, I would like to know how to generate a video given the output of PTI. I saw some pipelines here: https://github.com/NVlabs/eg3d/issues/66, and it seems PTI outputs only the `.pt` network (here), while `gen_videos.py` requires the `.pkl` network format (here). I am wondering how you saved the network (`.pkl`) and used it in `gen_videos.py`. Thanks!