Open lelechen63 opened 2 years ago
Yes, in our exploration we also noticed that the StyleNeRF checkpoint is much harder to invert than the vanilla 2D StyleGAN. Some examples work OK, some do not. I haven't dug deep enough to explain why that is the case. I also expect it would get much better if we first train an encoder to find an approximate camera and w. More discussion is welcome.
Thanks for the quick reply. Are you using https://github.com/lelechen63/StyleNeRF/blob/main/apps/train_encoder.py to train the encoder and https://github.com/lelechen63/StyleNeRF/blob/main/apps/inversion.py to do the inversion? The inversion result you showed in the video actually looks great.
It would be tricky if w cannot be inverted. That would mean the mapping between the output image and the latent space is not one-to-one, and StyleNeRF may not be directly usable as a pretrained model for other tasks...
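For reference, "fitting w" in this thread means freezing the generator and optimizing the latent by gradient descent on a reconstruction loss. Below is a minimal self-contained sketch of that loop, using a toy linear generator in place of StyleNeRF so it actually runs; the real projector.py additionally uses LPIPS, noise regularization, and camera parameters, so treat this only as an illustration of the idea.

```python
import torch

torch.manual_seed(0)

class ToyGenerator(torch.nn.Module):
    """Toy stand-in for the generator: a linear decoder from w to 'pixels'."""
    def __init__(self, w_dim=16, img_dim=64):
        super().__init__()
        self.decode = torch.nn.Linear(w_dim, img_dim)

    def forward(self, w):
        return torch.tanh(self.decode(w))

def invert(G, target, w_dim=16, num_steps=300, lr=0.05):
    """Optimize w so that G(w) matches target; return (w, final loss)."""
    w = torch.zeros(1, w_dim, requires_grad=True)  # start from a neutral latent
    opt = torch.optim.Adam([w], lr=lr)
    for _ in range(num_steps):
        loss = torch.nn.functional.mse_loss(G(w), target)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return w.detach(), loss.item()

# Freeze the generator; only the latent is optimized.
G = ToyGenerator().eval()
for p in G.parameters():
    p.requires_grad_(False)

# Use an image the generator can actually represent as the target.
target = G(torch.randn(1, 16)).detach()
w_hat, final_loss = invert(G, target)
print(final_loss)
```

With a realizable target and a smooth toy generator the loss drops to near zero; the thread's observation is that with the actual StyleNeRF checkpoint this same loop often gets stuck for some images.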
I mean we haven't looked into the inversion heavily. In theory it should work, since we did not change anything in the style part compared to StyleGAN2. I just found it a bit more difficult, and not all images can be inverted perfectly. Yes, I used those scripts to train, although that part of the code may need some changes to adapt to the current version of the codebase; I did some reorganization.
Yes, theoretically we should be able to fit w/z from the image. I followed train_encoder.py to train the encoder and used inversion.py to test. The training result looks reasonable (https://drive.google.com/file/d/1pCUzDhRCNbKFzfu6RiHULHXtG1Q2U6yj/view?usp=sharing). But when I apply the encoder and the NeRF to inversion.py to fit w, this is what I get (https://drive.google.com/file/d/1IRbBMMk34OlLHLZIvuW5dADWHFcQIF8-/view?usp=sharing, https://drive.google.com/file/d/1vm_CPysKQ2T-N_RFLDf1NLaHiAPMUWHx/view?usp=sharing). Did you encounter a similar problem? Thanks!
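The encoder-based route discussed here amounts to training a network that maps generated images back to their latents (and camera parameters), then using its prediction as the starting point for per-image fitting. A toy sketch of that training loop follows; every shape and module is a stand-in for illustration, not the actual train_encoder.py architecture.

```python
import torch

torch.manual_seed(0)
W_DIM, IMG_DIM, CAM_DIM = 8, 32, 3  # toy sizes, not StyleNeRF's

# Frozen toy "generator": (w, camera) -> image.
G = torch.nn.Linear(W_DIM + CAM_DIM, IMG_DIM)
for p in G.parameters():
    p.requires_grad_(False)

# Toy encoder: image -> (w, camera).
E = torch.nn.Linear(IMG_DIM, W_DIM + CAM_DIM)
opt = torch.optim.Adam(E.parameters(), lr=1e-2)

losses = []
for step in range(200):
    # Sample latents and cameras, render with G, teach E to recover them.
    w = torch.randn(16, W_DIM)
    cam = torch.randn(16, CAM_DIM)
    latent = torch.cat([w, cam], dim=1)
    img = G(latent)
    loss = torch.nn.functional.mse_loss(E(img), latent)
    opt.zero_grad()
    loss.backward()
    opt.step()
    losses.append(loss.item())

print(losses[0], losses[-1])
```

The supervision signal is free here because the generator produces (latent, image) pairs on demand; at test time the encoder's output is only an initialization, which the optimization loop then refines.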
Hi! Thank you for your great contributions!
I've been studying the inversion of StyleNeRF. Are the results in the paper based on inversion.py (or with the help of a pre-trained encoder)? What should be cleaned up in inversion.py?
It seems like you have already successfully trained the encoder (while I am not as lucky or skilled as you). When I run train_encoder.py, I hit several errors; one of them happens at "gen_img = G.get_final_output(z=None, c=None, styles=w_samples, camera_matrices=camera_matrices)":

File "apps/train_encoder.py", line 188, in main
    gen_img = G.get_final_output(z=None, c=None, styles=w_samples, camera_matrices=camera_matrices)
File "", line 714, in get_final_output
File "", line 707, in forward
File "/nas/users/xiangjun/anaconda3/envs/py38-1/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
File "", line 251, in forward
File "/nas/users/xiangjun/git/StyleNeRF/./torch_utils/misc.py", line 85, in assert_shape
    if tensor.ndim != len(ref_shape):
AttributeError: 'NoneType' object has no attribute 'ndim'

Did you meet similar errors before? If not, could you please offer some advice on how to run train_encoder.py (or train_encoder_resnet.py) successfully? Many thanks!
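For what it's worth, that AttributeError means assert_shape in torch_utils/misc.py received None where a tensor was expected, i.e. some input on that forward path (plausibly z, a style tensor, or a camera tensor, given the `z=None, c=None` call) was never constructed in the current code version. A tiny self-contained reproduction of the failure mode, plus a guard that turns it into a readable error; the assert_shape here is a simplified stand-in, not the library's full implementation:

```python
import torch

def assert_shape(tensor, ref_shape):
    """Simplified stand-in for torch_utils/misc.py's assert_shape."""
    if tensor is None:
        # The original has no such guard, so a None input surfaces as
        # "AttributeError: 'NoneType' object has no attribute 'ndim'".
        raise ValueError(
            f"expected a tensor of shape {ref_shape}, got None "
            "(was this input constructed for this code version?)")
    if tensor.ndim != len(ref_shape):
        raise AssertionError(
            f"wrong number of dims: {tensor.ndim} != {len(ref_shape)}")

assert_shape(torch.zeros(2, 3), [2, 3])   # passes silently
try:
    assert_shape(None, [2, 3])            # the situation in the traceback
except ValueError as e:
    print("caught:", e)
```

Tracking which argument arrives as None at the failing call (e.g. by printing each input just before the G.get_final_output call) should pinpoint which part of train_encoder.py needs adapting to the reorganized codebase.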
You can try the code here: https://github.com/lelechen63/StyleNeRF/tree/main/apps
Hello, I used your code but I still get the same error as above. Could you tell me how to fix it in detail? Thank you!
Thanks for the great work!
I trained the model myself (result image: https://drive.google.com/file/d/15PWiwtTUYJzB86CsxeO7g_0xLpopNmFc/view?usp=sharing). When I use your inversion.py/projector.py in apps to fit w, w does not converge. Video result of projector.py (https://drive.google.com/file/d/1vFLrITdVZ7PuOZDzysZmtxMhZdvT20m2/view?usp=sharing) and video result of inversion.py (https://drive.google.com/file/d/18WoIEIffdA9dG8sgxMGc3Hhkk0SBAoOD/view?usp=sharing). PS: I did not use an encoder network.
Could you please take a look at it?
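One hypothesis worth checking when w fails to converge on a NeRF-based generator (and no encoder provides an initial pose): the camera is held fixed at a wrong value, so no w can reproduce the target view. Jointly optimizing a few camera parameters alongside w is a common workaround. A toy sketch with a linear stand-in generator follows; the real StyleNeRF camera model is a full matrix, not the 2-vector used here for brevity.

```python
import torch

torch.manual_seed(0)

# Frozen toy "generator": (w, camera) -> image.
G = torch.nn.Linear(8 + 2, 24)
for p in G.parameters():
    p.requires_grad_(False)

# A target rendered from an unknown latent and an unknown camera pose.
true_w, true_cam = torch.randn(1, 8), torch.tensor([[0.3, -0.2]])
target = G(torch.cat([true_w, true_cam], dim=1)).detach()

# Optimize w AND the camera jointly, starting from a wrong (zero) pose.
w = torch.zeros(1, 8, requires_grad=True)
cam = torch.zeros(1, 2, requires_grad=True)
opt = torch.optim.Adam([w, cam], lr=0.05)
for _ in range(400):
    loss = torch.nn.functional.mse_loss(
        G(torch.cat([w, cam], dim=1)), target)
    opt.zero_grad()
    loss.backward()
    opt.step()
print(loss.item())
```

If freezing cam in this sketch, the loss plateaus at whatever error the wrong pose forces; letting the optimizer move both variables removes that floor, which is the same reason an encoder that predicts an approximate camera helps the real inversion.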