yxuhan / AdaMPI

[SIGGRAPH 2022] Single-View View Synthesis in the Wild with Learned Adaptive Multiplane Images
209 stars 24 forks

It seems that the model does not work for the provided input pairs of images and depths. #11

Closed JinWonjoon closed 1 year ago

JinWonjoon commented 1 year ago

Thanks for releasing your nice work of novel view synthesis from a single input image!

I hope you can help me perform novel view synthesis correctly with your code.

The problem I found is that it synthesizes an awkward set of images (video) even for the provided inputs.

I experimented with the provided images using the command below.

python gen_3dphoto.py --img_path images/0810.png --save_path ./results/0810.mp4

This seems to work fine.

0810_ver1

However, the result with the command below does not seem to work and looks very awkward.

python gen_3dphoto.py --img_path images/0801.png --save_path ./results/0801.mp4

0801_ver1

I suspected that the width and height caused the problem, so I set them to 512 x 384 (H x W), which roughly matches the aspect ratio of the image (0801.png).

python gen_3dphoto.py --img_path images/0801.png --save_path ./results/0801.mp4 --width 512 --height 384

Unfortunately, it does not seem to solve the problem, as shown below.

0801_ver3

I would really appreciate it if you could tell me the reason for the awkward results.


I have one more question. When I change the rendering path from the xz-plane to the xy-plane (the z-axis is the viewing direction), the results show slightly awkward images, such as stretched pixels on the boundary of the object.

# in utils/utils.py
# from
swing_path_list = gen_swing_path()
# change to
swing_path_list = gen_swing_path(r_x=0.3, r_y=0.3, r_z=0.)
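For reference, the change above moves the camera motion from the x-z plane to the x-y plane. A minimal sketch of what a `gen_swing_path`-style generator could look like (the sinusoidal form, frame count, and default radii here are assumptions for illustration, not the repository's actual implementation):

```python
import numpy as np

def gen_swing_path(num_frames=90, r_x=0.14, r_y=0.0, r_z=0.10):
    """Sketch of a swing-path generator: returns a list of 4x4
    camera-to-world poses whose translation follows a closed loop.

    r_x, r_y, r_z are the motion radii along each axis. With r_y=0
    the camera stays on the x-z plane; with r_z=0 and r_x=r_y>0 it
    swings on the x-y plane instead, as in the change above.
    """
    t = np.linspace(0.0, 2.0 * np.pi, num_frames)
    poses = np.tile(np.eye(4), (num_frames, 1, 1))
    poses[:, 0, 3] = r_x * np.sin(t)  # lateral (x) motion
    poses[:, 1, 3] = r_y * np.cos(t)  # vertical (y) motion
    poses[:, 2, 3] = r_z * np.cos(t)  # depth (z) motion
    return [p for p in poses]
```

With `r_x=0.3, r_y=0.3, r_z=0.0` every pose keeps a zero z-translation, so the camera translates only within the image plane.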

0810

If you know the reason for the stretched pixels, please let me know.

Thanks!

Best wishes, Jin.

yxuhan commented 1 year ago

@JinWonjoon

For the first question, you need to specify --disp_path when testing our method. By default, --disp_path points to the depth map of 0810 (the squirrel example), which is why the 3D photo of the penguin example looks strange.

For the second question, MPI cannot synthesize out-of-FoV content. The stretched pixels on the boundary are the out-of-FoV regions; their color is determined by the padding mode of `grid_sample`, not synthesized by our network.
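To illustrate why out-of-FoV pixels look stretched, here is a toy NumPy sampler with border padding (similar in spirit to `torch.nn.functional.grid_sample` with `padding_mode='border'`; this is an illustrative sketch, not the repository's code). Coordinates that fall outside the image are clamped to the nearest edge pixel, so the edge colors get replicated outward into the out-of-FoV region:

```python
import numpy as np

def sample_border(img, xs, ys):
    """Nearest-neighbor sampling with 'border' padding:
    out-of-range coordinates are clamped to the image edge,
    so edge pixels are replicated (smeared) outward."""
    h, w = img.shape[:2]
    xs = np.clip(np.round(xs).astype(int), 0, w - 1)
    ys = np.clip(np.round(ys).astype(int), 0, h - 1)
    return img[ys, xs]

# A 1x4 "image" whose rightmost pixel has value 9.
img = np.array([[1, 2, 3, 9]])
# A lateral camera shift asks for samples past the right edge.
xs = np.array([2, 3, 4, 5])  # 4 and 5 lie outside the image
ys = np.zeros(4, dtype=int)
print(sample_border(img, xs, ys))  # [3 9 9 9]: the edge value is smeared
```

The repeated `9`s are the "stretched" pixels: the sampler has no content beyond the field of view, so it reuses the boundary value.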

JinWonjoon commented 1 year ago

Oh, it was my mistake! Thanks for your help :)