Caoang327 / fwd_code

CVPR 2022: https://caoang327.github.io/FWD/

training on resized image #5

Closed: monkeydchopper closed this issue 9 months ago

monkeydchopper commented 9 months ago

Thanks for your work. I have a question about training on resized images. I want to train on DTU images of size (192,256), and I noticed that the DTU_Dataset code automatically handles the resizing for both image and intrinsics. However, when I resize the image, the PSNR drops to around 13 after 10000 steps of training. I'm unsure why this is happening, as I don't see any theoretical flaws that could result in this.

This is the result after 7000 steps with resized images: [image 1]

This is the result after 7000 steps with original-size images: [image 2]

Do you have any clue why this is happening? Thank you!

Caoang327 commented 9 months ago

Are you using the same DTU data and resizing everything based on that, or did you download a resized version of DTU from somewhere else?

monkeydchopper commented 9 months ago

Thanks for your response! I'm using the 4x downsampled DTU you provided. I have also fixed the pytorch3d version problem, though I don't know whether that is related to this issue; I verified the fix through test cases. For now, training at the original size (400, 300) reproduces results similar to those in the paper, but if I use a different image size, for example (256, 192) or (512, 384), the PSNR stops increasing at around 12.
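For reference, here is a minimal sketch of what a PSNR plateau near 12 means in terms of pixel error. It assumes images normalized to [0, 1] and the standard definition PSNR = 10 * log10(MAX^2 / MSE); the function names are hypothetical and are not taken from the fwd_code repository.

```python
import math


def psnr_from_mse(mse, max_val=1.0):
    """Standard PSNR in dB for a given mean squared error.

    Assumes pixel values lie in [0, max_val] (an assumption here;
    check the range the training code actually uses).
    """
    return 10.0 * math.log10(max_val ** 2 / mse)


def mse_from_psnr(psnr_db, max_val=1.0):
    """Invert the PSNR formula to recover the mean squared error."""
    return max_val ** 2 / (10.0 ** (psnr_db / 10.0))


# A PSNR of 12 dB on [0, 1] images corresponds to an RMS error of
# roughly a quarter of the full intensity range.
mse_at_12 = mse_from_psnr(12.0)
rmse_at_12 = math.sqrt(mse_at_12)
print(f"MSE = {mse_at_12:.4f}, RMSE = {rmse_at_12:.3f}")
```

An error that large suggests the rendered views are grossly misaligned or blurred rather than merely lacking detail, which would be consistent with a geometry or calibration mismatch.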

monkeydchopper commented 9 months ago

In another experiment, I tried to train the model on (200, 150) images, and I could not get it to train at all. I see that the code applies a downsample before feeding the image into the encoder, so the actual input dimension for the encoder is (200, 150). So I resized the images to (200, 150) in the dataloader and fed them into the network with the downsample turned off, and I also disabled the upsampling in the decoder, so I get a (200, 150) image as output. But the PSNR stops increasing after reaching 9. That's so weird.

I'm thinking of building a downsampled DTU myself. Did you follow the same resizing logic as in the code when downsampling the dataset?

monkeydchopper commented 9 months ago

Fixed it by building a downsampled DTU myself; the problem might have been due to the low-resolution images.

Caoang327 commented 8 months ago

Thanks for the updates. I'm not sure why the downsample in the encoder does not work. In any case, please ensure the intrinsic parameters match the downsampled image resolution.
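The point about matching intrinsics to the image resolution can be sketched as follows. This is an illustrative example only, not the repository's DTU_Dataset code: the function name and the sample focal length are hypothetical, and it assumes a pinhole intrinsic matrix with the simple convention that pixel coordinates scale linearly with image size (no half-pixel offset correction).

```python
def scale_intrinsics(K, old_size, new_size):
    """Rescale a 3x3 pinhole intrinsic matrix for a resized image.

    K is a nested list [[fx, s, cx], [0, fy, cy], [0, 0, 1]].
    old_size and new_size are (width, height) in pixels.
    """
    old_w, old_h = old_size
    new_w, new_h = new_size
    sx = new_w / old_w  # horizontal scale factor
    sy = new_h / old_h  # vertical scale factor
    # Row 0 (fx, skew, cx) scales with width; row 1 (fy, cy) with height.
    return [
        [K[0][0] * sx, K[0][1] * sx, K[0][2] * sx],
        [K[1][0] * sy, K[1][1] * sy, K[1][2] * sy],
        [0.0, 0.0, 1.0],
    ]


# Hypothetical intrinsics for a (400, 300) image; the values are made up.
K_orig = [
    [360.0, 0.0, 200.0],
    [0.0, 360.0, 150.0],
    [0.0, 0.0, 1.0],
]
K_small = scale_intrinsics(K_orig, (400, 300), (256, 192))
for row in K_small:
    print(row)
```

If the images are resized but the intrinsics are left at the original resolution (or are scaled twice, once offline and once in the dataloader), projected points land in the wrong pixels, so it may be worth checking that exactly one such rescaling happens between the calibration files and the network input.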