Question about image size for training

shunsukesaito / PIFu

This repository contains the code for the paper "PIFu: Pixel-Aligned Implicit Function for High-Resolution Clothed Human Digitization"

https://shunsukesaito.github.io/PIFu/

Question about image size for training #22

Closed ynyBonfennil closed 4 years ago

ynyBonfennil commented 4 years ago

It seems we can change the size of the images rendered by apps.render_data with the -s argument, like this: python -m apps.render_data -s 1024 -i ./rp_dennis_posed_004_OBJ/ -o ./data_1024px/

We can also change the size at which apps.train_shape loads images, like this (I reduced the batch size because my GPU memory is limited): python -m apps.train_shape --dataroot ./data_1024px/ --loadSize 1024 --random_flip --random_scale --random_trans --batch_size 1
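As a minimal sketch of keeping the two sizes in sync, both commands can be built from a single value. It uses only the flags quoted above; the helper name `build_commands` is hypothetical and not part of the PIFu repository.

```python
import shlex


def build_commands(size, obj_dir, data_dir, batch_size=1):
    """Return (render_cmd, train_cmd) that share one image size.

    Hypothetical helper: it only assembles the command lines quoted
    in this thread and does not run them.
    """
    render_cmd = [
        "python", "-m", "apps.render_data",
        "-s", str(size),          # rendering size
        "-i", obj_dir,
        "-o", data_dir,
    ]
    train_cmd = [
        "python", "-m", "apps.train_shape",
        "--dataroot", data_dir,
        "--loadSize", str(size),  # loading size, same value as -s
        "--random_flip", "--random_scale", "--random_trans",
        "--batch_size", str(batch_size),
    ]
    return render_cmd, train_cmd


render_cmd, train_cmd = build_commands(
    1024, "./rp_dennis_posed_004_OBJ/", "./data_1024px/"
)
print(shlex.join(render_cmd))
print(shlex.join(train_cmd))
```

Each list could be passed to `subprocess.run` from the repository root; that step is left out here so the sketch has no side effects.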

I expected this to change only the input size, not the output size, but the output looks a bit bigger and the head and feet are out of range. The output after the 1st epoch looks like this: [image: snapshot00]

Do we have to change more options to use 1024px input, or is this a limitation of the network design (HourGlass, MLP, etc.)?

shunsukesaito commented 4 years ago

Thank you for pointing this out. Indeed, there was a bug in the rendering code: some camera parameters were not set properly. Please pull the latest commit, render the data again, and see if it works.

ynyBonfennil commented 4 years ago

Thank you for your advice. I tried the latest commit, and the reconstructed meshes look good now (the following output is from the 100th epoch). [image: snapshot01]

It has some outliers and the texture is misaligned, which suggests that the current network design is optimized for 512px input and doesn't always work properly at other sizes. Am I correct?

shunsukesaito commented 4 years ago

It seems that some bugs still remain. In theory, the framework should support arbitrary image resolutions, so it should work, GPU memory constraints aside. For more efficient high-resolution geometry learning, however, I found that a multi-level approach is highly effective (see the PIFuHD paper if you are interested). I'm currently quite occupied with other things, but I can circle back to you on this bug in 2-3 weeks. The bug shouldn't be too complicated.

shunsukesaito commented 4 years ago

I took a quick look at the code and intermediate results. The training side is working fine, and even the intermediate results during training look okay. How did you generate the result above? If you train the model at 1024 and test at 512, I can imagine that the results would look like this. Please make sure you train and test at the same image resolution.
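One way to guard against a train/test mismatch is to record the training resolution next to the checkpoint and compare it before testing. The sketch below is an assumption for illustration only: the helpers `save_train_size` and `check_test_size` and the JSON file are hypothetical and not part of the PIFu code.

```python
import json
import tempfile
from pathlib import Path


def save_train_size(path, load_size):
    """Record the image size used for training (hypothetical helper)."""
    Path(path).write_text(json.dumps({"loadSize": load_size}))


def check_test_size(path, load_size):
    """Raise ValueError if the test size differs from the recorded one."""
    trained = json.loads(Path(path).read_text())["loadSize"]
    if trained != load_size:
        raise ValueError(
            f"model was trained at {trained}px but is being tested at "
            f"{load_size}px; use the same resolution for both"
        )


# Small demonstration in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    record = Path(tmp) / "train_size.json"
    save_train_size(record, 1024)
    check_test_size(record, 1024)      # same size: passes silently
    try:
        check_test_size(record, 512)   # mismatch: raises
    except ValueError as err:
        print(err)
```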

ynyBonfennil commented 4 years ago

Sorry, that was my mistake! I pulled the latest commit and tried again, and now it works well! [image: snapshot02]

It seems 95c6a10 did solve the problem. Thank you for your quick fix.