elvisyjlin / AttGAN-PyTorch

AttGAN PyTorch Arbitrary Facial Attribute Editing: Only Change What You Want

the format of ./data/image_list.txt in CelebA-HQ #4

Closed thinkerthinker closed 5 years ago

thinkerthinker commented 5 years ago

Can you tell me the format of ./data/image_list.txt in CelebA-HQ?

elvisyjlin commented 5 years ago

It looks like this:

idx         orig_idx    orig_file   proc_md5                          final_md5
0           119613      119614.jpg  0be7e162e25c06f50dd5c1090007f2cf  d76ed3e87c8bc20f82757a2dd95026ba
1           99094       099095.jpg  1e2d301e9b3d1b64b2e560243b5c109c  c391ae358c1a00e715982050b6446109
2           200121      200122.jpg  99aa914661dc10dfbaf14579757f6ab8  a131f037e52aa1011a90cf78f7b0cd88
3           81059       081060.jpg  ff2a42ed253c393c3cebe37c456c47fb  9908aba4e7dbee68224726dd97ee21fa
4           202040      202041.jpg  8a249cfe4671d3f53eba90c6f1762005  464440733a22e53073575cd4081f5501
5           614         000615.jpg  b7a5764aa444aa2751055847dd09f0bc  017ae5a8bf65cf7350dd4683795191ae
6           50915       050916.jpg  c2a912357074ac7209cc2227bf9ca35d  30958d171935bc5fd5dc4cbfacbd9ebb
7           166545      166546.jpg  b0b48c4849c077b28fb13f51825a6eec  8d03e6ff3d927db8fad64e600ba4628b
8           143861      143862.jpg  ef679abb4540e36bf86753d754fe6bb2  5e4466bd00d2b8c3240d1a6e6bc0ddc9
...

If you follow the instructions here, you will get image_list.txt along with all CelebA-HQ images.
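For reference, here is a small Python sketch (not part of this repo; it assumes pandas and the column layout shown above) that reads image_list.txt and maps each CelebA-HQ index to the original CelebA file it was generated from:

import pandas as pd

# Read the whitespace-delimited table; the header row provides the columns
# idx, orig_idx, orig_file, proc_md5 and final_md5.
image_list = pd.read_csv('./data/image_list.txt', sep=r'\s+')

# Map each CelebA-HQ index to the original CelebA file name.
hq_to_orig = dict(zip(image_list['idx'], image_list['orig_file']))
print(hq_to_orig[0])  # e.g. '119614.jpg' for CelebA-HQ image 0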

thinkerthinker commented 5 years ago

Thanks! I find the network structure a bit strange. The paper says they use a U-Net, but only one shortcut layer is used, concatenating 1024 channels with 512. A standard U-Net would concatenate 1024 with 1024, 512 with 512, and so on. Is this more effective?

elvisyjlin commented 5 years ago

There is only one shortcut (512) because I trained the model with --shortcut_layers 1, as the author did in his repo, and --shortcut_layers 0 means no shortcut layer is used at all. If you want a full U-Net architecture, please train with --shortcut_layers 5. As long as the number of shortcut layers equals the number of encoder and decoder layers, you get a U-Net.

For example, you can train with the following commands:

CUDA_VISIBLE_DEVICES=0 \
python train.py \
--data CelebA-HQ \
--img_size 256 \
--shortcut_layers 5 \
--inject_layers 1 \
--experiment_name 256_shortcut5_inject1_none_hq \
--gpu

The more shortcut layers you use, the more GPU memory it costs. However, the encoder-decoder reconstruction will be better, since the decoder sees less compressed latent information. It's a trade-off between reconstruction quality and efficiency.
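To make the shortcut mechanism concrete, here is a minimal PyTorch sketch of a toy decoder with a configurable number of U-Net style shortcuts. It is a simplification with made-up channel sizes, not the actual AttGAN code: levels that receive a shortcut concatenate the matching encoder feature map, which widens those layers and costs the extra GPU memory mentioned above.

import torch
import torch.nn as nn

class ShortcutDecoder(nn.Module):
    # Toy decoder: `channels` lists feature channels from the deepest encoder
    # output to the shallowest, e.g. (1024, 512, 256, 128, 64).
    def __init__(self, channels=(1024, 512, 256, 128, 64), n_shortcuts=1):
        super().__init__()
        self.n_shortcuts = n_shortcuts
        self.ups = nn.ModuleList()
        for i in range(len(channels) - 1):
            in_ch = channels[i]
            # A level that receives a shortcut also sees the matching encoder
            # feature map, so its input has twice the channels.
            if 0 < i <= n_shortcuts:
                in_ch += channels[i]
            self.ups.append(
                nn.ConvTranspose2d(in_ch, channels[i + 1], 4, stride=2, padding=1)
            )

    def forward(self, enc_feats):
        # enc_feats: encoder outputs ordered deepest -> shallowest, with
        # spatial sizes doubling at each step (4x4, 8x8, 16x16, ...).
        x = enc_feats[0]
        for i, up in enumerate(self.ups):
            if 0 < i <= self.n_shortcuts:
                x = torch.cat([x, enc_feats[i]], dim=1)  # U-Net shortcut
            x = up(x)
        return x

With n_shortcuts=1 this reproduces the single concatenation you observed; raising n_shortcuts toward the encoder depth turns the decoder into a full U-Net at the cost of wider, more memory-hungry layers.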

thinkerthinker commented 5 years ago

Thank you for your analysis; it gave me a better understanding. @elvisyjlin

elvisyjlin commented 5 years ago

No problem. You're welcome!