ZPdesu / SEAN

SEAN: Image Synthesis with Semantic Region-Adaptive Normalization (CVPR 2020, Oral)
https://zpdesu.github.io/SEAN/

Paired image to image translation #20

Open BEpresent opened 3 years ago

BEpresent commented 3 years ago

Thank you very much for making the code accessible!

I was pleasantly surprised that the training command works exactly as in SPADE.

Instead of semantic image synthesis, I am interested in paired image-to-image translation tasks as in pix2pixHD (domain A -> domain B with paired images of the same name). Over in the SPADE repo the same question was asked multiple times, so far without an answer:

https://github.com/NVlabs/SPADE/issues/46 https://github.com/NVlabs/SPADE/issues/112

Is it possible to have a paired image2image translation with the SEAN model? If not, can you recommend some state of the art model that is more recent compared to pix2pixHD? pix2pixHD is nice, but even the SPADE authors found several improvements over it in their ablation study in the SPADE paper.
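For context, by "paired images of the same name" I mean the pix2pixHD-style convention where each training example is found by matching filenames across two domain folders. A minimal sketch of that pairing step (folder and function names here are illustrative, not from either repo):

```python
from pathlib import Path

def pair_images(dir_a, dir_b):
    """Pair images from two domains by matching filenames (pix2pixHD-style).

    Returns a sorted list of (path_a, path_b) tuples for every filename
    present in both directories; unmatched files are skipped.
    """
    a_files = {p.name: p for p in Path(dir_a).iterdir() if p.is_file()}
    b_files = {p.name: p for p in Path(dir_b).iterdir() if p.is_file()}
    common = sorted(a_files.keys() & b_files.keys())
    return [(a_files[name], b_files[name]) for name in common]
```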

Thank you.

kex243 commented 3 years ago

I've been interested in this since I first heard about SPADE. Now that I've had some time with the code, I can report a few things:

1) It is possible to run the code on RGB-to-RGB image pairs. It runs on Windows in a clean conda env with Python 3.7, PyTorch 1.8, and all requirements from the txt file; it also asked me to copy the sync_batchnorm folder into the networks folder before running, even with a single GPU. If anyone needs more details about the software/hardware setup, I'll provide them.

2) Training options: --batchSize 1, or 2 for two GPUs (it works on Windows with --gpu_ids 0, 1, or both), plus --contain_dontcare_label --label_nc 512. Without those last two parameters it won't run. Why 512? I don't know; I was just trying to scale the model to fit 8 GB of memory and that value worked, while 256, 1024, 0, and 1 did not for me (I didn't try others). I also crop images to 256, since nothing else fits the memory budget. It works with a custom dataset whose folders live outside the work folder, but a COCO-style setup with the files replaced in the dataset folders works too. It also works with or without instance maps, if --no_instance is chosen; as I remember, instance maps were picked up on a small dataset and did influence the output image.

3) On a tiny dataset of only 3 images it showed it could overfit and produce the correct response. Now I'm feeding it my main dataset of 250k images. The first epoch has some color issues, but that reminds me of my first attempts to train the original pix2pix, which had the same color issues in early epochs, so I hope it will generalize. At least the small dataset had no color or shape issues after 200 epochs. The initialization seems to help, which pix2pixHD lacks. There are also some issues with the input image in the results html folder, probably from how the output image is vectorized or how the original input files are marked in the code; I'm still looking into how to fix it, but it has no effect on the result.

The generator weights are 400 MB; I'm not sure whether input size affects that the way it does in pix2pixHD. Memory consumption is about 7 GB on each card. The bottleneck is my HDD; an SSD for the dataset would be better, at least for batch sizes > 1.
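Putting the flags above together, my training invocation looked roughly like this. The dataset paths and experiment name are placeholders, and --label_nc 512 is just the value that happened to fit 8 GB for me, not a recommended setting:

```shell
# Sketch of the RGB-to-RGB training command described above.
# Paths and --name are illustrative; adjust to your own dataset layout.
python train.py --name rgb2rgb_test \
    --dataset_mode custom \
    --label_dir ./datasets/mydata/train_A \
    --image_dir ./datasets/mydata/train_B \
    --label_nc 512 --contain_dontcare_label \
    --no_instance \
    --batchSize 2 --gpu_ids 0,1 \
    --crop_size 256
```

With only one GPU, drop to --batchSize 1 --gpu_ids 0 as noted above.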

The question was about paired translation with SEAN; this setup works for SPADE as well.