about the input/output size

LS4GAN / uvcgan2

UVCGAN v2: An Improved Cycle-Consistent GAN for Unpaired Image-to-Image Translation

https://arxiv.org/abs/2303.16280

about the input/output size #8

Open Danee-wawawa opened 1 year ago

Danee-wawawa commented 1 year ago

Hi, thank you for your work. Do the input and output support other sizes, such as 640 x 512 or 512 x 512? If so, where does the code need to be modified? Looking forward to your answer.

usert5432 commented 1 year ago

Hi @Danee-wawawa,

Does the input size and output size support other sizes, such as 640 x 512 or 512 x 512?

This depends on several factors. In the simplest case -- if your data is images, and you would like to perform the translation between the images of the same size (e.g. 512 x 512 -> 512 x 512), then this case is supported by uvcgan2.

If possible, where the code needs to be modified?

In the case described above, you would need to modify the data configuration of the training script. Taking the male2female script as an example:

https://github.com/LS4GAN/uvcgan2/blob/8f4b1cbfeae74d5b5d1642cebe63c1660af873d6/scripts/celeba_hq/train_m2f_translation.py#L63-L76

You would need to modify the shape parameter to match the desired shape, e.g. 'shape' : (3, 512, 512).

If your case is more complicated, more modifications may be required. Please let me know if you have further questions.

Danee-wawawa commented 1 year ago

Thank you for your reply. Now 512 x 512 -> 512 x 512 works, and I want to try 640 x 512 -> 640 x 512. Does this case require modifying the network structure?

usert5432 commented 1 year ago

I want to try 640 x 512 -> 640 x 512. Does this situation require modifying the network structure?

No, you do not need to modify the network structure for 640 x 512 images (modifying the shape parameter should be enough). In general, as long as your image dimensions are divisible by 16, you can use the default network structure.
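The divisibility rule above can be checked before launching a training run. The following is a minimal sketch, not part of uvcgan2: the helper name `is_supported_shape` is hypothetical, and it assumes the shape is given as (channels, height, width), as in the 'shape' examples in this thread.

```python
def is_supported_shape(shape, multiple=16):
    """Check whether both spatial dimensions of a (C, H, W) shape
    are divisible by `multiple` (16 per the rule stated above)."""
    _, height, width = shape  # channel count does not matter here
    return height % multiple == 0 and width % multiple == 0


# Shapes mentioned in this thread
for shape in [(3, 512, 512), (3, 512, 640), (3, 1080, 1460)]:
    print(shape, is_supported_shape(shape))
```

The first two shapes pass the check, while 1460 x 1080 does not, because neither dimension is a multiple of 16.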

With that said, it may be helpful to tune the network structure a bit to achieve the best performance, but it is not necessary.

Danee-wawawa commented 1 year ago

OK, thank you~~

Pudding-0503 commented 1 year ago

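Hi @usert5432,

One would need to modify the shape parameter to match the desired shape, e.g. 'shape' : (3, 512, 512).

If your case is more complicated, more modifications may be required. Please let me know if you have further questions.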

Sorry to bother you, I also have a question about image size. If my image shape is (3, 512, 512), then the line

  'shape' : (3, 256, 256),

is changed to

  'shape' : (3, 512, 512),

What about the following three lines:

  'transform_train' : [
    { 'name' : 'resize', 'size' : 286, },
    { 'name' : 'random-crop', 'size' : 256, },

Do they need to be changed accordingly to

  'transform_train' : [
    { 'name' : 'resize', 'size' : 512, },
    { 'name' : 'random-crop', 'size' : 256, },

Or should other numbers be used?

usert5432 commented 1 year ago

Hi @Pudding-0503,

The data transformations depend heavily on the dataset that you have. For instance, if you have a large dataset (>= 5k images) and the objects that you want to translate have approximately the same size, then you may not need to apply any transformations at all (or you can limit them to just a random horizontal flip).

In general, I would suggest starting with only a random horizontal flip transformation, e.g.

                'transform_train' : [
                    'random-flip-horizontal',
                ],

Then see how the translation works. If it does not work, adjust the network hyperparameters (as described in the README). If it still does not work, add new transformations.

Pudding-0503 commented 1 year ago

OK! I got it, thank you very much~~~

Danee-wawawa commented 1 year ago

Hi, I want to try 640 x 512 -> 640 x 512. I modified the data configuration of the training script as follows:

[image: data configuration]

But I get the following error:

[image: error message]

How can I solve this problem? Looking forward to your answer.

usert5432 commented 1 year ago

Hi @Danee-wawawa. Can you try switching the order of dimensions? That is, setting shape = (3, 512, 640) instead of (3, 640, 512).
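The dimension swap suggested above can be captured in a small helper. This is only a sketch under two assumptions that the thread does not state explicitly: that "640 x 512" means width x height, and that the shape parameter is ordered (channels, height, width). The function name `shape_from_width_height` is hypothetical and not part of uvcgan2.

```python
def shape_from_width_height(width, height, channels=3):
    """Convert an image size given as width x height into a
    (channels, height, width) tuple (assumed ordering of 'shape')."""
    return (channels, height, width)


# A 640 x 512 (width x height) image maps to the shape suggested above
print(shape_from_width_height(640, 512))
```

Under these assumptions, a 640 x 512 image corresponds to shape = (3, 512, 640), which matches the ordering that resolved the error here.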

Danee-wawawa commented 1 year ago

It is OK, thank you~

sophiatmu commented 11 months ago

Hi, I want to try 1460 x 1080 -> 1460 x 1080, but I got this:

How can I solve this problem?

Thank you for your reply

usert5432 commented 11 months ago

Hi @sophiatmu,

Unfortunately, I do not think uvcgan will work on images of size 1460 x 1080. It expects image dimensions to be divisible by 16, but neither 1460 nor 1080 is divisible by 16.
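One possible workaround, which is not suggested or confirmed anywhere in this thread, would be to crop or pad the images to the nearest multiple of 16 before training. The sketch below only computes those candidate sizes; the helper name `nearest_multiples` is hypothetical, and whether cropping or padding is acceptable depends on the data.

```python
def nearest_multiples(size, multiple=16):
    """Return the largest multiple of `multiple` that is <= size
    and the smallest multiple of `multiple` that is >= size."""
    lower = (size // multiple) * multiple
    upper = lower if lower == size else lower + multiple
    return lower, upper


# Dimensions from the question above
for name, size in (('width', 1460), ('height', 1080)):
    lower, upper = nearest_multiples(size)
    print(f'{name}: {size} -> crop to {lower} or pad to {upper}')
```

For 1460 x 1080 this gives 1456 x 1072 when cropping and 1472 x 1088 when padding, both of which satisfy the divisibility requirement.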