Danee-wawawa opened 1 year ago
Hi @Danee-wawawa,
Does the input size and output size support other sizes, such as 640 x 512 or 512 x 512?
This depends on several factors. In the simplest case -- if your data is images and you would like to perform a translation between images of the same size (e.g. 512 x 512 -> 512 x 512) -- then this case is supported by uvcgan2.
If possible, where the code needs to be modified?
In a case such as the one I described above, one would need to modify the data configuration of the training script. Taking the male2female script as an example, one would need to change the shape parameter to match the desired shape, e.g. 'shape' : (3, 512, 512).
If your case is more complicated, more modifications may be required. Please let me know if you have further questions.
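For reference, here is a minimal sketch of what such a modified data configuration fragment could look like. Only the keys mentioned in this thread ('shape', 'transform_train') are shown; the full config in the male2female script contains additional entries, so treat this as illustrative rather than complete:

```python
# Illustrative fragment of a uvcgan2 data configuration, modified for
# 512 x 512 images.  Only keys mentioned in this thread are shown.
data_config = {
    'shape'           : (3, 512, 512),   # (channels, height, width)
    'transform_train' : [
        'random-flip-horizontal',
    ],
}

# Sanity check: the default network expects spatial dimensions
# divisible by 16 (512 / 16 = 32, so this passes).
channels, height, width = data_config['shape']
assert height % 16 == 0 and width % 16 == 0
```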
Thank you for your reply. Now 512 x 512 -> 512 x 512 is OK, and I want to try 640 x 512 -> 640 x 512. Does this situation require modifying the network structure?
I want to try 640 x 512 -> 640 x 512. Does this situation require modifying the network structure?
No, you do not need to modify the network structure for 640 x 512 images (modifying the shape parameter should be enough). In general, as long as your image dimensions are divisible by 16, you can use the default network structure.
With that said, it may be helpful to tune the network structure a bit to achieve the best performance, but it is not necessary.
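The divisibility constraint can be checked with a few lines of Python (a standalone sketch, not part of the uvcgan2 codebase):

```python
def supports_default_network(height, width):
    """Return True if both spatial dimensions are divisible by 16,
    the constraint for uvcgan2's default network structure."""
    return height % 16 == 0 and width % 16 == 0

print(supports_default_network(512, 512))    # True:  512 / 16 = 32
print(supports_default_network(512, 640))    # True:  640 / 16 = 40
print(supports_default_network(1080, 1460))  # False: neither is divisible by 16
```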
OK, thank you~~
Hi @usert5432,
One would need to modify the shape parameter to match the desired shape, e.g. 'shape' : (3, 512, 512). If your case is more complicated, more modifications may be required. If you have further questions, please let me know.
Sorry to bother you, I also have a question about image size. If my image shape is (3, 512, 512), then after changing
'shape' : (3, 256, 256),
to
'shape' : (3, 512, 512),
do the following three lines:
'transform_train' : [
{ 'name' : 'resize', 'size' : 286, },
{ 'name' : 'random-crop', 'size' : 256, },
need to be changed accordingly to
'transform_train' : [
{ 'name' : 'resize', 'size' : 512, },
{ 'name' : 'random-crop', 'size' : 256, },
or to some other numbers?
Hi @Pudding-0503,
The data transformations are heavily dependent on the dataset that you have. For instance, if you have a large dataset (>= 5k images) and the objects that you want to translate have approximately the same size, then, perhaps, you do not need to apply any transformations at all (or can limit them to just a random horizontal flip).
And, in general, I would suggest starting with only the random-flip-horizontal transformation, e.g.
'transform_train' : [
'random-flip-horizontal',
],
And see how the translation works. If it does not work, then adjust the network hyperparameters (as described in the README). And, if it still does not work, then add new transformations.
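If one does want to keep the resize + random-crop pair at 512 x 512, one plausible choice, by analogy with the default 286/256 pair, is to keep the same resize-to-crop ratio. This scaling is an assumption, not an official uvcgan2 recommendation (the reply above suggests starting without these transformations entirely):

```python
# Scale the default resize/crop pair (286/256) to a 512-pixel crop,
# keeping the same resize-to-crop ratio.  This is an assumption by
# analogy, not an official uvcgan2 recommendation.
crop_size   = 512
resize_size = round(crop_size * 286 / 256)   # 512 * 1.1171875 = 572

transform_train = [
    { 'name' : 'resize',      'size' : resize_size, },
    { 'name' : 'random-crop', 'size' : crop_size,   },
]
```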
OK! I got it, thank you very much~~~
Hi, I want to try 640 x 512 -> 640 x 512, and I modified the data configuration of the training script as follows: But I get the following error: How to solve this problem? Looking forward to your answer.
Hi @Danee-wawawa. Can you try switching the order of dimensions? That is, setting shape = (3, 512, 640) instead of (3, 640, 512).
It is OK, thank you~
Hi, I want to try 1460 x 1080 -> 1460 x 1080, but I got this:
How to solve this problem?
Thank you for your reply
Hi @sophiatmu,
Unfortunately, I do not think uvcgan will work on images of size 1460 x 1080. It expects image dimensions to be divisible by 16, but neither 1460 nor 1080 is.
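One possible workaround, not suggested in the thread itself, is to resize, crop, or pad such images to the nearest dimensions that are divisible by 16. The candidate sizes can be computed as follows:

```python
def nearest_multiples_of_16(size):
    """Return the closest multiples of 16 at or below and at or above `size`."""
    lower = (size // 16) * 16
    upper = lower + 16 if size % 16 else lower
    return lower, upper

print(nearest_multiples_of_16(1460))   # (1456, 1472)
print(nearest_multiples_of_16(1080))   # (1072, 1088)
```

So a 1460 x 1080 image could, for example, be cropped to 1456 x 1072 or padded to 1472 x 1088 before training.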