dribnet / pixray

neural image generation

Fixed padding issue in Kornia #23

Closed: EleaZhong closed this 2 years ago

EleaZhong commented 2 years ago

Implemented some classes that override Kornia's classes; the functions those classes call (from kornia.geometry) all have built-in parameters for padding. Fixed padding for RandomCrop and RandomAffine: they take different arguments for padding_mode😅. Tested the vqgan and pixel drawers, with square and widescreen aspects. Works fine.

dribnet commented 2 years ago

Hi - thanks for this work as I believe that the augmentations are crucially important to getting reliable results!

I've looked at this, and I'm a bit confused as to which issues were being caused by the old version that are now fixed. Could you elaborate a bit on what problems the old padding style was causing?

Some of these changes also seem to break the spirit of the wide augmentations. The purpose of these is to force evaluation of the entire image on a flat background. So for example, if we take a wide test input image (init_image):

[image: test_pattern_wide2]

And run this through the current version, we get a reasonable result, with the wide image letterboxed to a square:

[image: live01]

However, with these changes the wide augmentations are now processed with global_padding_mode (reflection), which is not what we want in this case:

[image: live_im_00_32_None]

For augs_wide we always want the padding mode to be zeros. Apologies that this was not clear before or documented anywhere.
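To make the difference concrete, here is a minimal sketch in plain PyTorch (not the Kornia augmentation classes used in pixray). It pads a wide tensor to a square once with zeros and once with reflection. The helper name `letterbox_to_square` and the tiny 4x8 input are assumptions made up for illustration.

```python
import torch
import torch.nn.functional as F


def letterbox_to_square(img: torch.Tensor, mode: str = "constant") -> torch.Tensor:
    """Pad a wide (N, C, H, W) image at the top and bottom until it is square.

    Hypothetical helper for illustration only; not part of pixray or Kornia.
    """
    _, _, h, w = img.shape
    if w < h:
        raise ValueError("expected a wide image (W >= H)")
    top = (w - h) // 2
    bottom = (w - h) - top
    pad = (0, 0, top, bottom)  # (left, right, top, bottom)
    if mode == "constant":
        # Zeros padding: flat black bars above and below the image.
        return F.pad(img, pad, mode="constant", value=0.0)
    # e.g. mode="reflect": the bars are filled with mirrored image content.
    return F.pad(img, pad, mode=mode)


wide = torch.rand(1, 3, 4, 8)                   # stand-in for a wide init image
zeros = letterbox_to_square(wide, "constant")   # what augs_wide wants
reflect = letterbox_to_square(wide, "reflect")  # what the reflection mode does
```

With zeros the bars carry no image content, so the whole image is evaluated against a flat background. With reflection the bars contain mirrored copies of the image rows, which is the effect shown in the last screenshot above.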

One related ability I'd love to have is to be able to replace the padding=zeros mode (which letterboxes to black) with a padding=(R,G,B) mode, so I could dynamically change the letterbox color. I feel forcing the letterbox to be black is causing some weird biases at the edges. I have brought this up in kornia/kornia#901 (this comment). My intuition is that the results would be better with (at least) the ability to have a randomly selected grayscale background like this:

[image: live01]

But it's not clear if this is achievable in a custom subclass or needs upstream tooling in kornia itself.
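As a rough sketch of the padding=(R,G,B) idea, independent of Kornia: the letterbox can be built by filling a square canvas with the chosen color and pasting the image into the center. This is plain PyTorch, and both function names (`letterbox_with_color`, `random_gray`) are hypothetical. It does not answer whether the same thing can be done inside a Kornia subclass.

```python
import torch


def letterbox_with_color(img: torch.Tensor, color) -> torch.Tensor:
    """Center a (N, C, H, W) image on a square canvas filled with `color`.

    `color` needs one value per channel, in the same range as `img`.
    Hypothetical helper for illustration; not a Kornia or pixray API.
    """
    n, c, h, w = img.shape
    side = max(h, w)
    fill = torch.as_tensor(color, dtype=img.dtype, device=img.device).view(1, c, 1, 1)
    # expand() only creates a view, so clone() before writing into the canvas.
    canvas = fill.expand(n, c, side, side).clone()
    top = (side - h) // 2
    left = (side - w) // 2
    canvas[:, :, top:top + h, left:left + w] = img
    return canvas


def random_gray():
    """Pick a random grayscale (R, G, B) triple with values in [0, 1)."""
    g = torch.rand(1).item()
    return (g, g, g)


wide = torch.rand(2, 3, 4, 8)
gray_box = letterbox_with_color(wide, (0.5, 0.5, 0.5))  # fixed mid-gray bars
rand_box = letterbox_with_color(wide, random_gray())    # new gray on each call
```

Calling `random_gray()` once per augmentation pass would give each cutout a different letterbox shade, which is the behavior described above.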