RobertLuo1 closed this issue 2 months ago
Hello RobertLuo1, I haven't done any data augmentation search, but "transforms.Resize" sets the shorter side of the image to the target size, and "transforms.RandomCrop" then randomly crops to the target size (along the longer side). If I am not mistaken, this is the correct way to preserve the aspect ratio while doing the "RandomResizeAndCrop". I also use random flipping, as it is a common data augmentation technique. I commented out the normalization because the VQGAN takes inputs in the range [-1, 1], and I do that rescaling manually here
Best,
Victor
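To make the geometry above concrete, here is a minimal pure-Python sketch of the three steps: resize the shorter side, take a random square crop, and rescale pixel values to [-1, 1]. It is not the repository's code. The function names (resize_shorter_side, random_crop_box, to_signed_range) are hypothetical, and the truncating rounding is an assumption that may differ from what torchvision does internally.

```python
import random


def resize_shorter_side(width, height, target):
    """Scale (width, height) so the shorter side equals target.

    Both sides are scaled by the same factor, so the aspect ratio is kept
    (up to integer truncation, which is an assumption of this sketch).
    """
    if width <= height:
        return target, int(target * height / width)
    return int(target * width / height), target


def random_crop_box(width, height, size, rng):
    """Pick a random size x size window inside a width x height image."""
    left = rng.randint(0, width - size)
    top = rng.randint(0, height - size)
    return left, top, left + size, top + size


def to_signed_range(pixel):
    """Map an 8-bit pixel value in [0, 255] to the range [-1, 1]."""
    return pixel / 255.0 * 2.0 - 1.0


rng = random.Random(0)
new_w, new_h = resize_shorter_side(640, 480, 256)  # landscape image
box = random_crop_box(new_w, new_h, 256, rng)      # square crop along the longer side
flip = rng.random() < 0.5                          # random horizontal flip decision
```

Since the shorter side already equals the crop size after the resize, the crop can only move along the longer side, and nothing is stretched.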
Thanks a lot! I see that PyTorch (torchvision) has a similar preprocessing transform called RandomResizedCrop. I wonder whether it is used in MaskGIT? But after your detailed explanation, I believe what you describe is the RandomResizeAndCrop. Thank you, Victor.
Hi! Actually, I was not aware that RandomResizedCrop is available directly from PyTorch... It might work, but be careful with the aspect ratio, otherwise the model will also generate "distorted" images.
About the official MaskGIT, since they did not release the training code, I cannot help you, sorry.
Best,
Victor
Yeah, but internally RandomResizedCrop crops first and then resizes. I think that is not quite the same as the current operation. I think the more feasible way is still to use Resize and RandomCrop, which is what the VQGAN repo does.
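The difference can be sketched in a few lines of pure Python. This is a simplified reading of the crop-then-resize idea, not torchvision's actual implementation: the default ranges scale=(0.08, 1.0) and ratio=(3/4, 4/3) follow the torchvision documentation, while the function names and the fallback branch are assumptions made for illustration.

```python
import math
import random


def sample_crop_size(width, height, rng, scale=(0.08, 1.0),
                     ratio=(3 / 4, 4 / 3), attempts=10):
    """Sample a crop size with a random area fraction and aspect ratio."""
    area = width * height
    for _ in range(attempts):
        target_area = area * rng.uniform(*scale)
        # The aspect ratio is drawn log-uniformly between the two bounds.
        aspect = math.exp(rng.uniform(math.log(ratio[0]), math.log(ratio[1])))
        w = int(round(math.sqrt(target_area * aspect)))
        h = int(round(math.sqrt(target_area / aspect)))
        if 0 < w <= width and 0 < h <= height:
            return w, h
    # Simplified fallback (assumption): largest square that fits.
    side = min(width, height)
    return side, side


def stretch_factor(crop_w, crop_h):
    """Horizontal scale divided by vertical scale after resizing the crop
    to a square: (S / crop_w) / (S / crop_h) = crop_h / crop_w."""
    return crop_h / crop_w


rng = random.Random(0)
crop_w, crop_h = sample_crop_size(640, 480, rng)
distortion = stretch_factor(crop_w, crop_h)  # 1.0 means no distortion
```

A crop that is not square gets squeezed when it is resized to a square output, so the stretch factor moves away from 1.0. With Resize followed by RandomCrop, the crop is already square, and the factor stays at 1.0.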
Hi, thanks for your wonderful work! I notice that MaskGIT uses RandomResizedandCrop for data augmentation. But I find that in the code you adopt cropping and flipping, and comment out the normalization (whereas I think VQGAN is trained with normalization). https://github.com/valeoai/Maskgit-pytorch/blob/b0b2b3cc11cffd0b159f22dc1c6e73a7e8b53db3/Trainer/trainer.py#L80 I am curious about the reason behind this.
Thanks in advance!