vkhoi opened 2 years ago
Thank you for raising this point. The experiment in Appendix-Section B was done by fine-tuning the official Big-LaMa and CoModGAN checkpoints using object-aware masks with 8 GPUs for ~3-4 days. Such a setting is fairer, as training from well-trained checkpoints helps LaMa-OT and CoModGAN-OT achieve their optimal performance.
Thank you for your answer. Can I ask some more questions about your experience finetuning Big-Lama with object-aware masks?
- During the finetuning process, did you ever encounter inpainting results where Big-Lama would just inpaint a single color into the masked region (see attached)? This does not always happen, only sometimes, which I find weird.
- Can you also share (just briefly is fine) your finetuning recipe for Big-Lama? For example, Big-Lama is trained on 256x256 crops of unresized images (as opposed to CM-GAN, which is trained on 512x512 resized images), so did you keep the same 256x256 crop setting, or did you switch to training on 512x512 resized images too? If you kept the 256x256 crop setting, the masks generated by comodgan would be really large and would almost always occupy the whole 256x256 input, so did you also modify comodgan's mask generation hyperparams for Big-Lama?
Thank you in advance!
Thanks again. Just want to clarify one more thing, since it was not explicitly stated in your answer: when generating object-aware masks to finetune LaMa, did you use the masks from CoModGAN (i.e., this code), or did you keep the masks used by LaMa (i.e., this code)?
I used the object-aware mask generation procedure from Appendix E.2 to generate the masks. The code is in mask_generator/mask_generator.py. Basically, it mixes comodgan masks and random object masks during training.
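For readers who can't dig into mask_generator/mask_generator.py right away, the mixing described above can be sketched roughly as follows. This is a minimal illustration, not the authors' code: `random_object_mask` and `comodgan_style_mask` are hypothetical stand-ins (a random rectangle and random brush-like discs) for the real object segmentations and CoModGAN free-form masks, and `p_object` is an assumed mixing probability.

```python
import numpy as np

def random_object_mask(h, w, rng):
    # Hypothetical stand-in: a random rectangle playing the role of an
    # object-shaped mask (the actual procedure samples object segmentations).
    y0, x0 = rng.integers(0, h // 2), rng.integers(0, w // 2)
    y1 = y0 + rng.integers(h // 8, h // 2)
    x1 = x0 + rng.integers(w // 8, w // 2)
    mask = np.zeros((h, w), dtype=np.float32)
    mask[y0:min(y1, h), x0:min(x1, w)] = 1.0
    return mask

def comodgan_style_mask(h, w, rng):
    # Hypothetical stand-in: large free-form blobs, loosely imitating the
    # aggressive CoModGAN mask distribution.
    mask = np.zeros((h, w), dtype=np.float32)
    yy, xx = np.ogrid[:h, :w]
    for _ in range(rng.integers(1, 5)):
        cy, cx = rng.integers(0, h), rng.integers(0, w)
        r = rng.integers(h // 8, h // 3)
        mask[(yy - cy) ** 2 + (xx - cx) ** 2 <= r * r] = 1.0
    return mask

def sample_training_mask(h, w, rng, p_object=0.5):
    # Mix the two mask sources during training, as described above.
    # p_object is an assumed hyperparameter, not taken from the paper.
    if rng.random() < p_object:
        return random_object_mask(h, w, rng)
    return comodgan_style_mask(h, w, rng)
```

The point is only the structure: two mask samplers, one biased toward object shapes, mixed stochastically per training example.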
@htzheng Hi, I fine-tuned co-mod-gan with object-aware masks, starting from the pretrained weights of G and D. The learning rate is set to 0.0001. I tried using either the LSGAN loss or the softplus GAN loss for D; some results always collapse to a single flat color, and during finetuning the D loss becomes small, which means D can easily discriminate real images from fake images. Can you share (just briefly is fine) your finetuning recipe for co-mod-gan? Thanks. Some results as follows:
@mingqizhang Hi, some configs that I used for fine-tuning comodgan (in the stylegan2-ada-pytorch fashion) are: --mirror=1 --gpus=8 --batch 32 --workers 4 --fp32 true --aug=noaug
for training. The detailed hyperparameters: {'comodgan512': dict(ref_gpus=8, kimg=50000, mb=32, mbstd=4, fmaps=1, lrate=0.001, gamma=10, ema=10, ramp=None, map=8),}
I guess using a larger learning rate, 8-GPU training, and usually 2-3 days of training would achieve similar results.
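To make the quoted preset concrete, here is a small sketch of how those numbers typically unpack in a stylegan2-ada-pytorch-style setup. The variable names follow that repo's conventions (batch split per GPU, R1 gamma, EMA half-life in kimg), but this plumbing is my reading of the config, not the authors' exact training script.

```python
# The 'comodgan512' preset quoted above.
spec = dict(ref_gpus=8, kimg=50000, mb=32, mbstd=4, fmaps=1,
            lrate=0.001, gamma=10, ema=10, ramp=None, map=8)

gpus = 8                                   # matches --gpus=8
assert spec['mb'] % gpus == 0              # total minibatch must split evenly
batch_gpu = spec['mb'] // gpus             # images per GPU per step
mbstd_group = spec['mbstd']                # minibatch-stddev group size in D

# In stylegan2-ada-pytorch, fmaps scales the channel base of G and D;
# fmaps=1 corresponds to the full-width 32768 channel base.
channel_base = int(spec['fmaps'] * 32768)

# Both optimizers use the same learning rate here (0.001), an order of
# magnitude above the 0.0001 that led to the collapse reported above.
g_opt_kwargs = dict(lr=spec['lrate'], betas=(0, 0.99), eps=1e-8)
d_opt_kwargs = dict(lr=spec['lrate'], betas=(0, 0.99), eps=1e-8)

r1_gamma = spec['gamma']                   # weight of the R1 penalty on D
ema_kimg = spec['ema']                     # G weight-averaging horizon (kimg)
mapping_layers = spec['map']               # depth of the mapping network
```

Note the learning-rate gap: the preset's 0.001 versus the 0.0001 tried above, which is likely why the reply suggests a larger learning rate.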
@vkhoi Hello, I am also training LaMa-OT, but so far I haven't achieved very good results. I'm wondering how your training turned out. If you could share some experiences or results, I would greatly appreciate it. Thank you.
Dear authors, thanks for the great work. Regarding the Lama-OT model in Appendix-Section B, did you train it from scratch (so that it becomes aware of object masks right from the start), or is it possible to achieve the results in Fig. 8 and Fig. 9 (Appendix) by finetuning from Big-Lama? Thank you!