Thanks for your great work! In the paper, you use patch-based fine-tuning so that the model supports high-resolution editing, but the original pre-trained model is not fine-tuned. My concern is this: when performing model guidance, will blending the two predicted noises create artifacts due to the mismatch between the two models?