[ICLR 2024] Continuous-Multiple Image Outpainting in One-Step via Positional Query and A Diffusion-based Approach Link: https://arxiv.org/abs/2401.15652
Hello. Judging from the UViT network in the code, it seems that the image size for both training and output can only be (192, 192). Is that correct? In the article, however, I recall that (w, h) can be specified as the predefined resolution to be generated. Can this be achieved without adjusting the network structure? Thank you very much for your reply.