LYCEXE closed this 1 week ago
Have you found an answer to this question?
The single-scale patch-based sampling strategy, which composes the entire image by moving patches, has drawbacks: when the stride is large, it produces noticeable checkerboard artifacts. In practice, we therefore use smaller strides and multiple scales to avoid them. We also perform multi-scale fusion repeatedly on the intermediate samples x_t of the diffusion model, rather than fusing only once, which improves image quality, as shown in Figure 10. Using patches at three scales is not essential; combining patches of two different sizes usually performs better than using a single scale.
Thanks for your reply. But in the actual inference process, the single-scale patch-based sampling strategy composes the entire image without moving patches. During denoising, the model's noise prediction for each patch usually depends only on information inside that patch. Moving the patches therefore does cause checkerboard artifacts, which usually come from inconsistent denoising levels in different regions. Even without multi-scale fusion, checkerboard artifacts usually do not appear if the patch partition is kept fixed. In fact, the most common artifact appears at the junctions between patches.
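To make the junction artifact concrete, here is a toy sketch of a fixed, non-overlapping partition. It is not taken from any paper's code: `process_fixed_grid` and the mean-subtracting `fn` are hypothetical stand-ins for a per-patch model that only sees intra-patch information, and the image size is assumed to be a multiple of the patch size.

```python
import numpy as np

def process_fixed_grid(image, patch, fn):
    """Apply fn to each tile of a fixed, non-overlapping grid (no shifting, no averaging).

    Assumes both image dimensions are multiples of `patch`.
    """
    h, w = image.shape
    out = np.empty_like(image, dtype=np.float64)
    for top in range(0, h, patch):
        for left in range(0, w, patch):
            tile = (slice(top, top + patch), slice(left, left + patch))
            out[tile] = fn(image[tile])
    return out

# Toy input: a horizontal ramp, so neighbouring columns differ by exactly 1.
image = np.tile(np.arange(8, dtype=np.float64), (8, 1))

# Hypothetical per-patch operation that depends only on intra-patch statistics.
out = process_fixed_grid(image, patch=4, fn=lambda p: p - p.mean())

# Absolute jump between horizontally adjacent pixels in the first row.
jumps = np.abs(np.diff(out[0]))
print(jumps)
```

In this toy case the only jump that differs from the ramp's own slope sits at the boundary between the two tiles, which matches the description of artifacts at patch junctions rather than a checkerboard across the whole image.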
The approach of constructing the entire image by moving patches mainly follows WeatherDiffusion [TPAMI 2023]. To smooth the junctions between non-overlapping patches, WeatherDiffusion shifts the patches and averages them to form the whole image. We therefore attempt to mitigate this issue by reducing the stride and using multiple scales.
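The move-and-average idea can be sketched as follows. This is a minimal illustration, not WeatherDiffusion's or this repository's implementation: `patch_starts`, `overlap_average`, and the identity `fn` are hypothetical names, and snapping the last window to the image border is an assumption. The `count` map records how many patches cover each pixel, which is one possible way to look at how evenly the patches overlap for a given stride.

```python
import numpy as np

def patch_starts(size, patch, stride):
    """Offsets of a sliding window; the last window is snapped to the border."""
    starts = list(range(0, size - patch + 1, stride))
    if starts[-1] != size - patch:
        starts.append(size - patch)
    return starts

def overlap_average(image, patch, stride, fn):
    """Apply fn to every patch and average the overlapping outputs per pixel."""
    h, w = image.shape
    acc = np.zeros((h, w), dtype=np.float64)    # sum of patch outputs
    count = np.zeros((h, w), dtype=np.float64)  # number of patches covering each pixel
    for top in patch_starts(h, patch, stride):
        for left in patch_starts(w, patch, stride):
            window = (slice(top, top + patch), slice(left, left + patch))
            acc[window] += fn(image[window])
            count[window] += 1.0
    return acc / count, count

image = np.random.default_rng(0).normal(size=(16, 16))
for stride in (8, 6, 2):
    # Identity fn stands in for a per-patch model prediction.
    out, count = overlap_average(image, patch=8, stride=stride, fn=lambda p: p)
    print(f"stride={stride}: coverage min={int(count.min())}, max={int(count.max())}")
```

With a stride equal to the patch size, every pixel is covered exactly once, so nothing is averaged. With other strides, the coverage differs from pixel to pixel, so different regions are averaged over different numbers of patch predictions.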
How should “uneven overlapping” in the conventional patch-based sampling strategy be understood? In other tasks, I have not observed the checkerboard artifacts shown in the paper when using the conventional patch-based sampling strategy. In addition, if the conventional strategy does produce checkerboard artifacts, why can the multi-scale method proposed in this paper solve the problem?