NVlabs / nvdiffmodeling

Differentiable rasterization applied to 3D model simplification tasks

Lower training resolution => heavier texture pixelation. But why? #3

Closed shrubb closed 3 years ago

shrubb commented 3 years ago

Hi, and thanks for the great paper and comprehensive code! Could you please help me understand one thing:

I'm running configs/dancer_displacement.json. Despite "texture_res" being 2048, these 2048 x 2048 textures "learn" huge pixels at lower "train_res":

image

And this isn't just poor interpolation in Blender, but these are actual large solid blocks (of different scale!) in normal_map_opt and kd_map_opt! Here is a part of texture_n.png from the top left experiment, 1:1 scale:

image

Why do these maps get more pixelated instead of getting more blurry? I've spent a couple of days searching the code for the source of this behaviour, and I believe there are only three relevant lines (for now let's consider texture only):

Interpolation doesn't use the 'nearest' method, so it shouldn't be the source of the big pixels. And because the views are always different, rasterization shouldn't be a problem either. But I'm probably just missing some rendering subtleties?

Thanks again!

JHnvidia commented 3 years ago

Hi Egor, thanks for showing interest.

Before I give the detailed answer, you're misusing the system a bit. There's an implicit "contract" that you will never use the asset at a higher resolution than the one you specified. E.g. with train_res=256 you will get a model that looks good only when rendered at a size of <= 256x256 pixels. This is important because, within that limit, the blockiness will not show due to mip-mapping.

The more involved answer is that this is related to nvdiffrast and not this code base. I'll give you the best of my understanding, but they could probably provide more details. There are two ways filtered textures can be handled: either custom trainable mipmaps (you can toggle this with the --custom-mip flag) or DX-like automatic mipmap generation (default).

If you're using automatic mipmaps, you run into the case shown in the image below.

image

At 512x512 training resolution you will never access the base miplevel, because the filtered texture lookup will pick the 512x512 miplevel at most. However, the base mip is the only one that has trainable values. Nvdiffrast assumes that the automatic mipchain is reduced using a 2x2 pixel mean operation. Back-propagating through a mean operation gives the same gradient to all inputs. Therefore, the base miplevel looks like it's been upscaled with nearest-neighbor filtering.
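To make the mean-reduction argument concrete, here is a minimal pure-Python sketch. It is not nvdiffrast code: the 8x8 base size, the helper names `downsample_mean` and `backprop_mean`, and the gradient values are all made up for illustration. It pushes a gradient that arrives at a coarse miplevel back through two 2x2 mean reductions and shows that every base texel inside the same 4x4 block receives an identical value.

```python
def downsample_mean(tex):
    """One mip step: average each 2x2 block of a square 2D list."""
    n = len(tex) // 2
    return [[(tex[2 * y][2 * x] + tex[2 * y][2 * x + 1]
              + tex[2 * y + 1][2 * x] + tex[2 * y + 1][2 * x + 1]) / 4.0
             for x in range(n)] for y in range(n)]


def backprop_mean(grad_coarse):
    """Gradient of downsample_mean w.r.t. its input.

    Each of the four texels in a 2x2 block contributes with weight 1/4,
    so all four receive the same value: the coarse gradient divided by 4.
    """
    n = len(grad_coarse) * 2
    return [[grad_coarse[y // 2][x // 2] / 4.0 for x in range(n)]
            for y in range(n)]


# Hypothetical setup: an 8x8 base texture whose finest accessed miplevel
# is level 2 (2x2). These gradient values are invented for the example.
grad_at_level2 = [[1.0, -2.0],
                  [3.0, 0.5]]

# Walk the gradient back down the chain: level 2 -> level 1 -> level 0.
base_grad = grad_at_level2
for _ in range(2):
    base_grad = backprop_mean(base_grad)

for row in base_grad:
    print(" ".join(f"{g:8.5f}" for g in row))
```

The printed 8x8 gradient consists of four constant 4x4 blocks, so any optimizer step changes all texels of a block by the same amount, which is why the base level ends up looking like a nearest-neighbor upscale of the finest level that was actually sampled.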

If you instead use --custom-mips, you will notice that the higher-resolution miplevels just contain whatever data they were initialized with (e.g. random noise). In that case, we never back-propagate to anything with higher resolution than what is used by the filtered mip-lookup.
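The custom-mip case can be sketched the same way. The toy below is an assumption-laden illustration, not the repository's implementation: each miplevel is an independent list of trainable values initialized with noise, only one level (chosen arbitrarily as `accessed_level`) is ever read, and the loss is a made-up squared error against a constant target.

```python
import random

random.seed(0)
base_res = 8

# Independent trainable miplevels (8x8, 4x4, 2x2, 1x1), noise-initialized.
mips = [[[random.random() for _ in range(base_res >> lvl)]
         for _ in range(base_res >> lvl)]
        for lvl in range(4)]
initial = [[row[:] for row in level] for level in mips]

accessed_level = 2  # assumed: the only level the filtered lookup reads
target, lr = 0.5, 0.25

for _ in range(50):
    level = mips[accessed_level]
    for y in range(len(level)):
        for x in range(len(level)):
            # Gradient of (v - target)^2 is 2 * (v - target).
            # Only the accessed level ever receives a gradient.
            level[y][x] -= lr * 2.0 * (level[y][x] - target)

untouched = [lvl for lvl in range(4) if mips[lvl] == initial[lvl]]
print("levels still holding their initial noise:", untouched)
```

In this toy the finer levels 0 and 1 are never updated and keep their noise, matching the description above. A real trilinear lookup may also read coarser levels depending on the pixel footprint, which this sketch does not model.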

TLDR: You can never expect data beyond the highest miplevel seen during training. That data is considered "don't know", and handled using nearest upscaling (that scaling is not really related to the filtered texture lookup).
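As a rough rule of thumb, the block size can be estimated from the two resolutions. The helper below is a hypothetical sketch that assumes one screen pixel covers about `texture_res / train_res` texels along each axis (i.e. the UV chart fills the view). In practice the footprint varies per UV region and per view, so different parts of the texture can end up with different block sizes.

```python
import math


def finest_mip_level(texture_res, train_res):
    """Estimate the finest miplevel a filtered lookup touches.

    Assumes one screen pixel spans texture_res / train_res texels.
    Level 0 is the base (full-resolution) level.
    """
    ratio = texture_res / train_res
    return max(0, math.floor(math.log2(ratio)))


def expected_block_size(texture_res, train_res):
    """Side length, in base texels, of the constant blocks to expect."""
    return 2 ** finest_mip_level(texture_res, train_res)


for train_res in (2048, 1024, 512, 256):
    level = finest_mip_level(2048, train_res)
    block = expected_block_size(2048, train_res)
    print(f"train_res={train_res}: finest level {level}, ~{block}x{block} blocks")
```

Under these assumptions a 2048x2048 texture trained at 512x512 never goes finer than the 512x512 level, giving roughly 4x4 blocks in the base texture, and training at 256x256 roughly doubles that.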

Hope this helps, Jon

shrubb commented 3 years ago

Yeah, that was it! Indeed, I've optimized the same model (configs/dancer_displacement.json) with --custom-mips and have verified that, if for some UV region it has optimized a higher miplevel, the pixelated texture in question will be less blocky in that region -- in fact, as detailed as that miplevel.

Thanks a lot Jon! :relieved: