rasterize resolution - Githubissues

chky1997 commented 1 year ago

Hi, thank you for your work! I noticed that the CUDA rasterizer does not support output resolutions greater than 2048×2048. Do you have further plans to raise the resolution limit, maybe to 4096×4096? Or is that any method for me to use the rasterizer to process images with greater resolutions? Thank you!

s-laine commented 1 year ago

There is no simple way to increase the resolution, unfortunately. There is a tradeoff between subpixel accuracy and viewport size, and the internal calculations would either overflow or lose too much precision after 2048×2048.

If you can afford some performance degradation, you can rasterize the image in tiles. For example, you can rasterize a 4096×4096 image in four 2048×2048 tiles by scaling the vertex buffer x and y by 2.0 and then offsetting x and y by ±w depending on which quadrant you want to render.

After you combine the rasterizer outputs into a single large image, you should be able to use the rest of the ops (interpolation, antialias, etc.) as-is with the original vertex buffer.

s-laine commented 1 year ago

Here's an example implementation of using 2×2 tiles that can be used in place of dr.rasterize(). Thus, this function can be used to rasterize up to 4096×4096 resolution outputs, with the constraint that the dimensions must be divisible by 16. If higher resolutions are needed, the idea can be easily extended to use even more tiles.

This is not super efficient, but should be fine for rendering pipelines where the rasterization op is a small part of the overall cost.

def rasterize_tiled_2x2(glctx, pos, tri, resolution, ranges=None):
    tile_resolution = [x // 2 for x in resolution]
    pscl = pos * torch.tensor((2, 2, 1, 1), dtype=pos.dtype, device=pos.device)
    px = torch.nn.functional.pad(pos[..., 3:], (0, 3))
    py = torch.nn.functional.pad(pos[..., 3:], (1, 2))
    out_0, db_0 = dr.rasterize(glctx, pscl + px + py, tri, tile_resolution, ranges)
    out_1, db_1 = dr.rasterize(glctx, pscl - px + py, tri, tile_resolution, ranges)
    out_2, db_2 = dr.rasterize(glctx, pscl + px - py, tri, tile_resolution, ranges)
    out_3, db_3 = dr.rasterize(glctx, pscl - px - py, tri, tile_resolution, ranges)
    out = torch.cat([torch.cat([out_0, out_1], dim=2), torch.cat([out_2, out_3], dim=2)], dim=1)
    db = torch.cat([torch.cat([db_0, db_1], dim=2), torch.cat([db_2, db_3], dim=2)], dim=1)
    return out, db

Closing the issue now. If there is a need for a more efficient implementation, or the tiling implementation is inapplicable for some other reason, please open a new issue.

s-laine commented 1 month ago

For future reference, the CUDA rasterizer resolution limit was removed in v0.3.3.

NVlabs / nvdiffrast

rasterize resolution #128