BachiLi / redner

Differentiable rendering without approximation.
https://people.csail.mit.edu/tzumao/diffrt/
MIT License

UV / Depth at edges #101

Closed: tetterl closed this issue 4 years ago

tetterl commented 4 years ago

I'm currently playing around with "rendering" the uv channel in addition to the other channels. I observe strange behavior, as can be seen here: https://colab.research.google.com/drive/1znzLKboP8xAf2vzt2vKkIiAOtIAPyx6O (or below).

As far as I understand, all ray samples that don't hit an object are assigned depth 0 and uv coordinates (0, 0). In the subsequent averaging of the samples, these zero values can give quite unintuitive results. E.g. a pixel centered on an edge gets depth 0.5 even if the depth of the edge is 1.0. Is this a bug or desired behavior? My intuition would be that for some of the pyredner.channels, only samples that intersect a shape should be used for averaging.

Furthermore, it seems that the rendering (e.g. diffuse_reflectance) is correct in this respect, but it doesn't correspond to the uv_coords returned by the renderer (i.e. if we used the uv coordinates to sample from the texture/mipmap, we would get a different result).

import os
import tensorflow as tf
import pyredner_tensorflow as pyredner
import redner

os.environ['TF_FORCE_GPU_ALLOW_GROWTH'] = 'true'
tf.compat.v1.enable_eager_execution()  # redner only supports eager mode

# Use GPU if available
pyredner.set_use_gpu(tf.test.is_gpu_available(cuda_only=True, min_cuda_compute_capability=None))
# Square
#  (-0.5, 0.5, 0) ----- (0.5, 0.5, 0)
#           |               |
#           |               |
#  (-0.5,-0.5, 0) ----- (0.5,-0.5, 0)
vertices = tf.convert_to_tensor([
    [-0.5, -0.5, 0.0],
    [0.5, -0.5, 0.0],
    [-0.5, 0.5, 0.0],
    [0.5, 0.5, 0.0]
])
indices = tf.convert_to_tensor([
    [0, 1, 2],
    [2, 1, 3]
], dtype=tf.int32)
uvs = tf.convert_to_tensor([
    [0.0, 1.0],
    [1.0, 1.0],
    [0.0, 0.0],
    [1.0, 0.0]
])

# Materials
mat = pyredner.Material(
    diffuse_reflectance=tf.convert_to_tensor([0.5, 0.5, 0.5]))
materials = [mat]

# Shapes
shape = pyredner.Shape(vertices, indices, 0, uvs)
shapes = [shape]

pixels = 3
# Setup camera: We use an orthographic camera just to
#               make the projection more "2D": the depth is only used
#               for determining the order of the meshes.
cam = pyredner.Camera(position=tf.convert_to_tensor([0.0, 0.0, 1.0]),
                      look_at=tf.convert_to_tensor([0.0, 0.0, 0.0]),
                      up=tf.convert_to_tensor([0.0, 1.0, 0.0]),
                      clip_near=1e-2,  # needs to be > 0
                      resolution=(pixels, pixels),
                      camera_type=redner.CameraType.orthographic)

# square: [-0.5, 0.5] x [-0.5, 0.5]
# camera sees: [-0.75, 0.75] x [-0.75, 0.75]
# expected coverage / alpha:
# 1/4, 1/2, 1/4
# 1/2, 1/1, 1/2
# 1/4, 1/2, 1/4
cam.cam_to_world = pyredner.gen_look_at_matrix(
    cam.position,
    cam.look_at,
    cam.up) @ pyredner.gen_scale_matrix(tf.ones(3) * 0.75)

# Setup the scene. We don't need lights.
scene = pyredner.Scene(cam, shapes, materials, [])
channels = [pyredner.channels.uv, pyredner.channels.depth, pyredner.channels.alpha]
res = pyredner.render_generic(scene=scene, channels=channels, max_bounces=0,
                              sampler_type=redner.SamplerType.independent,
                              num_samples=(2**14, 1))

# Output channels are packed in the order given in `channels`: uv (2), depth (1), alpha (1)
uv = res[:, :, 0:2]
depth = res[:, :, 2]
alpha = res[:, :, 3]
u = uv[:, :, 0]
v = uv[:, :, 1]
# "wrong" / unintuitive values
print(u)
print(v)
print(depth)

# correct alpha
print(alpha)

# more intuitive values (only valid where alpha != 0)
print(u / alpha)
print(v / alpha)
print(depth / alpha)

BachiLi commented 4 years ago

It gives the average. This is necessary for the result to be differentiable. You can use 1 sample per pixel if you don't want this behavior.
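For reference, a rough sketch of that on top of the example above (same scene and channels, just a single sample per pixel; the uv/depth at an edge pixel then come from whichever single sample was taken rather than a hit/miss blend):

# Sketch: 1 sample per pixel, so no averaging over hit and miss samples.
res_1spp = pyredner.render_generic(scene=scene, channels=channels, max_bounces=0,
                                   sampler_type=redner.SamplerType.independent,
                                   num_samples=(1, 1))
uv_1spp = res_1spp[:, :, 0:2]
depth_1spp = res_1spp[:, :, 2]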

tetterl commented 4 years ago

Thanks. So basically, if I want to get the 'correct' uv maps (or depth), I have to mask out all the (0, 0) values outside of the shape and divide the remaining pixels by alpha, as I do at the bottom of the example. If so, isn't this a differentiable operation?
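Roughly like this (a sketch using the tensors from the example above; pixels the shape doesn't touch at all are kept at zero):

# Sketch: divide the averaged channels by coverage and mask pixels
# with zero coverage (alpha == 0) to avoid dividing by zero.
hit = alpha > 0
safe_alpha = tf.where(hit, alpha, tf.ones_like(alpha))
u_norm = tf.where(hit, u / safe_alpha, tf.zeros_like(u))
v_norm = tf.where(hit, v / safe_alpha, tf.zeros_like(v))
depth_norm = tf.where(hit, depth / safe_alpha, tf.zeros_like(depth))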

BachiLi commented 4 years ago

It depends on which variable you differentiate with respect to. What you do still introduces discontinuities at the object boundaries w.r.t. object movement. Imagine your object moving from one pixel to another: the pixel it moves into will suddenly be populated with the object's uv values, creating a discontinuity.

tetterl commented 4 years ago

Ah I see, thanks for explaining! At the moment I'm only trying to learn the texture (static object/camera), so it's helpful to generate the uv coordinates in a pre-processing step and then reuse them in the training phase, avoiding the rendering cost by only sampling the texture at those uv coordinates.
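The lookup step I have in mind is roughly this (a sketch, not redner's actual texture code: it assumes a single (H, W, 3) texture tensor, the normalized uv map from above, nearest-neighbour filtering, and v = 0 at the top row; gradients flow into the texture values through the gather):

# Sketch: sample a (H, W, 3) texture at precomputed uv coordinates
# with nearest-neighbour lookup (redner itself uses mipmapped filtering).
def sample_texture(texture, u_norm, v_norm):
    h = tf.shape(texture)[0]
    w = tf.shape(texture)[1]
    x = tf.cast(tf.round(u_norm * tf.cast(w - 1, u_norm.dtype)), tf.int32)
    y = tf.cast(tf.round(v_norm * tf.cast(h - 1, v_norm.dtype)), tf.int32)
    return tf.gather_nd(texture, tf.stack([y, x], axis=-1))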

BachiLi commented 4 years ago

No problem. Feel free to propose a fix for this. My only suggestion is to keep the original behavior as the default and make the other behaviors opt-in through options.

tetterl commented 4 years ago

It might be possible to provide an option that allows for such outputs (if the relevant inputs don't require a gradient). But since we can't detect that with TensorFlow, this would lead to a rather inconsistent interface. I'll close this issue for the moment.