daniilidis-group / neural_renderer

A PyTorch port of the Neural 3D Mesh Renderer

Expected format for textures? #13

Closed jcjohnson closed 6 years ago

jcjohnson commented 6 years ago

I'm having a hard time understanding the way that textures are represented inside the renderer.

It seems internally (e.g. from the texture optimization example) that textures are stored in Tensors of shape

(batch_size, num_faces, texture_size, texture_size, texture_size, 3)

which I understand as a 3D grid of size texture_size^3 holding RGB values for each face. How does this relate to the per-vertex UV texture coordinates that are stored in .obj files?

The CUDA kernel for loading textures appears to be bilinearly sampling the RGB values from the texture images, but I am having a hard time understanding exactly what is happening inside this kernel and exactly what it returns. Can you give a brief explanation?
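For readers unfamiliar with the term, bilinear sampling can be sketched as below. This is a generic illustration in plain Python, not the renderer's CUDA kernel; the function name `bilinear_sample` and the nested-list image layout are assumptions made for the example.

```python
import math


def bilinear_sample(image, x, y):
    """Sample an image at continuous pixel coordinates (x, y).

    `image` is a list of rows, each row a list of (r, g, b) tuples.
    Hypothetical helper for illustration only.
    """
    height, width = len(image), len(image[0])
    # Clamp so that the four neighbouring pixels always exist.
    x = min(max(x, 0.0), width - 1.0)
    y = min(max(y, 0.0), height - 1.0)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, width - 1), min(y0 + 1, height - 1)
    wx, wy = x - x0, y - y0

    def lerp(a, b, t):
        # Linear interpolation between two RGB tuples.
        return tuple((1.0 - t) * ca + t * cb for ca, cb in zip(a, b))

    top = lerp(image[y0][x0], image[y0][x1], wx)
    bottom = lerp(image[y1][x0], image[y1][x1], wx)
    return lerp(top, bottom, wy)


# A 2x2 image: black, red on the top row; green, yellow on the bottom row.
image = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 1.0, 0.0), (1.0, 1.0, 0.0)],
]
center = bilinear_sample(image, 0.5, 0.5)
print(center)
```

Sampling at the centre of the 2x2 image blends all four pixels equally; sampling exactly at an integer coordinate returns that pixel unchanged.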

nkolot commented 6 years ago

Hi Justin, my CUDA kernel is the same as the original CUDA code in the Chainer implementation by the authors. I am not an expert in computer graphics and I don't want to give you an incorrect answer, so I think it would be better to ask the original authors. Please let me know if you have any additional questions.

jcjohnson commented 6 years ago

Thanks! I'll ask over at the original repo.

jbohnslav commented 6 years ago

Did you ever figure this out? I looked on the issues page of the original implementation, and didn't find your question.

jcjohnson commented 6 years ago

@jbohnslav Yes, after looking at the code again I figured it out.

For each face, the texture tensor contains a discretized set of samples (in barycentric coordinates) of linear combinations of the colors for each vertex; during rasterization these sampled colors are re-sampled to generate the actual colors of faces. The texture_size controls the number of color pre-samples per face.

More concretely, suppose texture_size = T. Then textures has shape (N, F, T, T, T, 3), with the following semantics (up to a possible permutation of vertices):

textures[n, f, T - 1, 0, 0] is the RGB color of vertex 0 for face f
textures[n, f, 0, T - 1, 0] is the RGB color of vertex 1 for face f
textures[n, f, 0, 0, T - 1] is the RGB color of vertex 2 for face f

Other elements of the textures tensor give linear combinations of the colors at the vertices, so for example:

textures[n, f, 0, 0, 0] is 0
textures[n, f, T - 1, T - 1, T - 1] is the sum of the colors at the three vertices for f
textures[n, f, T / 3, T / 3, T / 3] is equal to color1 / 3 + color2 / 3 + color3 / 3 (with T % 3 == 0)
textures[n, f, T / 2, T / 2, 0] is equal to color1 / 2 + color2 / 2 (with T % 2 == 0)
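The layout described above can be sketched for a single face in plain Python. This is a minimal illustration, not the renderer's code: it assumes that entry [i][j][k] weights the three vertex colors by i / (T - 1), j / (T - 1) and k / (T - 1) with no normalization, which is one reading consistent with the corner entries listed above. The helper name `build_face_texture` is hypothetical.

```python
def build_face_texture(c0, c1, c2, T):
    """Return a T x T x T nested list of RGB tuples for one face.

    Assumption (not taken from the renderer): entry [i][j][k] equals
    (i * c0 + j * c1 + k * c2) / (T - 1), where c0, c1, c2 are the
    RGB colors at the three vertices of the face.
    """
    if T < 2:
        raise ValueError("T must be at least 2")
    return [
        [
            [
                tuple((i * a + j * b + k * c) / (T - 1)
                      for a, b, c in zip(c0, c1, c2))
                for k in range(T)
            ]
            for j in range(T)
        ]
        for i in range(T)
    ]


red, green, blue = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)
T = 4
tex = build_face_texture(red, green, blue, T)

print(tex[T - 1][0][0])          # entry for vertex 0
print(tex[0][T - 1][0])          # entry for vertex 1
print(tex[0][0][T - 1])          # entry for vertex 2
print(tex[0][0][0])              # origin of the grid
print(tex[T - 1][T - 1][T - 1])  # far corner of the grid
```

Under this assumption the corner and origin entries match the description exactly, while entries at indices such as T / 3 or T / 2 come out close to, but not exactly, the averages listed above. A full textures tensor of shape (N, F, T, T, T, 3) would hold one such grid per face and batch element.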

jbohnslav commented 6 years ago

Thanks so much!