mitsuba-renderer / mitsuba3

Mitsuba 3: A Retargetable Forward and Inverse Renderer
https://www.mitsuba-renderer.org/

Misleading (erroneous?) documentation #869

Closed · tomas16 closed this issue 1 year ago

tomas16 commented 1 year ago

Summary

The example here uses code like:

N = 100
theta = dr.linspace(mi.Float, 0.0, dr.two_pi, N)

This results in theta == 0.0 and type(theta) == float. So theta is never a vector, even though that's clearly the intent of the example. As I understand it now, the type argument to linspace should be dr.scalar.ArrayXf.
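
For the record, here is a minimal sketch reproducing the difference (assuming both the scalar_rgb and llvm_ad_rgb variants are available in your build):

import drjit as dr
import mitsuba as mi

mi.set_variant('scalar_rgb')    # mi.Float is a plain Python float here
theta = dr.linspace(mi.Float, 0.0, dr.two_pi, 100)
print(type(theta), theta)       # <class 'float'> 0.0

mi.set_variant('llvm_ad_rgb')   # mi.Float is now a dynamically sized JIT array
theta = dr.linspace(mi.Float, 0.0, dr.two_pi, 100)
print(type(theta), dr.width(theta))  # <class 'drjit.llvm.ad.Float'> 100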

System configuration

N/A

Description

I'm starting to use mitsuba and drjit. I was trying to load a mesh using trimesh (it supports more file formats, including .stl) and couldn't figure out how to convert it to a Mitsuba mesh. In case others might find this useful, this works:

import trimesh
import mitsuba as mi
import numpy as np

# Load with trimesh, which handles .stl among many other formats
tmesh = trimesh.load(stlfile)

# Allocate an empty Mitsuba mesh with matching vertex/face counts
mesh = mi.Mesh("mesh_name",
               vertex_count=tmesh.vertices.shape[0],
               face_count=tmesh.faces.shape[0],
               has_vertex_normals=True)

# Fill the geometry buffers through the traversal mechanism
mesh_params = mi.traverse(mesh)
mesh_params['vertex_positions'] = np.asarray(tmesh.vertices).ravel()
mesh_params['faces'] = np.asarray(tmesh.faces).ravel()
mesh_params['vertex_normals'] = np.asarray(tmesh.vertex_normals).ravel()
mesh_params.update()  # propagate the new buffers to the mesh
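
In case it helps someone, I believe the converted mesh can then be dropped straight into a scene dictionary, since mi.load_dict accepts already-instantiated objects. A rough sketch (the integrator, emitter, and camera placement here are placeholders):

scene = mi.load_dict({
    'type': 'scene',
    'integrator': {'type': 'path'},
    'light': {'type': 'constant'},
    'sensor': {'type': 'perspective',
               'to_world': mi.ScalarTransform4f.look_at(origin=[0, 0, 5],
                                                        target=[0, 0, 0],
                                                        up=[0, 1, 0])},
    'shape': mesh,
})
img = mi.render(scene, spp=16)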

That documentation example led me to misunderstand the types present in mitsuba: it made me think I needed the LLVM variant to be able to use arrays like this, but that seems wrong.

I also thought Array3f would hold an Nx3 or 3xN array, i.e. arrays of triplets/vectors, but apparently that's wrong: it literally holds just 3 values.
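
A quick check of what I mean (sketch):

import drjit as dr
a = dr.scalar.Array3f(1.0, 2.0, 3.0)
print(len(a))   # 3: exactly three floats, not an Nx3 array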

It would be great if someone could clarify the basics of the type system a bit more (and update the example in the docs).

Steps to reproduce

N/A

tomas16 commented 1 year ago

To further illustrate the point, this was very counter-intuitive to me:

dr.linspace(dr.scalar.Array2f, 1, 10, 10)
Out[5]: [1.0, 10.0]
dr.linspace(dr.scalar.Array4f, 1, 10, 10)
Out[6]: [1.0, 4.0, 7.0, 10.0]
dr.linspace(dr.scalar.ArrayXf, 1, 10, 10)
Out[7]: [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
njroussel commented 1 year ago

Hi @tomas16

The notebook/guide you refer to uses the LLVM backend (mi.set_variant('llvm_ad_rgb')). From your examples, I imagine you're using the scalar_rgb variant. The former uses vectorized types: a single Float is actually an N-wide vector of floating point numbers. The latter uses scalar types: a single Float is a plain Python float. This at the very least explains the differences you've been encountering.

I'd strongly recommend looking at the Dr.Jit introduction documentation to get a better intuition for the difference between vectorization width and array length.

As you've pointed out, one can use the scalar.ArrayXf type for dynamically sized arrays when dealing with non-JIT types. As per its documentation, the linspace function can only handle one-dimensional arrays, hence it is only suited for scalar.ArrayXf or cuda/llvm.Float.
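
A small illustration of the width/length distinction (a sketch using the LLVM backend directly through Dr.Jit):

import drjit as dr
from drjit.llvm import Float, Array3f

x = dr.linspace(Float, 0, 1, 8)   # one dynamic dimension, width 8
print(dr.width(x))                # 8

v = dr.zeros(Array3f, 8)          # 3 named components, each of width 8
print(len(v), dr.width(v))        # 3 8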

tomas16 commented 1 year ago

Hi @njroussel, thanks for clearing some of that up. I think I got confused somewhere along the way.

I'm used to the numpy/matlab style of scientific programming. I guess that roughly maps to the llvm variant, so I should probably use that. I thought I'd start with the scalar version and switch to llvm later (I had some deployment issues related to using libLLVM).

A few more questions:

  1. Do I understand correctly that code written for one variant doesn't carry over when switching to a different variant? I imagine that maybe cuda and llvm are compatible, but you need to write fundamentally different code for the scalar variant.
  2. Say I had a bunch of numpy code, for instance some code that generates rays for each camera pixel, similar to what's in one of the examples. It would use functions like linspace and meshgrid. I think that translates directly into the llvm version. However, how would you implement it in the scalar variant? E.g. do you replace linspace with a for loop and essentially implement all your computations as scalar operations inside the loop body? Or do you use scalar.ArrayXf and do something more numpy-like? (See the sketch after this list.)
  3. I understand what you said about scalar.ArrayXf vs cuda/llvm.Float. The former is simply an array, and you do whatever you want with it. The latter is like having a scalar float, but for many parallel computations, so the values are stored in an array. However, what does llvm.ArrayXf represent semantically in that case? Is it an array of N floats, but parallel over M instances? Similar to how llvm.Array3f is effectively three 1D arrays of X, Y, and Z coordinates, each of length M?
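
To make question 2 concrete, here is roughly what I mean by the two options (an untested sketch; the resolution and camera geometry are made up for illustration):

import drjit as dr
import mitsuba as mi

# Option A: vectorized, numpy-style, with the llvm variant:
# one wide computation producing one ray per pixel
mi.set_variant('llvm_ad_rgb')
x = dr.linspace(mi.Float, -1, 1, 4)
y = dr.linspace(mi.Float, -1, 1, 3)
xx, yy = dr.meshgrid(x, y)                  # 12 entries each
d = dr.normalize(mi.Vector3f(xx, yy, 1.0))
rays = mi.Ray3f(o=mi.Point3f(0.0), d=d)     # 12 rays in one object

# Option B: scalar variant: an explicit Python loop, one ray at a time
mi.set_variant('scalar_rgb')
rays = []
for yv in (-1.0, 0.0, 1.0):
    for xv in (-1.0, -1/3, 1/3, 1.0):
        d = dr.normalize(mi.Vector3f(xv, yv, 1.0))
        rays.append(mi.Ray3f(o=mi.Point3f(0.0), d=d))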