Mine-525 commented 4 years ago

Hi, I tested custom integrator in python and encountered a strange behavior of ray_intersect().

System configuration

Windows10
latest commit (5d40a40b94fcad283d7a3fe0353ff5fad0d59ba0)
I used 3 variants, scalar_rgb, packet_rbg and gpu_rgb

problem description

I tested a function ray_intersect() in a simple scene including one box and floor. In a attached scripts, rays should intersect at top of the box. test_si.zip

With scalar_rgb, while the vertical ray intersects at top of the box properly, the slanted ray doesn't intersect with the box and does with floor. gpu_rgb works well about both rays. With packet_rgb, executing ray_intersect() causes python crashing in attached script even though depth_integrator from sample codes works well with packet_rgb.

In the page "Custom plugins in Python" from the documentation of Mitsuba2, it is said that Only gpu_ variants will produce reasonable performance when using such Python implementations of core system component. Is this means that not only overriding plugins, but also existing Mitsuba's methods, e.g. rayintersect(), works well only with `gpu` variants in python?

tomasiser commented 4 years ago

Hi, I also noticed strange behavior of scene.ray_intersect in Python with packet_rgb. Generally speaking, I am getting nonsense intersection data, and they are not even deterministic even though the code I wrote was deterministic. I suspect there is uninitialized memory or overflows happening somewhere, which causes the intersections to give nonsense non-deterministic results.

This week, I will try to write the simplest possible Python script for reproducing what I saw, unless someone else is already working on fixing the issue.

Speierers commented 4 years ago

Hi,

This week, I will try to write the simplest possible Python script for reproducing what I saw, unless someone else is already working on fixing the issue.

Please do, it would be really helpful to debug this issue!

Thanks!

tomasiser commented 4 years ago

I think I managed to find out what exactly causes the bug. :)

I am actually not sure if the bug I am seeing is the same that @Mine-525 reported, but looking at their code sample, I think it could be exactly the same issue.

Below I attached a simple Python script that I used in a Jupyter notebook. It is using a custom hard-coded integrator in Python, which does the following:

Shoot rays from the camera sensor (sensor.sample_ray_differential) to the scene (which contains a cube).
1. Detect interaction with scene.ray_intersect.
2. Rays that miss the scene get a cyan color.
3. Rays that hit the scene geometry get a yellow color.
Shoot new rays from the first intersection using interaction.spawn_ray.
1. Detect interaction with scene.ray_intersect.
2. Rays that hit the scene geometry again get a red color.

Now, ideally, because a cube is a closed mesh, all pixels should be either cyan (no cube) or red (hit cube twice):

However, if you run the script a couple of times, again and again, you notice that sometimes you get very weird results:

While debugging, I noticed that the first scene.ray_intersect called with rays from sensor.sample_ray_differential works fine. The cube is never cyan, and the sky is never red or yellow, which means the first intersections are detected fine.

So the bug only happens when scene.ray_intersect is called with rays generated by interaction.spawn_ray. So I started investigating the rays and noticed that the maxt of these rays is just a 1-element array [inf], but everything else like o, d, mint, and time are arrays that has 300*300=90000 elements (number of pixels):

So I tried rewriting a part of the script:

# Advance the rays to the first surface interaction:
rays = interaction.spawn_ray(rays.d)
rays.maxt = ek.zero(Float, 300*300)
rays.maxt += float("inf")

so that maxt is a proper Enoki array of the correct dimension:

and that solved the bug! So I believe that the scene intersection code was touching invalid memory, which resulted in the weird patterns. Similarly, in the script of @Mine-525, they generate their custom rays without any mint or maxt explicitly defined, which I believe could cause the crashes.

The code for Interaction::spawn_ray in Mitsuba 2 looks like this:

/// Spawn a semi-infinite ray towards the given direction
Ray3f spawn_ray(const Vector3f &d) const {
    return Ray3f(p, d, (1.f + hmax(abs(p))) * math::RayEpsilon<Float>,
                 math::Infinity<Float>, time, wavelengths);
}

I think the problem is the math::Infinity<Float> not being a properly sized Enoki array.

Attached files:

Click to expand the whole Python script:

```python import os import enoki as ek import numpy as np import mitsuba mitsuba.set_variant('packet_rgb') from mitsuba.core import Thread, ScalarTransform4f from mitsuba.core.xml import load_file, load_dict, load_string from mitsuba.python.util import traverse from mitsuba.core import Float, UInt32, UInt64, Vector2f, Vector3f from mitsuba.core import Bitmap, Struct, Thread from mitsuba.core.xml import load_file, load_dict from mitsuba.render import ImageBlock def build_scene(): camera_target = np.array([0,0,0]) camera_pos = camera_target + [7,4,7] film = load_dict({ "type": "hdrfilm", "rfilter": {"type": "box"}, "width": int(300), "height": int(300), "pixel_format": "rgb" }) sampler = load_dict({ "type": "independent", "sample_count": 1 }) sensor = load_dict({ "type": "perspective", "near_clip": 1, "far_clip": 1000, "to_world": ScalarTransform4f.look_at(target=camera_target, origin=camera_pos, up =[0.0, 1.0, 0.0]), "fov": 60, "myfilm": film, "mysampler": sampler }) slab = load_dict({ "type": "obj", "filename": "unit_cube_centered.obj", "bsdf": {"type": "null"}, "to_world": ScalarTransform4f.scale([4,4,4]) }) scene = load_dict({ "type": "scene", "mysensor": sensor, "slab": slab }) return scene scene = build_scene() # Instead of calling the scene's integrator, we build our own small integrator # This integrator simply computes the depth values per pixel sensor = scene.sensors()[0] film = sensor.film() film_size = film.crop_size() spp = 1 # Seed the sampler total_sample_count = ek.hprod(film_size) * spp # Enumerate discrete sample & pixel indices, and uniformly sample # positions within each pixel. pos = ek.arange(UInt32, total_sample_count) pos //= spp scale = Vector2f(1.0 / film_size[0], 1.0 / film_size[1]) pos = Vector2f(Float(pos % int(film_size[0])), Float(pos // int(film_size[0]))) pos += Vector2f(0.5,0.5) # Sample rays starting from the camera sensor rays, weights = sensor.sample_ray_differential( time=0, sample1=0.5, sample2=pos * scale, sample3=0 ) # RGB color of the samples, initialized by white (1.0,1.0,1.0) result = ek.zero(Vector3f, total_sample_count) result += 1.0 # Intersect rays with the scene geometry interaction = scene.ray_intersect(rays) active = interaction.is_valid() # Samples that are inactive from the beginning did not hit the scene geometry at all: result[~active] = Vector3f(0.0,1.0,1.0) # cyan # Advance the rays to the first surface interaction: rays = interaction.spawn_ray(rays.d) #rays.maxt = ek.zero(Float, 300*300) #rays.maxt += float("inf") # Rays that hit the scene geometry for the first time get a yellow color: result[active] = Vector3f(1.0,1.0,0.0) # Intersect the advanced rays with the scene geometry for the 2nd time: interaction = scene.ray_intersect(rays) active &= interaction.is_valid() # Rays that hit the scene geometry for the second time get a red color: result[active] = Vector3f(1.0,0.0,0.0) block = ImageBlock( film.crop_size(), channel_count=5, filter=film.reconstruction_filter(), border=False ) block.clear() # ImageBlock expects RGB values (Array of size (n, 3)) block.put(pos, rays.wavelengths, result, 1) # Write out the result from the ImageBlock # Internally, ImageBlock stores values in XYZAW format # (color XYZ, alpha value A and weight W) xyzaw_np = np.array(block.data()).reshape([film_size[1], film_size[0], 5]) # Create a Bitmap and show it using matplotlib: bmp = Bitmap(xyzaw_np, Bitmap.PixelFormat.XYZAW) bmp = bmp.convert(Bitmap.PixelFormat.RGB, Struct.Type.UInt8, srgb_gamma=True) image_np = np.array(bmp) from matplotlib import pyplot as plt plt.figure(figsize=(8,8)) plt.imshow(image_np) plt.show() ```

Click to expand unit_cube_centered.obj:

```obj # Blender v2.82 (sub 7) OBJ File: '' # www.blender.org # mtllib unit_cube_centered.mtl o Unit_Cube_Centered v -0.500000 0.500000 -0.500000 v 0.500000 0.500000 0.500000 v 0.500000 0.500000 -0.500000 v -0.500000 0.500000 0.500000 v -0.500000 -0.500000 0.500000 v 0.500000 -0.500000 -0.500000 v 0.500000 -0.500000 0.500000 v -0.500000 -0.500000 -0.500000 vn 0.0000 1.0000 0.0000 vn 0.0000 -1.0000 0.0000 vn 0.0000 0.0000 1.0000 vn 1.0000 0.0000 0.0000 vn 0.0000 0.0000 -1.0000 vn -1.0000 0.0000 0.0000 # usemtl None s off f 1//1 2//1 3//1 f 2//1 1//1 4//1 f 5//2 6//2 7//2 f 6//2 5//2 8//2 f 5//3 2//3 4//3 f 2//3 5//3 7//3 f 2//4 6//4 3//4 f 6//4 2//4 7//4 f 6//5 1//5 3//5 f 1//5 6//5 8//5 f 5//6 1//6 8//6 f 1//6 5//6 4//6 ```

Mine-525 commented 4 years ago

Hi, @tomasiser. Thank you for your great additional investigation!

I can solve this problem by your detailed report, and I'm glad that the great report comes from my little one!

tomasiser commented 4 years ago

Great, glad it helped! :)

I also want to point out that similar problems can easily occur with other structs, e.g., Interaction. For example, if you initialize a custom interaction like this:

i = mitsuba.render.Interaction3f()
i.t = some_distances
i.p = some_positions

this will give you invalid interactions, because of missing i.time, which, again, needs to be a properly sized Enoki array. Same probably goes for i.wavelengths, although that is not used for trivial scene intersections.

Small mistakes like these do not throw any exceptions and can often silently crash the Python kernel.

Anyway, I believe the Interaction::spawn_ray implementation is bugged in Mitsuba 2 itself.

Speierers commented 4 years ago

Hi @tomasiser,

Thanks a lot for this very detailed investigation!

I think the implematation of Interaction::spawn_ray is fine as enoki knows how to deal with operation involving dynamic arrays and a dynamic array of size one.

Although the problem arises when working with those arrays outside of the enoki context, for instance accessing data directly via the array data pointer. This happens in few places accross the codebase, and apparently we don't handle this case properly everywhere.

See for instance this where the ray struct is "resized" before we pass it's data pointer to Optix.

Are you using embree here? Or the mitsuba kd-tree?

tomasiser commented 4 years ago

Are you using embree here? Or the mitsuba kd-tree?

I am not using embree. I also tried ray_intersect_naive with the same result.

If it works with OptiX, then perhaps the bug is in the kd-tree itself?

Speierers commented 4 years ago

When using ray_intersect_naive(), the kd-tree isn't even used. This sounds more like an enoki bug at this point.

The current refactoring involves a complete redesign of the packet_* modes and this bug will most likely go away. If you are fine with the workaround you mentioned above, I would wait for the refactored code to be merged.

How does that sound to you?

tomasiser commented 4 years ago

How does that sound to you?

Sounds fine! :-) We should remember to check the bug again after the refactoring. Looking forward!

mitsuba-renderer / mitsuba2

ray_intersect() behaves strange in Python #206

System configuration

problem description

Attached files: