sampotter / python-embree

Apache License 2.0
20 stars 7 forks source link

NULL pointer error #7

Closed aluo-x closed 3 years ago

aluo-x commented 3 years ago
  embree.BufferType.Vertex, # buf_type
  File "embree.pyx", line 528, in embree.Geometry.set_new_buffer
  File "embree.pyx", line 453, in embree.array_from_ptr
  File "embree.pyx", line 449, in embree.typed_mv_from_ptr
ValueError: Cannot create cython.array from NULL pointer

It seems like after processing many files, embree encounters a null ptr error.

aluo-x commented 3 years ago

Tagging @sampotter This seems to occur exactly after 497 calls to device = embree.Device()

sampotter commented 3 years ago

Haha, weird! Thanks for pointing this out.

Could you please provide a minimal example I can use to reproduce this? Also, some basic information about what system you're running this on would be helpful (OS, version of Cython, version of Embree, etc.).

Probably the issue has to do with the way I'm using the Embree C API (the docs are good but can be a little vague).

sampotter commented 3 years ago

@aluo-x, in fact, I can guess what it is---Embree probably doesn't expect the user to create an infinite number of devices. A more common pattern is to create just a couple of devices and use them for everything. I would restructure your code a bit so that isn't creating tons and tons of devices.

Are your devices getting garbage collected? If they aren't, it may make sense to manually delete them, although this is a bit ugly.

At any rate, I've implemented a pretty minimal amount of error handling in the Cython wrapper. I imagine if I check the error codes being returned by Embree, the issue will become clear. But I will do that once you send a simple example I can test with.

aluo-x commented 3 years ago

Basic system info:

Linux 5.9.4-995.native x86_64 AMD 3960X processor Python 3.8.5 gcc 10.2.1 Cython 0.29.21 Working on providing a minimal example, will take a couple of minutes.

aluo-x commented 3 years ago

I actually did try manually deleting them. It was the first thought that popped into my head.

I've currently tried the following:

  1. Deleting scene, geometry, device at the end of a call
  2. Reimporting the entire object (via .reload)

For reusing the same device, do I need to modify this line in any way? https://github.com/sampotter/python-flux/blob/master/flux/shape.py#L67

aluo-x commented 3 years ago

Specifically in regards to commit, attach_geometry etc.

sampotter commented 3 years ago

Hmm, got it.

When I use that code it's pretty simple:

  1. Create an instance of TrimeshShapeModel (in my case, I will typically only have one (big) trimesh I'm interested in)
  2. Call e.g. check_vis_1_to_N or get_direct_irradiance many times

So, only one device is ever created

If you have a big scene full of many different objects, I would not recommend creating a different TrimeshShapeModel for each of them. I believe Embree has ways of handling a complicated scene while still only using one Embree device, but I haven't learned how to do this since it's outside of my use case. But Embree has many tutorials---it might be a good idea to check their tutorials and see if you can find one that roughly matches what you're trying to do and see how the API is used. If you then want to try to do it using python-embree and find that some of the functions aren't wrapped, I'll be happy to help you out getting what you need.

But in the mean time, please do send the minimal example for this issue so I can at least fix this bug. :-)

aluo-x commented 3 years ago

So my current approach is indeed to create a new TrimeshShapeModel for each object, however it is failing after the creation of 497 such objects. I currently delete each object after using it.

This behavior was surprising, since I initially believed embree to be stateless.

I'm currently working on a minimal example. Will upload in a couple of min :)

aluo-x commented 3 years ago
import numpy as np
import embree
verts = np.array([[-0.08710144, -0.07980939,  0.12013079],[-0.08710144, -0.07885062,  0.12013079],[-0.05561856, -0.07980939,  0.12751411]])
faces = np.array([[0,1,2]])
def intersect(V, F):
    device = embree.Device()
    geometry = device.make_geometry(embree.GeometryType.Triangle)
    scene = device.make_scene()
    scene.set_flags(4)
    vertex_buffer = geometry.set_new_buffer(
        embree.BufferType.Vertex,  # buf_type
        0,  # slot
        embree.Format.Float3,  # fmt
        3 * np.dtype('float32').itemsize,  # byte_stride
        V.shape[0],  # item_count
    )
    vertex_buffer[:] = V[:]
    index_buffer = geometry.set_new_buffer(
        embree.BufferType.Index,  # buf_type
        0,  # slot
        embree.Format.Uint3,  # fmt
        3 * np.dtype('uint32').itemsize,  # byte_stride,
        F.shape[0]
    )
    index_buffer[:] = F[:]
    geometry.commit()
    scene.attach_geometry(geometry)
    geometry.release()
    scene.commit()
    return True

for k in range(1000):
    intersect(verts, faces)

I should correct myself, it actually fails after 994 calls. I noticed my code was actually calling twice.

sampotter commented 3 years ago

Great, thanks.

I have to take off for the day but will fix this when I get back later.

aluo-x commented 3 years ago

Much appreciated!

aluo-x commented 3 years ago

So quick and dirty fix:

make device a global variable, reuse in function. After performing ray casting, release the scene.

I've processed a few thoursand objects, and the results look correct. Only downside is that I'm apprehensive about using this in a multiprocessing case, I suspect weird stuff could happen when multiple processes try to modify the same device object.

aluo-x commented 3 years ago

A more robust fix:

Tested to 4000~ objects with no errors. I don't currently have time to push a fix, but it is a two line addition to the pyx file:

def release(self):
        rtcReleaseDevice(self._device)

which I call after finishing with a TrimeshShapeModel instance.

sampotter commented 3 years ago

Oops, yep, that would be a good thing to wrap. :-) Thanks for catching that. I went ahead and made that change and pushed it.

aluo-x commented 3 years ago

Thanks! Closing this bug now.

aluo-x commented 3 years ago

Leaving a note for other people who may encounter this issue, when processing multiple meshes, you will need to keep track of both the scene & device object in python. After completing any necessary computation, you will need to call

self.scene.release() self.device.release()

This way allows you to reuse a scene & device when performing raycasting w/ multiple different ray instances, and avoids having to rebuild the acceleration structures each time (if you choose to create & destroy the device for every new ray instance, even if the geometry is static).

Edit: typo, missing words

sampotter commented 3 years ago

Thanks for adding your helpful note, @aluo-x. If you're interested, check out https://github.com/sampotter/python-embree/issues/8...