feature request: launch3D

LivelyLiz commented 2 years ago

As seen in the documentation on launches, OptiX supports a 3-dimensional launch, but all I could find in owl is launch2D. Is there a specific reason owl is build on 2D launches instead of using all 3 dimensions offered by OptiX?

ingowald commented 2 years ago

Only reason is that I had neither a use case nor anything to test it, and didn't like the idea of throwing in code that I couldn't even run some simple test for. If you do have a use case for it I'd be happy to add; it's easy to do.

ingowald commented 2 years ago

Just for own sanity I just had a brief look: on the host side the changes are literally trivial: in owl/RayGen.cpp the actual launch (in RayGen::launchAsyncOnDevice) is currently taking a vec2i for the dimensoins, and then turning that into a "dims.x,dims.y,1" when calling optixLaunch - so as far as I can tell all that would need to be done is change that "vec2i dims" to a vec3i (including where it gets called), and move this 2D->3D extension to wherever it gets called.

The only (minor) issue I could see is if one also wanted to change the device side helper functions: right now owl::getLaunchIdx() returns a vec3i, and since there's no automatic demotion form vec3i to vec2i that'd break all kind of existing code if we changed that. Suggestion would be to instead add a owl::getLaunchIdx3D for the 3D case, and leave the common case as is. (or not worry about the device side at all, since device code can still just call optixGetLaunchIndex(), which returns a dim3, anyway).

LivelyLiz commented 2 years ago

I currently have a use case for 3D launch, that's the reason I would love to see it added in the first place. I have multiple cameras I would like to get a depth image from. So it would be nice to have the launch index in the form of (pixel_x, pixel_y, camera_id). With the 2D launch I have to do (pixel_id, camera_id) and calculate the pixel's x and y coordinate from the id. Of course it's not impossible to do so, but a 3D launch would make more sense here and is more convenient.

ingowald commented 2 years ago

Ah, I see; easy enough to add, under the assumption that you'll be the guinea-pig to then test it and report whatever bugs I'll introduce :-).

In the meantime: another way of doing the same thing would be the following: a) add your cameraID into the launch parameters b) change your one launch into N launches, each one being async and setting a new cameraID: for (int cameraID = 0 ... N) { owlParamsSet1i(myLPs,"cameraID",cameraID); owlAsyncLaunch2D(rg,size.x,size.y,myLPS); } owlLaunchSync(myLP); Since those launches are all async there won't be any 'cudaDeviceSync's in between them, and all the building and uploading of the launch paramters etc should run asynchronously to the GPU already running the first launch. Thus, the GPU should remain reasoably busy - the launches will still all go into the same cudaThread (they use the same LPs), but should be pretty decent. If you're going for even more high-perf you could also create two (or three) different launch params, have them all contain the same data, and do the above loop but cycle through those two LPs - in that case they'd even be in different cuda threads, so different launches might even run in parallel at the same time.

But - can also just implmenet launch3D :-)

ingowald commented 2 years ago

see branch iw/launch3D , this one does have 3D variants of all launch calls.

Not unexpectedly I spent more time adding a sample/test case than what it took to add the functionality itself, but that's now working, too: samples/cmdline/s10-launch3D takes the RTOW example (s05), and changes the original raygen's for loop over multiple samples per pixel into a 3D launch, where the 3'rd launch index is the sampleID. This sample also now uses CUDA natively for converting the frame buffer from float3 to rbga8. Let me know if this branch works ofr you; if it does I'll merge into devel.

LivelyLiz commented 2 years ago

Sorry it took so long but I finally came around to test the branch for my use case. Works like a charm :) However, the more challenging part was to change my project to use the new modern Cmake structure. I'll make a separate issue with what I had to change to make it work for my case which wasn't mentioned in the Readme. Edit: see #136

owl-project / owl

feature request: launch3D #123