EmbarkStudios / kajiya

💡 Experimental real-time global illumination renderer 🦀
Apache License 2.0

Add NSI C API #35

Closed. virtualritz closed this issue 2 years ago.

virtualritz commented 2 years ago

Is your feature request related to a problem? Please describe.

Sending 'live' data from a DCC app (Maya, Houdini, etc.) to kajia. NSI is an open, node-based C API for DCC apps to communicate with offline renderers. It was inspired by/supersedes the RenderMan API in the 3Delight renderer.

I would be interested in working on this/contributing/helping out. There is a Rust wrapper for NSI (that I maintain) but this is not relevant here (see below).

Describe the solution you'd like

Add the NSI hooks to populate/edit the scene graph of kajiya and expose these as a C binding.

This would allow sending data from DCCs to kajiya via existing OSS plug-ins that use this API (see below).
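To sketch the scope (hypothetical Rust-side stubs only; the exact types and signatures live in nsi.h/the NSI spec, so treat everything below as an assumption), the exported C surface could look along these lines:

```rust
use std::os::raw::{c_char, c_int, c_void};

/// Hypothetical opaque handle to a kajiya-backed NSI scene-edit session.
pub type NsiContext = u32;

#[no_mangle]
#[allow(non_snake_case)]
pub extern "C" fn NSIBegin(_nparams: c_int, _params: *const c_void) -> NsiContext {
    // Would allocate a session owning a handle -> scene-object map.
    0
}

#[no_mangle]
#[allow(non_snake_case)]
pub extern "C" fn NSICreate(
    _ctx: NsiContext,
    _handle: *const c_char,    // caller-chosen node handle, e.g. "robot_arm"
    _node_type: *const c_char, // "transform", "mesh", "attributes", ...
    _nparams: c_int,
    _params: *const c_void,
) {
    // Would instantiate the node and register it under the handle.
}

#[no_mangle]
#[allow(non_snake_case)]
pub extern "C" fn NSIConnect(
    _ctx: NsiContext,
    _from: *const c_char,
    _from_attr: *const c_char,
    _to: *const c_char,
    _to_attr: *const c_char,
    _nparams: c_int,
    _params: *const c_void,
) {
    // Would wire the node graph, e.g. mesh -> transform -> ".root".
}

// NSISetAttribute, NSIDelete, NSIRenderControl, NSIEnd, etc. would follow the
// same pattern; the typed parameter lists are elided from this sketch.
```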

Describe alternatives you've considered

There are no good alternatives. Something like USD, API wise, is a total PITA in comparison.

On a side note: as there is a USD Hydra delegate for NSI, adding NSI support to kajiya would also hook the renderer into USD via Hydra as a side effect.

Additional context

There are OSS plug-ins that use NSI for Houdini, Maya, Katana and Cinema4D. These would require minimal changes (mostly shader-related, as NSI is built around the idea that a renderer has OSL support) to work with a third-party NSI implementation.

h3r2tic commented 2 years ago

That is... an interesting proposition 😅 I can imagine how a geometry pipe could work, but the shader part is troubling. It sounds like that would require an intimate connection with OSL -- do you have any ideas what that could look like?

The current material model in kajiya is extremely basic, with a Lambertian diffuse + one-layer multi-scatter GGX for spec, and inputs covering exactly what those BRDFs need via hardcoded textures and multipliers. I have been thinking of extending it at some point to something along the lines of Autodesk's Standard Surface plus MaterialX, but OSL sounds a bit more scary with its flexible radiance closures. That being said, I have not used OSL in practice, so you'll probably have a better idea!

virtualritz commented 2 years ago

An OSL shading network is a set of shaders with inputs and outputs that can be linked together into a DAG, ending in a final node whose outputs then get attached to geometry. So you would e.g. have a (Disney) Principled or (Autodesk) Standard Surface shader node and some shader nodes that read and transform a texture or generate a procedural pattern, etc. I.e. your average node-based shader-building system, expressed in a spec.

From the NSI POV, the only strong idea that was 'imported' from OSL is that there are no dedicated light sources. There is just geometry with a light-emitting shader attached. This is the biggest disconnect NSI usually has from real-time renderers. Reading between the lines of your Medium blog post, kajiya seems to already take this approach, so it may not be an issue?

Otherwise, adding a shader is just an assignment of a shader node to a "surfaceshader" or "displacementshader" slot on certain nodes. I.e. the source node just has attributes referencing a shader file. And that could be HLSL or SPIR-V or whatever.

I was thinking it would be nice to have the ability to describe a shader DAG with NSI where the nodes are actually blobs of Rust shader code instead of OSL. There would be zero API changes needed on the NSI side.
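For illustration, a hypothetical edit sequence for such a setup. The `Ctx` stub below stands in for an NSI context (the C API or the Rust `nsi` crate); handles, file paths and the exact attribute/slot spellings are assumptions for this sketch:

```rust
// Stand-in for an NSI context; the real one would be the C API or the `nsi` crate.
struct Ctx;
impl Ctx {
    fn create(&self, _handle: &str, _node_type: &str) {}
    fn connect(&self, _from: &str, _from_attr: &str, _to: &str, _to_attr: &str) {}
    fn set_attribute(&self, _handle: &str, _attrs: &[(&str, &str)]) {}
}

fn main() {
    let ctx = Ctx;

    // Geometry hooked into the scene root via a transform.
    ctx.create("teapot", "mesh");
    ctx.create("teapot_xform", "transform");
    ctx.connect("teapot", "", "teapot_xform", "objects");
    ctx.connect("teapot_xform", "", ".root", "objects");

    // An attributes node carries the shader assignment for the geometry.
    ctx.create("teapot_attribs", "attributes");
    ctx.connect("teapot_attribs", "", "teapot", "geometryattributes");

    // The shader node merely references a shader file; nothing forces OSL here,
    // it could just as well point at HLSL, SPIR-V or Rust GPU code.
    ctx.create("surface", "shader");
    ctx.set_attribute("surface", &[("shaderfilename", "shaders/standard_surface.hlsl")]);
    ctx.connect("surface", "", "teapot_attribs", "surfaceshader");

    // A light would be the same mechanism: geometry with an emissive shader attached.
}
```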

virtualritz commented 2 years ago

In any case the NSI guidelines are a quick read to understand how this API works.

I also had an interesting discussion with a real time rendering engineer (@bazhenovc) on Discord, earlier this year, on how to map NSI outputlayer nodes to typical real time render passes (and their combination(s), to get final pixels). I will try to dig that out and get Baz to join this discussion.

bazhenovc commented 2 years ago

Oh hello, didn't expect to be invited to this discussion :)

NSI by itself is fairly straightforward and maps to render graphs relatively well, you've already mentioned that. Without OSL support it's going to be somewhat less useful.

Regarding OSL itself - I don't see any fundamental issues that would prevent supporting a certain subset of OSL. There are some things in the language/standard library that assume ray tracing (e.g. int raytype(string typename)) and things that don't really map to GPUs (string manipulation and printf()), but apart from those it should work? In the context of kajiya, assumed ray tracing is not an issue.

There are several implementations that convert OSL -> GLSL, for example pyosl. Converting compiled .oso bytecode to SPIR-V or DXBC is also possible, even preferable.

After that it can operate as a regular node-based material/material graph system.
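To make the "subset" idea concrete, a toy sketch (all names made up) of how an OSL front end could classify constructs by whether they lower to GPU code at all:

```rust
/// Hypothetical classification used by an OSL-subset front end.
enum OslConstruct {
    Closure,   // radiance closures: map onto the renderer's BRDF set
    RayType,   // raytype(...): constant-fold, since kajiya always ray traces
    StringOps, // string manipulation: no sensible GPU mapping
    Printf,    // printf(): host-side debugging only
}

fn lowers_to_gpu(c: &OslConstruct) -> Result<(), &'static str> {
    match c {
        OslConstruct::Closure | OslConstruct::RayType => Ok(()),
        OslConstruct::StringOps => Err("reject: strings don't map to GPUs"),
        OslConstruct::Printf => Err("strip or reject in a GPU build"),
    }
}
```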

That being said, supporting OSL is a huge chunk of work, so whether it's worth it or not is up to you or whoever will make a commitment to implement and keep supporting it :)

virtualritz commented 2 years ago

Nah, I do not think this is worth it TBH. The main reason for this request is gaining the ability to send geometry to the renderer from DCCs.

And for Houdini (and Solaris, too, soon) the NSI plugins already support rendering to the viewport.

I.e. you could have kajiya rendering the Houdini viewport, live. I find that a tad exciting. 🤭

h3r2tic commented 2 years ago

Thanks for all the info -- was fun reading through some of the references!

As a quick note about the light handling in kajiya -- it currently doesn't really have any. I've started work on explicit triangle emitters to be used in Next Event Estimation; right now it's just one directional light (sun), and emission without NEE -- meaning that small and bright emitters result in a lot of variance. I don't want to go the standard game renderer path of "point lights" with limited falloffs and so on, so the idea from OSL around emission closures sounds way more enticing. There would need to be heuristics to decide which emitters get included in NEE, as there would be some cost associated with that.

Getting back to the main topic though: supporting a good chunk of NSI and/or OSL sounds like a lot of work. There's a certain subset there which would be interesting, but some bits scare me 😅 Now, I haven't used OSL or NSI myself, so I could be miles off here, but here are some of my immediate scares:

As a more general thing, supporting general shader networks / material shaders is something of a self-punishment done by real-time rendering engineers. Not having per-material shaders means the renderer is very restricted, and probably doomed to forever remain a toy. Having them means an incredible amount of pain managing shader compile times and performance, juggling pipeline state objects, and enormous ray-tracing shaders (since those link-in all the shaders that could be hit by rays). Oh, and state sorting and complex material parameter plumbing and so on...

All of that being said, certain parts of NSI could be useful -- being able to directly manipulate the scene in a DCC package means instant feedback, and that one wouldn't need to edit RON/JSON files to move objects around 🙄

So I guess one question is whether it would make sense to support a restricted subset of NSI around scene manipulation.

In the longer term, I do want to investigate some sort of programmable surface shading too, but ideally in a way that doesn't make me regret past life choices 😅

virtualritz commented 2 years ago

The restricted subset is not API-related. You simply do not support nodes of type "shader". Or likely: they get replaced with something like a standard shader you can define globally.
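As a sketch of what that could mean on the kajiya side (all type names here are made up):

```rust
// Minimal stubs so the sketch compiles; kajiya's real scene types would differ.
struct SceneEdit;
impl SceneEdit {
    fn create(&mut self, _node_type: &str, _handle: &str) {}
    fn assign_default_material(&mut self, _handle: &str) {}
}

/// Restricted NSI subset: a few node types are handled, "shader" nodes are
/// swapped for a globally defined standard material, the rest is skipped.
fn create_node(scene: &mut SceneEdit, node_type: &str, handle: &str) {
    match node_type {
        "transform" | "mesh" | "attributes" => scene.create(node_type, handle),
        "shader" => scene.assign_default_material(handle),
        other => eprintln!("NSI node type {other:?} not supported yet"),
    }
}
```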

Output: I started a discussion on the NSI Slack about how to model the combination of passes (aka AOVs aka outputlayers) as a node graph. This is possible by just adding the respective slots to "outputlayer" nodes.

It eventually died, as all the people on that Slack are offline rendering/VFX folks. But there were also some interesting ideas in the discussion with @bazhenovc on Discord.

That is probably another interesting thing and a much lower hanging fruit than OSL. Again: why not replace OSL with e.g. Rust shaders or GLSL or whatever for the time being?

h3r2tic commented 2 years ago

Yup, for the time being, replacing OSL with Rust or HLSL would be perfectly fine.

Regarding output / AOVs / layers, what are some use cases on your mind?

bazhenovc commented 2 years ago

A good use case for AOVs is baking out various expensive simulation or rendering parameters; here's an example with baked volume transmittance (2 textures, 6 directions, 1 direction per channel) that implements fake, fast and convincing cloud rendering:

https://twitter.com/Vuthric/status/1286796950214307840


h3r2tic commented 2 years ago

Thanks, @bazhenovc! That's a pretty awesome use case for an offline renderer -- but I wonder how that would be useful for a real-time one 👀

Some of this is just multi-viewport rendering (so quite feasible), but having each viewport be affected by different lights (for example) might complicate the scene graph a bit.

bazhenovc commented 2 years ago

That's a pretty awesome use case for an offline renderer -- but I wonder how that would be useful for a real-time one 👀

I thought the whole discussion was about the data flowing from DCC to the real time renderer :) NSI can enable the tech art to get whatever data they want directly from any offline renderer that supports it without having to write a dedicated plugin.

The other way around is not particularly useful, maybe for cross-validation with multiple different renderers? For instance export just the specular lighting from the real time renderer and compare it to what Houdini/Blender/etc produces.

bazhenovc commented 2 years ago

The other way around is not particularly useful, maybe for cross-validation with multiple different renderers? For instance export just the specular lighting from the real time renderer and compare it to what Houdini/Blender/etc produces.

It is not particularly useful for the DCC itself, but having custom AOVs can open up more options in the real-time renderer. AOVs are essentially just regular custom render passes that generate arbitrary data.

My memory is a bit hazy at this point, but I think what we discussed was more about having a "generalized data-driven render pass system" that can be expressed via NSI, an in-engine API and maybe some sort of visual editor, like render graphs on steroids.

BRDF LUTs for IBL, which are usually generated with a compute shader, could potentially be expressed with NSI/this render graph system as a pass that executes once on startup.
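For example, a purely hypothetical pass description, just to illustrate the "executes once on startup" part:

```rust
/// Hypothetical data-driven pass description: a compute pass that bakes a
/// BRDF LUT once at startup, declared the same way as any other pass.
struct BakedPass {
    name: &'static str,
    shader: &'static str,   // compute shader that fills the target
    resolution: (u32, u32),
    format: Format,
    run_once: bool,         // executed at startup, then only sampled
}

enum Format {
    Rg16Float,
}

fn brdf_lut_pass() -> BakedPass {
    BakedPass {
        name: "brdf_lut",
        shader: "shaders/bake_brdf_lut.hlsl", // made-up path
        resolution: (512, 512),
        format: Format::Rg16Float,
        run_once: true,
    }
}
```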

Tech art can implement these passes to generate some fancy data they might need; for example, a procedural road system could be implemented that blends a road material on top of terrain along a curve driven by gameplay logic.

Things like that are usually a dedicated low-level system that constantly needs maintenance and engineering time, but could have been a data driven system maintained by tech art.

Also worth noting that runtime performance should still be fine - the actual GPU workload will be the same in both cases, since both systems express the same algorithm and generate similar command buffers; we would just have a small overhead similar to what regular render graphs usually cost.

Niagara in UE already has something like this - you can set up custom render targets and passes without leaving the editor and render arbitrary stuff there, including but not limited to the scene from a different camera. It's limited to particles, but it already looks impressive.

virtualritz commented 2 years ago

It seems there are three topics here:

  1. Scene graph description via NSI for geometry in kajiya (what I had in mind when opening this ticket).
  2. Mapping something like an OSL node graph to something a GPU can understand.
  3. Mapping render passes (real-time) and how those get combined into AOVs/outputlayers (offline/NSI).

Maybe we limit this issue to the hurdles for 1. for now? 2. and 3. could be discussed in their own issues?

aghiles commented 2 years ago

@h3r2tic, I agree that some of the features of NSI can be skipped at least for a first implementation. The inter object visibility needs some intense work and I am actually thinking about releasing the algorithm for how to do it efficiently (in O(1) in most cases).

That being said, I highly recommend using NSI. NSI will put a fundamentally sound foundation under your rendering core. If you decide not to, one piece of advice I can give you: use a hierarchical structure for your geo, not a flat structure.

If you decide to use NSI, we will be available to help you on our Slack channel.

Good luck.

h3r2tic commented 2 years ago

@virtualritz: I'm perfectly fine with limiting this to 1. -- I didn't understand the scope of what you were proposing here 😅

@aghiles: Thanks! Now before you run off (it seems like you want to run off 👀), may I ask: what kind of hierarchy do you mean? One can have transform hierarchies, spatial clustering for culling, instance-based grouping (for batching), chunks for streaming and large world management, ... Then the geo itself can be chunked up into meshlets for culling, and so on.

aghiles commented 2 years ago

I wish I could run, but it is too late now! Haha. So, NSI is disconnected from hardware considerations and from algorithmic considerations. It is a more high-level description. So what I meant is the transform hierarchy. Your renderer will have to build the spatial hierarchy (BVH) as needed, and that one is disconnected from the scene description semantics.

virtualritz commented 2 years ago

@aghiles, are you planning to make your NSI stream parser OSS soonish?

h3r2tic commented 2 years ago

@aghiles Please correct me if I'm missing something, but I don't believe the renderer needs a transform hierarchy -- that would be up to something using the renderer. A BVH is indeed needed for ray-tracing, but that's managed by Vulkan and graphics drivers. kajiya specifically aims to be just a renderer without any application logic. NSI support would also be in a layer/crate on top, not in the core.

virtualritz commented 2 years ago

@h3r2tic

[...] I don't believe the renderer needs a transform hierarchy [...]

The renderer does not need one. But the layer implementing NSI needs one as NSI allows editing a scene.

aghiles commented 2 years ago

@h3r2tic actually the renderer needs one! :) This is a common misconception. Let me give you an example: you have a robot with 1 million transforms under one main transform. In a live rendering session (IPR), the user moves the upper transform. What happens in a renderer with a flat hierarchy? You will have to set 1 million matrices, right?

Flat is easy and simple and renderer programmers tend to convince themselves that it is exactly what they need ! :)

My advice is to go for the implementation you find easier for you in the current context, but keep what I said in mind.

h3r2tic commented 2 years ago

Hah, that's... a rather extreme case :P If the million transforms are for bones, those could still be encoded in parent-relative space. If the million transforms are for individual objects, then updating matrices would be the least of my worries xD In either case, building/refitting BVHs for ray tracing would easily kill performance way earlier -- and those are rigid 2-level structures in Vulkan/DXR.

Flat is easy and simple and renderer programmers tend to convince themselves that it is exactly what they need ! :)

There's also plenty of evidence to the contrary -- contrived scene graphs from which it's impossible to get performance. In the end it's always about profiling and designing for one use case vs. another. At Frostbite, we had to get rid of a hierarchy in occlusion culling, for example, as the structure was interfering with SIMD and cache-efficient processing.

aghiles commented 2 years ago

This is just one example. In my line of work, people work with hierarchies and they will do things with hierarchies. For example, light linking, inter object visibility, deleting, adding objects etc. All those things are encoded in hierarchies.

I am not sure what you mean by "contrived" hierarchies, but if you have trouble extracting performance from these data structures then yes, I would recommend not going there.

P.S. Of course, 20 years ago everything was flat in 3Delight so I have good experience with both implementations ( it was also SIMD).

hrydgard commented 2 years ago

Hierarchies are just fine, but the low-level rendering layer doesn't need to know about them, and you should probably treat kajiya as one. You can build your own hierarchies on top as much as you want and just bake them out every frame to a linear bunch of lists, which is not likely to be your bottleneck. There's no performance need for the renderer to know about parent transforms, etc.

aghiles commented 2 years ago

Yes, the low-level renderer really needs to know about them. Exactly like a shoemaker needs to know that there is a left leg and a right leg. But I don't think we will agree on that :)

h3r2tic commented 2 years ago

This is just one example. In my line of work, people work with hierarchies and they will do things with hierarchies. For example, light linking, inter object visibility, deleting, adding objects etc. All those things are encoded in hierarchies.

And I appreciate those use cases, and recognize them as perfectly valid! However, most of those, especially light linking and inter-object visibility, are not things that this renderer is going to pursue (for the light transport reasons I mentioned earlier in the thread). That would all be important for using the renderer as a plugin in a DCC app (or for offline rendering), but kajiya is aiming more in the general direction of video games and consistent real-time frame rates.

I am not sure what you mean by "contrived" hierarchies, but if you have trouble extracting performance from these data structures then yes, I would recommend not going there.

I was thinking scene graphs like Open Scene Graph xD

  1. dump everything into the same structure
  2. waste tons of cycles filtering stuff out of the structure

(Not implying that you're suggesting such a thing)

P.S. Of course, 20 years ago everything was flat in 3Delight so I have good experience with both implementations ( it was also SIMD).

Hah, I'll be perfectly happy to revisit the scene structure in 20 years :P FWIW, scene management in kajiya will most likely move entirely to the GPU at some point, and that will have its own requirements and tradeoffs. Too early to tell what's a good approach right now.

virtualritz commented 2 years ago

I think the whole hierarchy stuff belongs in an NSI implementation layer.

And there should be low-cost ways to get a flat representation for a rendering backend like kajiya, kept up to date by that layer.

I.e. a first step is probably a reference implementation of that NSI layer in a separate crate. I'm interested in this, I just need to carve out time.
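A very rough sketch of the core of such a crate (hypothetical types, glam for the math): the hierarchy and dirty tracking live in the NSI layer, and kajiya only ever sees the flattened world matrices.

```rust
use glam::Mat4;

struct Node {
    parent: Option<usize>, // None for roots; parents are stored before children
    local: Mat4,           // edited via NSI SetAttribute on a "transform" node
    world: Mat4,           // derived, cached
    dirty: bool,
}

struct NsiScene {
    nodes: Vec<Node>,
}

impl NsiScene {
    /// Re-derive world matrices and return the flat list the renderer consumes.
    /// Moving one root transform only touches its subtree, so the "set a
    /// million matrices" scenario stays inside this layer.
    fn flatten(&mut self) -> Vec<Mat4> {
        for i in 0..self.nodes.len() {
            let parent = self.nodes[i]
                .parent
                .map(|p| (self.nodes[p].world, self.nodes[p].dirty));
            let (parent_world, parent_dirty) = parent.unwrap_or((Mat4::IDENTITY, false));
            let node = &mut self.nodes[i];
            if node.dirty || parent_dirty {
                node.world = parent_world * node.local;
                node.dirty = true; // keep propagating to children in this pass
            }
        }
        let worlds = self.nodes.iter().map(|n| n.world).collect();
        for n in &mut self.nodes {
            n.dirty = false;
        }
        worlds
    }
}
```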

virtualritz commented 2 years ago

[...] but kajiya is aiming more in the general direction of video games and consistent real-time frame rates.

The three main use cases I had in mind when I opened this ticket:

  1. Previewing assets from a DCC app (not least in the viewport) in the target renderer for a game.

  2. Creating cut-scenes in a DCC that are to be rendered in real time, using the game's engine.

  3. Playblasts for VFX production for animation/blocking/shot composition that already include lights, textures and reflections.

Missing refractions are not such an issue, even for 3., imho. Shots where a refraction becomes part of the composition, e.g. stuff seen through a magnifying glass or the like, are rare and seldom done in comp anyway.

virtualritz commented 2 years ago

The fourth use case is simply testing the renderer itself. I.e. being able to throw all kinds of stuff at it quickly w/o writing/integrating tons of parsers for different 3D asset formats.

For example, generating a ton of scene complexity is child's play in a DCC like Houdini.