DX11 Sender
Same as the current version in terms of graphics API calls.
[C#] Create temporary RT
[C#] Blit source texture to temporary RT
[C++] CopyResource temporary RT into Spout's DX11 shared texture (this happens on the native side now)
DX11 Receiver
Now performs an extra CopyResource operation that the current version doesn't do.
To get rid of it, the DX11 and DX12 code paths would need to be split on the C# side, which I decided against for simplicity's sake. It's still an option, though.
[C#] Create temporary RT
[C++] CopyResource Spout's DX11 shared texture into temporary RT
[C#] Blit temporary RT into
DX12 Sender & Receiver
Same as the DX11 code, with a few differences:
Cannot give Unity a pointer to Spout's DX11 shared texture because it's not a DX12 texture. This is why the texture pointers are now instead shared in the other direction, from Unity into the native side.
DX12 textures passed into the native side need a DX11 wrapper (created through the D3D11on12 feature) so that they can be copied to/from Spout's DX11 shared texture.
That wrapper needs to be created/destroyed according to its DX12 texture's lifetime.
That wrapper needs to be acquired before, and released after, each use of the texture in the DX11 context.
Temporary Render Targets
To support temporary RTs, the PR does two things:
Maintains an eviction cache to approximate the lifetime of the textures, since in the case of temporary RTs we don't know their lifetimes.
Uses IssuePluginCustomBlit to pass textures from Unity to the plugin so that the temporary RT native texture pointers can be resolved. We can't obtain the native texture pointers of CommandBuffer temporary RTs on the C# side.
If support of temporary RTs in DX12 is deemed unnecessary, these two things can be removed, simplifying the plugin somewhat. But this option would also have a gotcha (can elaborate).
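The acquire/release requirement on the DX11 wrapper can be sketched as an RAII guard. This is a minimal, portable sketch and not the plugin's actual code: `WrappedTexture`, `AcquireGuard` and `copy_to_shared` are hypothetical names, and the acquire/release/copy bodies only record events where the real plugin would call into D3D11on12 and the DX11 context.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a DX11 wrapper around a DX12 texture.
// The real plugin would forward these calls to the D3D11on12 device;
// here they only record what happened, so the sketch stays portable.
struct WrappedTexture {
    bool acquired = false;
    std::vector<std::string> log;

    void acquire() {
        if (acquired) throw std::logic_error("wrapper already acquired");
        acquired = true;
        log.push_back("acquire");
    }

    void release() {
        acquired = false;
        log.push_back("release");
    }
};

// Acquires the wrapper on construction and releases it on destruction,
// so the release also happens on early return or exception.
class AcquireGuard {
public:
    explicit AcquireGuard(WrappedTexture& texture) : texture_(texture) {
        texture_.acquire();
    }
    ~AcquireGuard() { texture_.release(); }

    AcquireGuard(const AcquireGuard&) = delete;
    AcquireGuard& operator=(const AcquireGuard&) = delete;

private:
    WrappedTexture& texture_;
};

// Hypothetical copy step; stands in for CopyResource on the DX11 context.
void copy_to_shared(WrappedTexture& source) {
    AcquireGuard guard(source);
    source.log.push_back("copy");
}
```

Calling `copy_to_shared` on a wrapper leaves it released afterwards, with the copy recorded between the acquire and the release.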
Eviction Cache
If Unity releases a texture before our cache evicts it, Unity logs a scary warning (only in the log file, not in the console) because the texture's ref count is still not zero:
d3d12: releasing a resource which is still being referenced. It will be leaked.
That's fine - our cache will evict and fully release the texture moments later.
The cache could be made more robust to handle these cases (e.g. by subscribing to play/stop/sceneLoaded events and evicting all of its contents).
The warning message would also go away if DX12 temporary RT support is removed from the PR.
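For illustration, an eviction cache of this kind could look roughly like the sketch below. The eviction policy is an assumption (entries idle for more than a fixed number of frames get evicted); the PR text doesn't specify the real policy. `EvictionCache`, `end_frame` and `clear` are hypothetical names, and the `Wrapper` type stands in for the DX11 wrapper of a DX12 texture.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <unordered_map>
#include <utility>

// Hypothetical frame-based eviction cache. Keys are native texture pointers
// received from Unity; values stand in for the DX11 wrappers.
template <typename Wrapper>
class EvictionCache {
public:
    using Factory = std::function<Wrapper(const void*)>;
    using Deleter = std::function<void(Wrapper&)>;

    EvictionCache(std::uint64_t max_idle_frames, Factory make, Deleter destroy)
        : max_idle_(max_idle_frames),
          make_(std::move(make)),
          destroy_(std::move(destroy)) {}

    ~EvictionCache() { clear(); }

    // Returns the wrapper for `texture`, creating it on first use,
    // and marks it as used in the current frame.
    Wrapper& get(const void* texture) {
        auto it = entries_.find(texture);
        if (it == entries_.end())
            it = entries_.emplace(texture, Entry{make_(texture), frame_}).first;
        it->second.last_used = frame_;
        return it->second.wrapper;
    }

    // Call once per frame: advances the frame counter and evicts every
    // entry that has been idle for more than max_idle_frames.
    void end_frame() {
        ++frame_;
        for (auto it = entries_.begin(); it != entries_.end();) {
            if (frame_ - it->second.last_used > max_idle_) {
                destroy_(it->second.wrapper);
                it = entries_.erase(it);
            } else {
                ++it;
            }
        }
    }

    // Evicts everything, e.g. on play/stop or scene load.
    void clear() {
        for (auto& kv : entries_) destroy_(kv.second.wrapper);
        entries_.clear();
    }

    std::size_t size() const { return entries_.size(); }

private:
    struct Entry {
        Wrapper wrapper;
        std::uint64_t last_used;
    };

    std::uint64_t max_idle_;
    std::uint64_t frame_ = 0;
    Factory make_;
    Deleter destroy_;
    std::unordered_map<const void*, Entry> entries_;
};
```

With this shape, a texture that Unity has already released stays referenced until it has been idle for `max_idle_frames`, which matches the "moments later" behaviour described above. Calling `clear()` from play/stop/scene-load hooks would be one way to add the suggested robustness.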
Tested
Unity 2020.3.9f1 and Unity 2020.2.1f1
Visual Studio 2017 compiler for native plugin
Performance Tests
2020.3.9f1
Native plugin compiled in release mode (as it always should be outside of dev/debug)
Render Thread stats are from the Stats overlay. GPU stats are from running the Profiler with the GPU Usage module (and reading the total GPU frame time).
Stress scene
Disabled scene reloading to get a stable reading.
DX11 (master branch): 7.2 ms Render Thread, 10.0 ms GPU
DX11 (this PR): 7.5 ms Render Thread, 10.0 ms GPU
DX12 (this PR): 18.2 ms Render Thread, 20.0 ms GPU
With scene reloading left enabled, I estimate that scene load time is roughly the same between DX11 (master branch) and DX11 (this PR), while DX12 (this PR) is two to three times slower.
Quad scene
In the receiver object, receiving a 1080p texture from another app.
DX11 (master branch): 2.0 ms Render Thread, 1.2 ms GPU
DX11 (this PR): 2.0 ms Render Thread, 1.5 ms GPU
DX12 (this PR): 2.0 ms Render Thread, 3.0 ms GPU
Conclusion
I haven't been able to figure out why the DX12 path is roughly 2x slower. The D3D11on12 layer certainly has a cost, but I don't know whether it explains everything.
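To make the "roughly 2x" figure concrete, the slowdown can be computed per scene and per metric from the numbers above, comparing DX12 (this PR) against DX11 (this PR). This is only arithmetic on the reported measurements; `Sample`, `slowdown`, `kSamples` and `print_slowdowns` are illustrative names.

```cpp
#include <cstdio>

// One measurement pair taken from the "this PR" rows above.
struct Sample {
    const char* scene;
    const char* metric;
    double dx11_ms;
    double dx12_ms;
};

// Ratio of DX12 time to DX11 time; 1.0 means no difference.
double slowdown(const Sample& s) { return s.dx12_ms / s.dx11_ms; }

const Sample kSamples[] = {
    {"Stress", "Render Thread", 7.5, 18.2},
    {"Stress", "GPU", 10.0, 20.0},
    {"Quad", "Render Thread", 2.0, 2.0},
    {"Quad", "GPU", 1.5, 3.0},
};

// Prints one line per sample with the DX12/DX11 ratio.
void print_slowdowns() {
    for (const Sample& s : kSamples)
        std::printf("%-6s %-13s %.2fx\n", s.scene, s.metric, slowdown(s));
}
```

By these numbers GPU time doubles in both scenes, while the Render Thread cost rises about 2.4x in the stress scene and is unchanged in the quad scene.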