NVIDIAGameWorks / Streamline

Streamline Integration Framework

Streamline destroys NGX context while it's in use on the GPU. DLSS #36

Closed QDanteQ closed 5 months ago

QDanteQ commented 5 months ago

Hi. Shouldn't Streamline handle the lifetime of the NGX context, since it owns it? When the DLSS quality mode or output resolution changes, Streamline destroys the previous NGX context and creates a new one without checking that its resources are no longer in use on the GPU. I don't see any mention in the programming guide that the user has to detect a change in quality mode or output resolution and flush the GPU before calling slEvaluateFeature(sl::kFeatureDLSS, ..).

Am I missing something? Should we really manage the NGX handle's lifetime without even having proper access to it? I would appreciate it if someone could clarify this for me.

Thanks, Daniel.

jake-nv commented 5 months ago

sl.common plugin manages the lifetime of NGX as a whole (see https://github.com/NVIDIAGameWorks/Streamline/blob/main/source/plugins/sl.common/commonEntry.cpp#L1099 and https://github.com/NVIDIAGameWorks/Streamline/blob/main/source/plugins/sl.common/commonEntry.cpp#L1343), the sl.dlss plugin manages the lifetime of the DLSS-SR "feature".

While not exactly what you're asking, https://github.com/NVIDIAGameWorks/Streamline/blob/main/docs/ProgrammingGuideDLSS.md#80-multiple-viewports shows an example of having multiple viewports and synchronizing with "in-flight" GPU work.

Recreating the NGX feature on resolution changes is part of standard usage of DLSS-SR because it is initialized with input and target resolutions. For details regarding DLSS-SR and NGX see the DLSS-SR Programming Guide. Section 5.5 mentions synchronizing with feature release.

QDanteQ commented 5 months ago

Right. Sorry for the confusion, I was talking specifically about the lifetime of the DLSS-SR "feature". Yes, I understand that we need to reinitialize it every time the quality mode or resolution changes. But when we had direct access to the DLSS feature context, we could postpone the feature release (deinitialization) until the GPU was no longer using it, and simply create and use another DLSS feature handle in the meantime. I guess we can simulate this with multiple viewports, but I'm not sure I can use the multiple-viewport solution if I plan to use DLSS-G, as the programming guide says that it supports only one viewport.

Then the only option, if I want to use DLSS with DLSS-G, would be a CPU-GPU synchronization every time the target resolution or DLSS quality option changes. Is that right?
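
For readers unfamiliar with the deferred-release pattern described in this comment (the one available when an application owns the NGX feature handle directly), here is a minimal sketch. It is plain C++ with no Streamline or NGX calls; `DeferredReleaseQueue`, `enqueue`, and `collect` are hypothetical names invented for illustration, and the fence values stand in for whatever the engine signals on its graphics queue.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <functional>
#include <utility>

// Hypothetical helper: holds release callbacks until the GPU has passed the
// fence value signaled after the last submission that used the resource.
class DeferredReleaseQueue {
public:
    // fenceValue: value the queue signals once all GPU work referencing the
    // resource has finished. release: callback that actually frees it.
    void enqueue(std::uint64_t fenceValue, std::function<void()> release) {
        // Fence values are assumed to be handed out in non-decreasing order.
        assert(pending_.empty() || pending_.back().fenceValue <= fenceValue);
        pending_.push_back({fenceValue, std::move(release)});
    }

    // Call once per frame with the last fence value the GPU has completed.
    // Runs every callback whose fence has been reached; returns how many ran.
    std::size_t collect(std::uint64_t completedValue) {
        std::size_t released = 0;
        while (!pending_.empty() &&
               pending_.front().fenceValue <= completedValue) {
            Entry entry = std::move(pending_.front());
            pending_.pop_front();
            entry.release();
            ++released;
        }
        return released;
    }

    std::size_t size() const { return pending_.size(); }

private:
    struct Entry {
        std::uint64_t fenceValue;
        std::function<void()> release;
    };
    std::deque<Entry> pending_;
};
```

With direct NGX access, the release callback would wrap the feature release call and `collect` would be fed the completed fence value each frame, so no CPU stall is needed. As discussed in this thread, Streamline releases the feature internally, so this sketch only illustrates what is being asked for.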

jake-nv commented 5 months ago

It depends on your engine and when/how you handle things. But you're correct that you must ensure GPU resources aren't destroyed whilst in-use/in-flight/etc.

Note that DLSS-FG has its own requirements regarding e.g. resolution changes: https://github.com/NVIDIAGameWorks/Streamline/blob/main/docs/ProgrammingGuideDLSS_G.md#120-dlss-g-and-dxgi

The Streamline sample is an example of integrating both SR and FG in the same application.

QDanteQ commented 5 months ago

I think this could be mentioned in the programming guide. I see that the Streamline sample can also hit a GPU crash when switching DLSS quality modes, as it has no CPU-GPU synchronization on quality mode switches. To test this, I switch the DLSS quality mode every frame in code, run the sample, enable the "Developer Menu" and set "DLSS Resolution mode" to "Dynamic" in ImGui so that the sample won't recreate render passes every frame. After a few frames I observe a GPU crash. I even enabled the D3D12 debug layer through slInterposer and got:

D3D12 ERROR: ID3D12Resource2::: CORRUPTION: An ID3D12Resource object (0x000001F6AEE83100:'dec_4_1_weights_nchw8) is referenced by GPU operations in-flight on Command Queue (0x000001F6957D0150:'Graphics Queue'). It is not safe to final-release objects that may have GPU operations pending. This can result in application instability. [ EXECUTION ERROR #921: OBJECT_DELETED_WHILE_STILL_IN_USE]
[13.03.2024 12-44-14][streamline][info]dlssentry.cpp:358[dlssBeginEvent] Detected resize, recreating DLSSContext feature
Exception thrown at 0x00007FFF8801CF19 (KernelBase.dll) in StreamlineSample.exe: 0x0000087D (parameters: 0x0000000000000000, 0x00000056C42FB430, 0x000001F69329D020).
A breakpoint instruction (__debugbreak() statement or a similar call) was executed in StreamlineSample.exe.

So when using DLSS without Streamline, the user is the one who handles the lifetime of the DLSS NGX feature. But with Streamline, the user does not have direct access to the NGX DLSS feature and obviously does not handle its lifetime. The issue is that the user has no clear information about when Streamline will release the NGX feature, and so doesn't know when CPU/GPU synchronization needs to be inserted. I don't think it's a good idea to do it every frame.
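
One way to avoid synchronizing every frame is to track the settings on the application side and wait for the GPU only when they change. The sketch below assumes that quality mode and output resolution are the settings that trigger feature recreation, which is what this thread and the "Detected resize" log line suggest but is not a documented guarantee. `DlssSettings`, `DlssChangeGuard`, and the `waitForGpuIdle` callback are hypothetical names, not Streamline API.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <optional>
#include <utility>

// Hypothetical application-side mirror of the settings assumed to trigger
// recreation of the DLSS feature inside Streamline.
struct DlssSettings {
    int mode;                    // stand-in for the quality mode enum
    std::uint32_t outputWidth;
    std::uint32_t outputHeight;

    bool operator==(const DlssSettings& other) const {
        return mode == other.mode && outputWidth == other.outputWidth &&
               outputHeight == other.outputHeight;
    }
};

// Hypothetical guard: invokes the engine's "wait for GPU idle" routine only
// on frames where the settings differ from the previous frame.
class DlssChangeGuard {
public:
    explicit DlssChangeGuard(std::function<void()> waitForGpuIdle)
        : waitForGpuIdle_(std::move(waitForGpuIdle)) {
        assert(waitForGpuIdle_ && "a wait callback is required");
    }

    // Call before evaluating DLSS for the frame.
    // Returns true if a CPU-GPU synchronization was performed.
    bool beforeEvaluate(const DlssSettings& current) {
        // The first frame has no previous feature, so nothing can be in flight.
        const bool changed = last_.has_value() && !(*last_ == current);
        if (changed) {
            waitForGpuIdle_();
        }
        last_ = current;
        return changed;
    }

private:
    std::function<void()> waitForGpuIdle_;
    std::optional<DlssSettings> last_;
};
```

In a real renderer the callback would signal a fence on the graphics queue and block until it completes. The guard still stalls on every change, so it does not help the stress test described above, where the mode is switched every frame.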

jake-nv commented 5 months ago

Seems there's a number of issues with the sample, looks like some will be getting fixed in the upcoming release.

Yes, I believe it is expected that synchronization, if required, will be part of the usual renderer recovery when e.g. changing resolution.