0xFADDAD opened 2 months ago
Thanks for the report. @0xFADDAD, if you are able, would you mind re-running the test with a build from https://github.com/DescentDevelopers/Descent3/actions/runs/10566626808 ? This will help determine if this problem is caused by the recent renderer modernization or if it was introduced by something prior. Thanks!
https://github.com/DescentDevelopers/Descent3/actions/runs/10566626808 actually managed to perform even worse; framerates are now down in the high teens.
I'm glad I decided to try all the levels in the Bedlam set. The last level, 'Polaris', has outdoor areas, but these render mostly correctly, with only a 5 to 10% framerate cut or so. 'Plutonium' and 'Apparition' have the severe framerate cuts. I'll try a few more levels with outdoor sections to see if I can make out a pattern.
UPDATE: Dementia's 'Geodomes' is a good test for just how low the framerate can get. The terrain has a few sections of long flat surfaces in the far distance that really crater the performance.
> https://github.com/DescentDevelopers/Descent3/actions/runs/10566626808 actually managed to perform even worse; framerates are now down in the high teens.
Oh wow! I'm glad I asked. We can rule out the modernized renderer as a cause then, at least.
I've experienced the same problem running off of git main builds. I haven't measured the GPU usage through `intel_gpu_top` on my NUC, but the CPU usage is extremely high, and according to `htop` the brunt of it is kernel time.
Is there a way to profile what's happening and try to narrow down what's causing the thrashing in rendering?
> I've experienced the same problem running off of git main builds. I haven't measured the GPU usage through `intel_gpu_top` on my NUC, but the CPU usage is extremely high, and according to `htop` the brunt of it is kernel time. Is there a way to profile what's happening and try to narrow down what's causing the thrashing in rendering?
We'll need more precise CPU profiling to identify and mitigate bottlenecks. I recommend running the `perf record` tool on Linux to get precise CPU sampling; its output can be processed with other utilities (e.g. `perf report`) to surface the biggest time consumers.
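For what it's worth, a minimal capture-and-inspect workflow might look like this (the binary path `./Descent3` is an assumption; adjust for your build):

```sh
# Record CPU samples with call graphs while reproducing the slowdown
# (assumes the game binary is ./Descent3; adjust for your build).
perf record --call-graph dwarf ./Descent3

# Summarize the hottest functions, heaviest consumers first.
perf report --sort=dso,symbol

# Plain-text dump that is easy to paste into the issue.
perf report --stdio | head -50
```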
I took a quick look on Windows at the beginning of Retribution Level 15. We spent 66% of CPU time in the graphics driver (Intel integrated graphics), 27% in the Windows kernel and just shy of 5% in our own code.
Depending on where I look, I get between 75 and 15 FPS. This correlates to about 1000 to 5000 draw calls and scales pretty linearly. The scary part is that, on average, we only render 2 triangles per draw call. That is complete overkill given the overhead each draw call comes with (state changes, etc.). Ideally we would want to batch as much geometry as possible with the same state into a single draw call, which might be a challenge with the current architecture.
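For illustration, a minimal sketch of what state-keyed batching could look like; all names here (`Vertex`, `RenderState`, `Batcher`) are hypothetical and nothing mirrors the actual D3 renderer:

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <tuple>
#include <vector>

// Hypothetical types, purely to illustrate the idea.
struct Vertex {
  float x, y, z, u, v;
};

struct RenderState {
  std::uint32_t texture_handle;
  std::uint32_t blend_mode;
  bool operator<(const RenderState &o) const {
    return std::tie(texture_handle, blend_mode) <
           std::tie(o.texture_handle, o.blend_mode);
  }
};

// Instead of issuing one draw call per polygon, accumulate triangles
// per unique render state and flush each bucket with a single call.
class Batcher {
 public:
  void AddTriangle(const RenderState &state, const Vertex *tri) {
    auto &bucket = buckets_[state];
    bucket.insert(bucket.end(), tri, tri + 3);
  }

  void Flush() {
    for (auto &[state, verts] : buckets_) {
      ApplyState(state);                          // one state change per bucket
      DrawTriangles(verts.data(), verts.size());  // one draw call per bucket
    }
    buckets_.clear();
  }

 private:
  // Stubs; definitions omitted in this sketch.
  void ApplyState(const RenderState &state);               // bind texture, blend mode, ...
  void DrawTriangles(const Vertex *verts, std::size_t n);  // e.g. glDrawArrays
  std::map<RenderState, std::vector<Vertex>> buckets_;
};
```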
Very interesting; indeed, we need to optimize draw calls. @InsanityBringer, any tips for that?
@pzychotic, could you please do the same benchmark on revision 3cb1e8911a1afcc273433db69d843aa51b0203fc (before the renderer changes)?
> The scary part is that, on average, we only render 2 triangles per draw call.
This, in particular, is unsurprising: the D3 renderer is set up in terms of drawing polygons (usually quads), not objects, so if it were to draw a cube, for example, it would perform six `g3_DrawPoly` calls, one for each side of the cube. We need to transform the renderer so that it thinks primarily about drawing objects, but doing this transformation will require "lifting" the draw operation up to each of the roughly 65 callsites of `g3_DrawPoly`. Not prohibitive, but not a light job either.
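To make the shape of that change concrete, a hypothetical before/after; only `g3_DrawPoly` is a real D3 function (with a heavily simplified signature here), while `g3_DrawObject`, `Poly`, and `Object3D` are illustrative:

```cpp
#include <vector>

// Hypothetical shapes for illustration only.
struct g3Point;
struct Poly { int nverts; g3Point **verts; int bitmap; };
struct Object3D { std::vector<Poly> polys; };

void g3_DrawPoly(int nverts, g3Point **verts, int bitmap);  // real, simplified
void g3_DrawObject(const Object3D &obj);                    // hypothetical

// Today (roughly): every callsite submits one polygon at a time,
// so a cube costs six separate draw submissions.
void DrawObjectToday(const Object3D &obj) {
  for (const Poly &p : obj.polys)
    g3_DrawPoly(p.nverts, p.verts, p.bitmap);
}

// After lifting: the callsite hands over the whole object, and the
// renderer is free to transform, cull, and batch all faces internally.
void DrawObjectLifted(const Object3D &obj) {
  g3_DrawObject(obj);
}
```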
> could you please do the same benchmark on revision 3cb1e89 (before the renderer changes)?
Interesting changes: we spent a lot more time in our own code and not much in the Windows kernel, while the graphics driver's share was a bit lower.
I don't really have a good solution for the legacy renderer. Terrain is the worst because it adds up the worst of everything: expensive objects, expensive rooms, and the terrain triangles themselves, all in a very open environment that is not conducive to culling. But during my attempts to improve the legacy renderer in Piccu, I found that actually drawing the terrain itself is probably the smallest cause of lag (though the vastly increased limits of the terrain renderer in 1.5 aren't helping in the slightest).
To some degree, pursuing things like stripification of polygons could lead to some gains, but I feel that at that point you're better off pursuing a meshing solution using newer (even OpenGL 2 era) features like GPU-side vertex buffers.
> you're better off pursuing a meshing solution using newer (even OpenGL 2 era) features like GPU-side vertex buffers.
I actually agree with this, because even the VBO implementation in UA_source really sped up rendering there, and that's already a low-poly game.
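For anyone following along, the OpenGL 2-era pattern being suggested looks roughly like this (a minimal sketch, assuming an extension loader like GLEW; the buffer layout and usage flags are placeholder choices):

```cpp
#include <GL/glew.h>  // an extension loader must provide the buffer entry points
#include <cstddef>

// VBO pattern, core since OpenGL 1.5: upload the mesh once at load time,
// then draw every frame without touching the vertex data on the CPU.
GLuint vbo = 0;

void UploadMesh(const float *xyz, std::size_t bytes) {
  glGenBuffers(1, &vbo);
  glBindBuffer(GL_ARRAY_BUFFER, vbo);
  glBufferData(GL_ARRAY_BUFFER, bytes, xyz, GL_STATIC_DRAW);  // GPU-side copy
}

void DrawMesh(GLsizei vertex_count) {
  glBindBuffer(GL_ARRAY_BUFFER, vbo);
  glEnableClientState(GL_VERTEX_ARRAY);
  glVertexPointer(3, GL_FLOAT, 0, nullptr);  // offset into the VBO, not a CPU pointer
  glDrawArrays(GL_TRIANGLES, 0, vertex_count);
  glDisableClientState(GL_VERTEX_ARRAY);
}
```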
Kind of late, and it might already be obvious to some, but I forgot the Fusion engine was two engines in one. So I went looking for the game's post-mortem and found an interesting excerpt from Jason Leighton, one of the programmers:
"The terrain engine actually began as a prototype for another game that Jason was interested in developing. Unfortunately, Bungie’s Myth beat us to the idea, but the terrain technology was solid enough to be incorporated into Descent 3. It was based on a great paper by Peter Lindstrom and colleagues entitled Real-Time, Continuous Level of Detail Rendering of Height Fields (from Siggraph 1996 Computer Graphics Proceedings, Addison Wesley, 1996). Of course, it was bastardized heavily to fit the needs of Descent 3, but the overall concept was the same — create more polygonal detail as you get closer to the ground and take away polygons when you are farther away. After implementing the real-time LOD technology, our frame rates quadrupled."
Perhaps the LOD scaling is broken or non-functional after many of the limits had been expanded? Might be worth investigating.
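For reference, the concept from the paper boils down to something like this toy sketch. It is purely illustrative: the Lindstrom paper drives refinement by a screen-space error metric, and this naive distance rule is only a stand-in.

```cpp
#include <algorithm>
#include <cmath>

// Toy distance-based LOD pick: nearer terrain blocks get denser
// tessellation. Thresholds are arbitrary; not D3's actual criterion.
int TerrainLodLevel(float block_distance, float base_distance, int max_lod) {
  // Each doubling of distance drops one level of detail.
  float ratio = std::max(block_distance / base_distance, 1.0f);
  int lod = static_cast<int>(std::log2(ratio));
  return std::min(lod, max_lod);
}
```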
> Perhaps the LOD scaling is broken or non-functional after many of the limits had been expanded? Might be worth investigating.
Well, the LOD scaling is definitely doing something in release 1.5, but it's still a lot of draw calls...
Try setting the "Terrain Detail" slider all the way to the lowest setting. Though I don't recall off the top of my head how 1.4 behaved.
Took your advice and tried ticking down the slider. With 28 being the max, 27 gave some (but not much) improvement, while 26 is a huge improvement. It's a solution, but the common reasoning would be, "this is a 25-year-old game, it 'should' run completely maxed out". Still, if we're increasing the max polycount beyond what the engine is capable of putting out, it might just be best to leave the limits where they were.
The path to rendering optimization is probably going to involve gutting the engine down the middle; as stated above:
> the D3 renderer is set up in terms of drawing polygons (usually quads), not objects, so if it were to draw a cube, for example, it would perform six `g3_DrawPoly` calls, one for each side of the cube. We need to transform the renderer so that it thinks primarily about drawing objects
With modern OpenGL and Vulkan, a lot of the work is carried out on the GPU rather than the CPU, so we need to make the rendering code less CPU-bound. It's not going to be a very easy task, I imagine...
I wouldn't have a clue how one would, for example, have the GPU generate the procedural textures in hardware, versus computing them on the CPU and constantly uploading a new texture every frame.
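For a rough sense of the direction (not a working port): the current path regenerates pixels on the CPU and re-uploads them every frame, while a shader path evaluates the effect per-pixel on the GPU. The GLSL below is a placeholder effect, not D3's actual procedural-texture logic:

```cpp
// CPU path today (simplified): regenerate the pixels on the CPU and
// re-upload the whole texture every frame.
//   GenerateProceduralPixels(pixels);                    // CPU work
//   glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, w, h,
//                   GL_RGBA, GL_UNSIGNED_BYTE, pixels);  // per-frame upload

// GPU path: evaluate the pattern per-fragment in a shader. Nothing is
// uploaded per frame; only a small "time" uniform changes.
const char *kProceduralFrag = R"(
  #version 120
  uniform float u_time;
  varying vec2 v_uv;
  void main() {
    // Placeholder animated pattern; D3's real procedural effects
    // (water, fire, etc.) would have to be ported rule by rule.
    float v = sin(v_uv.x * 40.0 + u_time) * cos(v_uv.y * 40.0 - u_time);
    gl_FragColor = vec4(vec3(0.5 + 0.5 * v), 1.0);
  }
)";
```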
Build Version
2db85ca6ecd46ff31035f59cd409477e81d4e0e7
Operating System Environment
CPU Environment
Game Modes Affected
Game Environment
No response
Description
Looking at the skybox with no terrain in view, or returning to indoor areas, returns the framerate to normal. Possibly the terrain is not being culled?
Regression Status
No response
Steps to Reproduce
Enter an outdoor area; the framerate halves and GPU usage triples.
https://github.com/user-attachments/assets/4c99ac6a-5288-47c6-8161-175733dad533