makamys / Neodymium

Reimplements chunk rendering using modern OpenGL features to improve performance (1.7.10)

RPLE and OptiFine compatibility #43

Closed: basdxz closed this 7 months ago

basdxz commented 7 months ago

Summary

We have implemented full compatibility for OptiFine Shaders and RPLE colored light.

The following configurations have been tested:

Any bugs we could find have been patched, although some external validation is advised. Flying far out to the world border still works as expected, with no noticeable jitter.

Both RPLE & OptiFine extend the vertex stride, with OptiFine shaders having the most significant impact:

(Note that short UVs are currently not supported with OptiFine shaders)

This leads to a significant increase in VRAM usage, so we may need to raise the default VRAM allocation depending on the configuration. Currently, around 1 GiB was needed to load a 16-chunk radius. With the maximum possible allocation being a little under 2 GiB, dynamic memory allocation may also be needed in the future.
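
As a rough illustration of how the vertex stride drives VRAM usage, here is a back-of-the-envelope sketch; the stride values and quad count are made-up placeholders, not measured figures:

```java
public class VramEstimate {
    public static void main(String[] args) {
        // Illustrative strides in bytes per vertex (assumed, not the real formats):
        int vanillaStride = 28;        // position + color + UV + lightmap
        int shaderStride  = 56;        // extra attributes added by shaders / colored light

        long quads = 2_000_000L;       // assumed number of quads resident in VRAM
        long verticesPerQuad = 4;

        long vanillaBytes = quads * verticesPerQuad * vanillaStride;
        long shaderBytes  = quads * verticesPerQuad * shaderStride;

        System.out.printf("vanilla: %.2f MiB%n", vanillaBytes / (1024.0 * 1024.0));
        System.out.printf("shaders: %.2f MiB%n", shaderBytes / (1024.0 * 1024.0));
    }
}
```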

Technical

FalsePattern commented 7 months ago

Also, the renderSortedRenderers mixin now cancels the method when Neodymium is rendering.

This bypasses all the redundant logic that FalseTweaks does in that method, and lets us avoid any glGenLists/glNewList/glEndList calls when both Neodymium and the FT occlusion engine are enabled.
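
For reference, a cancellable mixin of this shape typically looks like the sketch below; the class layout, parameter names, and the NeodymiumCompat.isRendering() helper are assumptions for illustration, not the actual patch in this PR:

```java
import org.spongepowered.asm.mixin.Mixin;
import org.spongepowered.asm.mixin.injection.At;
import org.spongepowered.asm.mixin.injection.Inject;
import org.spongepowered.asm.mixin.injection.callback.CallbackInfoReturnable;
import net.minecraft.client.renderer.RenderGlobal;

@Mixin(RenderGlobal.class)
public abstract class MixinRenderGlobal {
    // Hypothetical sketch: bail out of the vanilla/FalseTweaks path when Neodymium
    // is handling the rendering itself, so no display lists get touched at all.
    @Inject(method = "renderSortedRenderers", at = @At("HEAD"), cancellable = true)
    private void neodymium$skipWhenActive(int start, int end, int pass, double partialTicks,
                                          CallbackInfoReturnable<Integer> cir) {
        if (NeodymiumCompat.isRendering()) { // hypothetical helper
            cir.setReturnValue(0);           // cancel the method: report zero renderers drawn
        }
    }
}
```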

FalseTweaks will gain the other half of this optimization in 2.8.0, but it depends on the changes added to Neodymium in this PR; without them, it spams GL errors in the log.

makamys commented 7 months ago

Some more remarks:

But overall it seems good, thanks for implementing this!

PS: I thought I'd let you know that the gains I get on my setup (GTX1050Ti, 1920x1080, Linux with proprietary drivers) are somewhat modest (48->54 FPS [+12.5%] - for comparison, with shaders disabled it's 490->620 FPS [+26.5%]). According to VisualVM, ~80% of time is spent inside nglGetError with Neodymium, and ~88% without.

For testing I used OF E7 + Nd + Sildurs Vibrant Shaders v1.23 Lite.
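
For context, time attributed to nglGetError in a Java profiler generally comes from explicit error polling after GL calls crossing the JNI boundary (and, depending on the driver, from work that gets flushed at that point). The usual polling pattern looks roughly like this generic sketch; it is not Neodymium's code:

```java
import org.lwjgl.opengl.GL11;

public final class GlErrorCheck {
    private GlErrorCheck() {}

    // Drain the GL error queue and log anything found. Each call crosses the
    // JNI boundary (nglGetError), which is why heavy per-call checking shows
    // up prominently in profilers such as VisualVM.
    public static void drain(String where) {
        int err;
        while ((err = GL11.glGetError()) != GL11.GL_NO_ERROR) {
            System.err.println("GL error 0x" + Integer.toHexString(err) + " at " + where);
        }
    }
}
```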

couleurm commented 7 months ago

hi whats RPLE

FalsePattern commented 7 months ago

hi whats RPLE

https://github.com/GTMEGA/RightProperLightingEngine/

FalsePattern commented 7 months ago

I thought I'd let you know that the gains I get on my setup (GTX1050Ti, 1920x1080, Linux with proprietary drivers) are somewhat modest (48->54 FPS [+12.5%]

Yes, the main difference we noticed during testing is a huge decrease in microstuttering. This compat is mainly about reducing those frame-time spikes rather than delivering a huge FPS boost.
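
Since the benefit shows up in frame-time consistency rather than in average FPS, one simple way to see it is to log per-frame times and watch for spikes. A minimal sketch (the 33 ms threshold is an arbitrary assumption):

```java
public class FrameTimeLogger {
    private long lastFrame = System.nanoTime();

    // Call once per frame; prints frames that took noticeably longer than a 60 FPS budget.
    public void onFrameEnd() {
        long now = System.nanoTime();
        double ms = (now - lastFrame) / 1_000_000.0;
        lastFrame = now;
        if (ms > 33.0) { // arbitrary spike threshold (~2x a 16.7 ms frame)
            System.out.printf("frame spike: %.1f ms%n", ms);
        }
    }
}
```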

the subchunk the player is inside becomes invisible.

This seems to be an odd issue in the vanilla culling logic; FalseTweaks' occlusion engine (with the patches planned for 2.8.0 mentioned above) seems to fix it. Unfortunately, we couldn't pinpoint what exactly causes it.

MeshQuad and NeoRenderer are getting huge now

One solution I can propose for this is moving the four implementation variants into utility classes. I will add this to the PR shortly; it should cut down the repetitiveness of the compat variants somewhat.
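
A rough sketch of what that extraction could look like (class and method names here are hypothetical, not the actual Neodymium code):

```java
import java.nio.ByteBuffer;

// Hypothetical sketch: each compatibility variant gets its own small writer,
// so the main renderer/mesh classes only dispatch instead of inlining every format.
public final class VertexWriters {
    private VertexWriters() {}

    /** Plain vanilla-style vertex: position shown, remaining attributes elided. */
    public static void writeVanilla(ByteBuffer buf, float x, float y, float z) {
        buf.putFloat(x).putFloat(y).putFloat(z);
        // ... color, UV, lightmap ...
    }

    /** Shader-style vertex: same position, plus the extra attributes appended. */
    public static void writeShaders(ByteBuffer buf, float x, float y, float z) {
        writeVanilla(buf, x, y, z);
        // ... extra attributes added by OptiFine shaders / RPLE ...
    }
}
```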

basdxz commented 7 months ago

PS: I thought I'd let you know that the gains I get on my setup (GTX1050Ti, 1920x1080, Linux with proprietary drivers) are somewhat modest (48->54 FPS [+12.5%] - for comparison, with shaders disabled it's 490->620 FPS [+26.5%]). According to VisualVM, ~80% of time is spent inside nglGetError with Neodymium, and ~88% without.

OptiFine has a few internal bugs (some fixed by RPLE or FalseTweaks, with all fixes eventually being migrated to the latter). Optimally, with no FPS limit, 50%+ of the time should be spent waiting on Display.update(), which is what I have seen in my testing when running with ND/RPLE/OF (RPLE hard-depends on FT).
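
For reference, Display.update() is where LWJGL 2 swaps buffers, so with vsync or an FPS cap the render thread spends its idle time there and profilers attribute that time to the call. A bare-bones generic LWJGL 2 loop (not Minecraft's actual loop) looks like this:

```java
import org.lwjgl.LWJGLException;
import org.lwjgl.opengl.Display;
import org.lwjgl.opengl.DisplayMode;
import org.lwjgl.opengl.GL11;

public class MinimalLoop {
    public static void main(String[] args) throws LWJGLException {
        Display.setDisplayMode(new DisplayMode(854, 480));
        Display.create();

        while (!Display.isCloseRequested()) {
            GL11.glClear(GL11.GL_COLOR_BUFFER_BIT | GL11.GL_DEPTH_BUFFER_BIT);
            // ... draw the frame here ...

            Display.update();   // swap buffers; with vsync/FPS cap the thread blocks here
            Display.sync(60);   // optional frame-rate cap; uncapped profiles skip this
        }
        Display.destroy();
    }
}
```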