turanszkij / WickedEngine

3D engine with modern graphics
https://wickedengine.net

SSR Rework #80

Closed Kliaxe closed 4 years ago

Kliaxe commented 4 years ago

Hi turanszkij!

So I stumbled upon your engine not too long ago and I've got to say, your engine is a huge inspiration! It's well written, and has an amazing engine structure which is easy to understand.

I played around a bit with the build, and wanted to rework the current Screen Space Reflections. The current SSR in the engine is pretty simple, and I wanted to take it a step further. It's another way of doing SSR, and I thought you might be interested :)

turanszkij commented 4 years ago

Hi, that's a great idea! Actually, I was recently looking at implementing the hierarchical Z-tracing approach, but haven't got it working yet. What method are you interested in? For reference, I was reading this on HiZ tracing: https://github.com/greje656/Questions/blob/33aa607f6bb0847331ce09ba56981e30f69920fd/hiz.md It has implementation code at the bottom too. It uses a depth mipchain to skip a lot of empty space. In this engine there is currently a linear depth mipchain (texture_lineardepth_minmax), but I think they are using a regular depth mip chain (only min).
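To make the "depth mip chain (only min)" idea concrete, here is a minimal CPU-side sketch. It assumes a square, power-of-two depth buffer stored row-major, and a convention where a smaller value means closer to the camera. `DepthMip`, `downsampleMin` and `buildMinChain` are hypothetical names for illustration, not engine code, and a real implementation would run on the GPU.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical CPU-side depth level: size x size texels, row-major.
struct DepthMip {
    std::size_t size;
    std::vector<float> texels;
    float at(std::size_t x, std::size_t y) const { return texels[y * size + x]; }
};

// Build the next coarser level by keeping the minimum of each 2x2 block.
DepthMip downsampleMin(const DepthMip& src) {
    assert(src.size >= 2 && src.size % 2 == 0);
    const std::size_t half = src.size / 2;
    DepthMip dst{half, std::vector<float>(half * half)};
    for (std::size_t y = 0; y < half; ++y) {
        for (std::size_t x = 0; x < half; ++x) {
            const float top = std::min(src.at(2 * x, 2 * y), src.at(2 * x + 1, 2 * y));
            const float bottom = std::min(src.at(2 * x, 2 * y + 1), src.at(2 * x + 1, 2 * y + 1));
            dst.texels[y * half + x] = std::min(top, bottom);
        }
    }
    return dst;
}

// Repeat the reduction until a single texel remains.
std::vector<DepthMip> buildMinChain(DepthMip base) {
    std::vector<DepthMip> chain;
    chain.push_back(std::move(base));
    while (chain.back().size > 1) {
        chain.push_back(downsampleMin(chain.back()));
    }
    return chain;
}
```

Each coarse texel then holds the closest depth of the region it covers, which is what lets a tracer step over a whole region when the ray is still in front of that value.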

Let me know what you think and thanks for stopping by.

Kliaxe commented 4 years ago

Hi again, and thanks for the quick answer :D

I was looking at an implementation based on 2D marching, from Morgan McGuire and Michael Mara's article "GPU Screen-Space Ray Tracing": http://casual-effects.blogspot.com/2014/08/screen-space-ray-tracing.html It uses a "Digital Differential Analyzer" (DDA) line algorithm with perspective-correct interpolation, which really caught my attention. I played around with their algorithms and ended up with something that looks really promising in its current version.
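For readers unfamiliar with the technique, the core of a DDA march with perspective-correct interpolation can be sketched as follows. This is a simplified CPU illustration under stated assumptions, not the article's code and not what ended up in the engine: `LineSample` and `marchScreenLine` are hypothetical names, depths are positive view-space distances, and features such as stride, jitter and thickness tests are left out. The key point is that 1/z, not z, changes linearly along a line in screen space.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdlib>
#include <vector>

struct LineSample {
    int x;
    int y;
    float viewDepth;
};

// Walk from pixel (x0, y0) to (x1, y1), one step per pixel along the major axis.
// z0 and z1 are the view-space depths of the two endpoints.
std::vector<LineSample> marchScreenLine(int x0, int y0, float z0,
                                        int x1, int y1, float z1) {
    assert(z0 > 0.0f && z1 > 0.0f);
    const int dx = x1 - x0;
    const int dy = y1 - y0;
    const int steps = std::max(std::abs(dx), std::abs(dy));

    std::vector<LineSample> samples;
    if (steps == 0) {
        samples.push_back({x0, y0, z0});
        return samples;
    }

    // Constant per-step increments: this is the "differential" part of the DDA.
    const float stepX = static_cast<float>(dx) / static_cast<float>(steps);
    const float stepY = static_cast<float>(dy) / static_cast<float>(steps);
    const float stepInvZ = (1.0f / z1 - 1.0f / z0) / static_cast<float>(steps);

    float x = static_cast<float>(x0);
    float y = static_cast<float>(y0);
    float invZ = 1.0f / z0;
    for (int i = 0; i <= steps; ++i) {
        // Recover depth from the linearly interpolated reciprocal.
        samples.push_back({static_cast<int>(std::lround(x)),
                           static_cast<int>(std::lround(y)),
                           1.0f / invZ});
        x += stepX;
        y += stepY;
        invZ += stepInvZ;
    }
    return samples;
}
```

Marching from depth 1 to depth 3, the screen-space midpoint comes out at depth 1.5 rather than the 2.0 that naive linear interpolation of z would give. In an SSR pass each sample's depth would be compared against the depth buffer at that pixel to detect a hit.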

Edit: If it sounds like something you can use in the engine, I will gladly upload a pull request!

turanszkij commented 4 years ago

Yes, I'm familiar with it. If you feel like integrating it here, feel free to. I would be curious to try it.

Kliaxe commented 4 years ago

Pull request should be up now. Hopefully it should work for you like it did on my setup, without any weird behaviors.

turanszkij commented 4 years ago

Yes, it works perfectly. I only have one minor request before merging; see the pull request. Thanks!

Kliaxe commented 4 years ago

Finally had the time to make the small changes. Pull request should be updated now. Btw, I'm really glad I could contribute to the engine!

turanszkij commented 4 years ago

Merged in #81. Thanks for contributing! I will leave this open for a while. If I or someone else finds a way to optimize it or compare it with a hierarchical depth approach, I'll post here.

Kliaxe commented 4 years ago

I've been thinking of some further extensions to the SSR, because it's pretty bare-bones right now. I've been reading about an implementation from "GPU Pro 5", where they were doing cone tracing, so the rays fade out:

image

Then there is Stochastic SSR, which I think could also be fun to experiment with. It uses microfacet BRDFs, and could be compared nicely to the current PBR system. What are your thoughts on this? Is it something worth implementing?

turanszkij commented 4 years ago

There is one problem: the SSR computation doesn't have access to the roughness right now. The roughness could be read easily in Deferred, where it is in the gbuffer. But in Forward, we don't have it at the moment. This would need to be handled somehow. Right now both deferred and forward are using a simplified blur, where instead of selecting a lower mip in the tracing step, we select a lower mip in the sampling step when we sample the SSR texture. But feel free to experiment, or discuss ideas.

Kliaxe commented 4 years ago

Yeah, you're completely right! That was another question I had in mind. With access to roughness, for example, we could make even more optimizations to the SSR. One idea that came to me was to limit the features to the deferred options only. However, that would destroy the concept of the renderpaths :/

turanszkij commented 4 years ago

You could implement the improved SSR for the deferred paths for starters. Then later it could be done for the forward paths too.

Kliaxe commented 4 years ago

A very quick question: if you were to load an external texture to use for a post-processing effect, how would you do it? Where would you initialize it? I just want to make sure I don't make a disorganised mess. A bit off-topic, hope that's okay.

turanszkij commented 4 years ago

You could load it in RenderPath3D::Load() or RenderPath3D::ResizeBuffers() for example while you are prototyping. The wiResourceManager can load textures easily.

But one of the design goals of the engine is not having to load any external assets (exception: shaders, and default font file). So in the final version, this should be avoided. What kind of texture do you want? If it is something that can be generated procedurally, the wiTextureHelper should be used to generate it on the CPU, or do it in a shader on the GPU.

Kliaxe commented 4 years ago

My first thought was to use a blue noise texture for some sampling, which could in fact be generated on the CPU. But I think it's better to go with a GPU based noise instead, in this case. If you're curious I'm following EA's presentation of their Stochastic SSR: https://www.ea.com/frostbite/news/stochastic-screen-space-reflections

turanszkij commented 4 years ago

The blue noise sounds interesting, I've never used it. Feel free to use a texture if you have one handy while you experiment. In a final solution, there are multiple ways to create a blue noise texture: generate it on the CPU or GPU, or use a static array of noise values and create a texture from that (precompiled texture). I know that presentation, I think that might be the best SSR implementation I've ever seen. I'm curious if you can get it to work. :)

Kliaxe commented 4 years ago

Hi again!

I finally implemented the stochastic SSR into the engine! After tons of testing, tweaks and improvements, I think I got something that looks stunning!:

image image image

turanszkij commented 4 years ago

Wow, that looks amazing! Is it implemented for deferred render path?

Kliaxe commented 4 years ago

Unfortunately yes, it's only for deferred options :( I'll work on a pull request tomorrow.

Performance-wise, you mentioned it was a bit expensive, so I tried to optimize it a lot this time. The performance became much better, but there is still a missing piece. I saw in their presentation that they used a tile optimization based on a roughness threshold. Unfortunately my understanding of compute shaders is a bit vague at the moment, so I couldn't get myself to do it yet. If you want, I can make a cheap version of the raycast using a fast raymarch, and then you could perhaps do a tile classification like you did with the motion blur and DOF if necessary :)
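A tile classification based on a roughness threshold could look roughly like the sketch below. It is a CPU illustration of the general idea only: the presentation's exact scheme is not reproduced here, and `TileClass`, `classifyTiles`, the tile size and the threshold are all hypothetical. The assumption is that a tile needs the expensive trace if any pixel in it is smooth enough, and can take a cheaper path otherwise.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

enum class TileClass { NeedsTrace, CheapPath };

// roughness is a row-major width x height buffer of per-pixel roughness.
// A tile is marked NeedsTrace when its smoothest pixel is at or below the threshold.
std::vector<TileClass> classifyTiles(const std::vector<float>& roughness,
                                     int width, int height,
                                     int tileSize, float roughnessThreshold) {
    assert(width > 0 && height > 0 && tileSize > 0);
    assert(roughness.size() == static_cast<std::size_t>(width) * static_cast<std::size_t>(height));

    // Round up so partially covered tiles at the right and bottom edges are included.
    const int tilesX = (width + tileSize - 1) / tileSize;
    const int tilesY = (height + tileSize - 1) / tileSize;
    std::vector<TileClass> tiles(static_cast<std::size_t>(tilesX) * static_cast<std::size_t>(tilesY));

    for (int ty = 0; ty < tilesY; ++ty) {
        for (int tx = 0; tx < tilesX; ++tx) {
            const int xEnd = std::min((tx + 1) * tileSize, width);
            const int yEnd = std::min((ty + 1) * tileSize, height);
            float minRoughness = std::numeric_limits<float>::max();
            for (int y = ty * tileSize; y < yEnd; ++y) {
                for (int x = tx * tileSize; x < xEnd; ++x) {
                    minRoughness = std::min(minRoughness, roughness[static_cast<std::size_t>(y) * width + x]);
                }
            }
            tiles[static_cast<std::size_t>(ty) * tilesX + tx] =
                (minRoughness <= roughnessThreshold) ? TileClass::NeedsTrace : TileClass::CheapPath;
        }
    }
    return tiles;
}
```

On the GPU the same reduction would typically be done per thread group in a compute shader, with the resulting tile lists driving separate dispatches for the expensive and cheap paths.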

turanszkij commented 4 years ago

No worries, I'll definitely upgrade the forward paths to support exporting the roughness. The roughness might be moved to the red channel in the gbuffer, to let the forward only export one channel. But I'll wait for the pull request first. The tile optimization will probably come later. Good job!

turanszkij commented 4 years ago

Didn't mean to close this. Just pushed a little fix: https://github.com/turanszkij/WickedEngine/commit/ef3f45993a3256d8ba4a592141d6c5a26aae7968 (dispatch dimension was incorrect). I will still upgrade the forward paths to support the new ssr.
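The linked commit is not reproduced here, so what exactly was wrong with the dispatch dimension is not stated in this thread. As general background, a common way to compute the number of thread groups for a compute pass is a rounded-up division of the target size by the group size. The sketch below shows that pattern with a hypothetical helper name.

```cpp
#include <cassert>

// Number of thread groups needed to cover pixelCount pixels along one axis
// when each group processes threadsPerGroup pixels along that axis.
unsigned dispatchGroupCount(unsigned pixelCount, unsigned threadsPerGroup) {
    assert(threadsPerGroup > 0);
    return (pixelCount + threadsPerGroup - 1) / threadsPerGroup;
}
```

For a 1920x1080 target with 8x8 groups this gives 240 by 135 groups. When the size is not a multiple of the group size, the last group is only partially covered, so the shader has to ignore threads that fall outside the texture.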

Kliaxe commented 4 years ago

That's fine :) I noticed a bug in the code, and fixing it revealed that the combine pass for the new SSR is completely unnecessary. When I saw their presentation, they applied a preintegrated BRDF LUT to the effect, and I thought I could simply use a fresnel effect for now; it turns out I was completely wrong xD. I'm going to make a small pull request with some minor improvements, and I think I'll remove the combine pass too. Since I was using metalness and reflectance in that pass, would it make supporting the forward paths more challenging? Because then there is an even better reason to remove the pass.

turanszkij commented 4 years ago

Actually, removing the combine pass would make it easier to support in Forward. SSR is combined with other kinds of reflections in the object shader for forward, in the environmentalLightPS in deferred, and in tiledLightCulling for tiled deferred, and that's where fresnel is being applied.
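The thread does not say which fresnel formulation the engine uses. As an illustration of the kind of term that gets applied where reflections are combined, here is the widely used Schlick approximation. The function name and parameters are assumptions for this sketch, and only a scalar reflectance is shown, whereas a real shader would use an RGB value.

```cpp
#include <cassert>
#include <cmath>

// Schlick's approximation: f0 is the reflectance at normal incidence,
// cosTheta is the cosine of the angle between the view direction and the normal.
float fresnelSchlick(float f0, float cosTheta) {
    assert(cosTheta >= 0.0f && cosTheta <= 1.0f);
    const float m = 1.0f - cosTheta;
    const float m5 = m * m * m * m * m;
    return f0 + (1.0f - f0) * m5;
}
```

Looking straight at a surface returns f0 itself, and the result rises towards 1 at grazing angles, which is why reflections get stronger near the horizon.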

turanszkij commented 4 years ago

Now it works for forward paths too, removed old SSR, and removed mipchain generation from scene texture (actually it was already done before for refractive objects, so ssr reuses that).

Kliaxe commented 4 years ago

While looking at the awesome changes you've made to the engine, I noticed the SSR lost some reflections at the end, as if it had been pushed upwards (not significantly though). To fix this we could bump the scene copy texture, which is used as input in the SSR, to full-res (it looks better on smooth surfaces though). I'm not sure if you want to do this, or conserve performance by keeping it at half-res, the decision is yours :)

turanszkij commented 4 years ago

I've turned the scene mipchain on by default (previously it was only enabled as an option and used by refraction), but as you pointed out in your comments, it is quite heavy performance-wise. I've optimized the gaussian mipchain generator further, but it is still much faster at half res, with no very noticeable quality loss in either reflections or refractions. Do you think that, instead of bumping the mipchain to full-res, the falloff for the mip selection could be modified so that it prefers higher-res mip levels?

Kliaxe commented 4 years ago

Alright, I found out why the reflections were behaving like that. It had to do with the mip that was applied to the input texture in the resolve pass. The mip was calculated like so: clamp(log2(intersectionCircleRadius * max(xPPResolution.x, xPPResolution.y)), 0.0, maxMipLevel) but xPPResolution was the resolve texture resolution, not the input resolution, which caused the mip to be incorrect. Fixing that should correct the blurring angle of the reflections.
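To see why the choice of resolution matters, the quoted expression can be mirrored on the CPU. This is a direct transcription of the formula from the comment into a hypothetical C++ helper; the explicit resolution parameters stand in for xPPResolution, and the guard for a zero radius is an addition for this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Mip level for a blur footprint of the given radius (as a fraction of the
// texture's largest side), clamped to the available mip range.
float selectResolveMip(float intersectionCircleRadius,
                       float resolutionX, float resolutionY,
                       float maxMipLevel) {
    assert(maxMipLevel >= 0.0f);
    const float footprintInTexels = intersectionCircleRadius * std::max(resolutionX, resolutionY);
    if (footprintInTexels <= 0.0f) {
        return 0.0f;  // avoid log2(0)
    }
    return std::min(std::max(std::log2(footprintInTexels), 0.0f), maxMipLevel);
}
```

With a radius of 0.0625, a 1024-wide texture gives mip 6 while a 512-wide texture gives mip 5. So whenever the resolve target and the sampled input differ in size by a factor of two, plugging in the wrong one shifts the selected mip by a full level.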

turanszkij commented 4 years ago

Good point. Also, the mip count seems to be hard-coded to 11. It would be good to upload the mip count and the max of the input resolution as constant buffer params. You can leave it to me if you want, or do a pull request.
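Instead of a hard-coded value, the mip count of a full chain can be derived from the input resolution before it is uploaded. A minimal sketch, with a hypothetical helper name and assuming the chain goes all the way down to 1x1:

```cpp
#include <algorithm>
#include <cassert>

// Number of levels in a full mip chain: halve the largest side until it reaches 1.
unsigned fullMipCount(unsigned width, unsigned height) {
    assert(width > 0 && height > 0);
    unsigned largestSide = std::max(width, height);
    unsigned count = 1;
    while (largestSide > 1) {
        largestSide /= 2;
        ++count;
    }
    return count;
}
```

A full chain has 11 levels whenever the largest side is between 1024 and 2047 texels, so a constant of 11 is right for a 1920x1080 input but one too many for a half-res 960x540 input, which has 10.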

Kliaxe commented 4 years ago

Yes, I was going to say that about the hard-coded mip! You can just do the updates, the changes are so subtle :)

turanszkij commented 4 years ago

I think this can be closed. Overall I'm very happy with this effect now, thanks again for the contribution!