Try / OpenGothic

Reimplementation of Gothic 2 Notr
MIT License

Virtual shadowmap #681

Open Try opened 2 months ago

Try commented 2 months ago

In general, I'm looking for a new, more sophisticated solution to replace the projective shadowmap.

Goals

Known solutions

Not a lot actually...
Nanite (page 119): https://advances.realtimerendering.com/s2021/Karis_Nanite_SIGGRAPH_Advances_2021_final.pdf
Assassin's Creed (page 55): https://advances.realtimerendering.com/s2015/aaltonenhaar_siggraph2015_combined_final_footer_220dpi.pdf
https://ktstephano.github.io/rendering/stratusgfx/svsm
https://www.cse.chalmers.se/~uffe/ClusteredWithShadows.pdf
https://www.gamedevs.org/uploads/efficient-shadows-from-many-lights.pdf

not vsm, but close enough in concept: http://lukaskalbertodt.github.io/2023/11/18/tiled-soft-shadow-volumes.html

Very first WIP

[image] [image]

Some page-data to illustrate: [image]

Initial implementation

Decided to start with common parameters for now:

Current considerations:

  1. Cluster culling: culling all clusters against all pages is expensive
     1.1 Coarse culling is not possible: need to output the exact pageId for visible meshlets
     1.2 Output size is not deterministic, and there is no good way to react to out-of-memory
  2. Culling against a clip-map (HiZ-like) is easier than against each page individually
     2.1 Won't be able to use the hw-rasterizer to output data (need to use image-atomics + image-less rendering)
     2.2 Image-less rendering is limited by maxViewportDimensions = 4k
  3. Software renderer?!
     3.1 An immediate one still requires atomics, and won't be better than a render-pass based one
     3.2 Tile-based can be an interesting take, but not valid without bindless
  4. Requires some complementary solution to work with volumetrics
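
To make considerations 1.1 and 1.2 concrete, here is a minimal CPU-side sketch of per-page meshlet culling. It is only an illustration, not the OpenGothic implementation: `PAGE_SIZE`, `Rect`, `cull_meshlets` and the `capacity` parameter are hypothetical names, meshlet bounds are assumed to be already projected into shadow-map texel space, and the fixed `capacity` stands in for a preallocated GPU output buffer.

```python
from dataclasses import dataclass

PAGE_SIZE = 128  # hypothetical page size in texels, chosen only for this sketch


@dataclass(frozen=True)
class Rect:
    """Meshlet bounds in shadow-map texel space; x1/y1 are exclusive."""
    x0: int
    y0: int
    x1: int
    y1: int


def cull_meshlets(meshlets, resident_pages, capacity):
    """Emit one (meshlet_id, page) pair per resident page a meshlet overlaps.

    `capacity` models a fixed-size output buffer. Returns (pairs, overflowed);
    pairs that do not fit are dropped, which is the out-of-memory case.
    """
    pairs = []
    overflowed = False
    for meshlet_id, r in enumerate(meshlets):
        for py in range(r.y0 // PAGE_SIZE, (r.y1 - 1) // PAGE_SIZE + 1):
            for px in range(r.x0 // PAGE_SIZE, (r.x1 - 1) // PAGE_SIZE + 1):
                if (px, py) not in resident_pages:
                    continue  # page was not requested this frame
                if len(pairs) == capacity:
                    overflowed = True  # nowhere to write this pair
                    continue
                pairs.append((meshlet_id, (px, py)))
    return pairs, overflowed


# Made-up input: one small meshlet and one that straddles several pages.
meshlets = [Rect(0, 0, 64, 64), Rect(100, 100, 300, 140)]
resident = {(0, 0), (1, 0), (2, 0), (0, 1), (1, 1)}

pairs, overflowed = cull_meshlets(meshlets, resident, capacity=16)
small_pairs, small_overflowed = cull_meshlets(meshlets, resident, capacity=4)
print(len(pairs), overflowed, len(small_pairs), small_overflowed)
```

The second meshlet is emitted once per overlapped resident page, so the amount of output depends on the scene and on the page layout rather than on the meshlet count alone. That is why the output size cannot be known up front, and why a buffer that is too small silently loses draws.
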
YALdysse commented 2 months ago

@Try, Does this mean OpenGothic will work on weaker graphics cards ?

Try commented 2 months ago

@YALdysse by 'weaker' graphics card you mean weaker than what?)

YALdysse commented 2 months ago

OpenGothic consumes an average of 95% of my graphics card - AMD Radeon RX Vega 7 (CPU - AMD Ryzen 5500U).

YALdysse commented 2 months ago

Unfortunately, we are not talking about stable 60 FPS, but I can play.

Try commented 2 months ago

OK, I would like to avoid setting any expectations, as virtual-shadow is pretty much experimental tech.

AMD Radeon RX Vega 7

With Vega there are 2 major issues:

Try commented 2 months ago

Some numbers [RTX3070]:

Projective shadowmap (current solution):

Virtual shadow

Numbers for one of the smaller pages, close to the camera: [image]

Try commented 2 months ago

clip-distance helps with fragment workload, bringing adequate FPS (still slower than regular SM)

[image]

Try commented 2 months ago

Some rendering examples: [image] [image] [image]

Try commented 2 months ago

City: [image]

Ship: [image]

Try commented 2 months ago

Even with large pages, it does not look good:

vsm.header.pageCount    604
vsm.header.meshletCount 91656 // total to draw for VSM
vsm.header.counterM 10787 // total amount of unique meshlets for VSM, meaning duplication factor is ~x9
vsm.header.counterV  8733 // meshlets that are drawn in case of good-old shadowmap

non-empty shadow mips:  8
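
For reference, the ratios implied by the counters above can be checked with a few lines of arithmetic. The variable names are mine, the values are copied from the dump, and reading `counterM` as "unique meshlets" and `counterV` as "projective shadowmap meshlets" follows the inline comments.

```python
page_count = 604        # vsm.header.pageCount
meshlet_count = 91656   # vsm.header.meshletCount, total meshlets drawn for VSM
counter_m = 10787       # vsm.header.counterM, unique meshlets for VSM
counter_v = 8733        # vsm.header.counterV, meshlets for the regular shadowmap

duplication = meshlet_count / counter_m     # same meshlet drawn into several pages
vs_projective = meshlet_count / counter_v   # VSM workload relative to projective SM
per_page = meshlet_count / page_count       # average meshlets per page

print(f"duplication factor: {duplication:.1f}x")
print(f"relative to projective SM: {vs_projective:.1f}x")
print(f"meshlets per page: {per_page:.0f}")
```

The duplication factor comes out at roughly 8.5x, in line with the "~x9" estimate in the comment, and the total is about 10.5x the projective shadowmap workload.
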

Testing area: [image]

Unrelated, just cool screenshots: [image] [image] [image]

Try commented 2 months ago

Testing is now enabled by command line: -vsm 1

Try commented 1 month ago

some data on culling:

// baseline
vsm.header.pageCount    546
vsm.header.meshletCount 38316
vsm.header.counterM 13753
vsm.header.counterV 8733

// cull dummy tiles in larger page
vsm.header.pageCount    560
vsm.header.meshletCount 34514
vsm.header.counterM 0
vsm.header.counterV 8733

// hiz
vsm.header.pageCount    558
vsm.header.meshletCount 29547
vsm.header.counterM 0
vsm.header.counterV 8733

About 23% reduction in meshlet count. Now runtime is about 1ms for rendering.

vsm.header.counterV 8733

Number of meshlets for the projective shadowmap. So we are still using about 4x the meshlets relative to projective SM.
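
A quick way to compare the three culling stages is to compute, for each one, the reduction against the baseline and the ratio to the projective shadowmap count. This is only arithmetic over the `meshletCount` and `counterV` values posted above; the dictionary and variable names are mine.

```python
PROJECTIVE = 8733  # vsm.header.counterV, identical in all three dumps

# vsm.header.meshletCount after each culling stage
stages = {
    "baseline": 38316,
    "cull dummy tiles": 34514,
    "hiz": 29547,
}

baseline = stages["baseline"]
reduction = {}  # percent fewer meshlets than the baseline
ratio = {}      # meshlets relative to the projective shadowmap

for name, count in stages.items():
    reduction[name] = 100.0 * (baseline - count) / baseline
    ratio[name] = count / PROJECTIVE
    print(f"{name:18s} {count:6d}  -{reduction[name]:4.1f}%  {ratio[name]:.2f}x")
```

With these numbers the HiZ stage lands at roughly 23% below the baseline, and the ratio to the projective shadowmap moves from about 4.4x to about 3.4x.
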

Try commented 1 month ago

fog + vsm doesn't quite work (non-resident pages): [image]

Try commented 1 month ago

experimenting with vsm-fog: [image]

Try commented 1 month ago

A frame case study on draw amount. (Note: I prefer to measure the number of meshlets instead of draw time, as it doesn't depend on GPU model, temperature, etc.)

projectiveSM: 8.7k
virtualSM: 20.6k

Now breakdown per clipmap:

mip:  all, land, obj
 0  1041   590   455
 1  1516   784   732
 2  1263   603   660
 -
 3  2152   996  1156
 4  2516  1238  1282
 5  4731  1016  3715
 6  3900   865  3035
 -
 7  3148   740  2408
 8   156    82    78

[image: chart]

Fog adds roughly another 5k to mip5 (and a bit to others)
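
The per-clipmap table can be summarized with a short script to see where the meshlets go. The rows are copied from the table above (fog not included); the variable names are mine.

```python
# mip: (all, land, obj), copied from the per-clipmap breakdown
rows = {
    0: (1041, 590, 455),
    1: (1516, 784, 732),
    2: (1263, 603, 660),
    3: (2152, 996, 1156),
    4: (2516, 1238, 1282),
    5: (4731, 1016, 3715),
    6: (3900, 865, 3035),
    7: (3148, 740, 2408),
    8: (156, 82, 78),
}

total = sum(r[0] for r in rows.values())
heaviest = max(rows, key=lambda mip: rows[mip][0])
share_mip_5_6 = (rows[5][0] + rows[6][0]) / total   # fraction of all meshlets
obj_share_mip5 = rows[5][2] / rows[5][0]            # objects vs landscape in mip 5

print(f"total: {total}")
print(f"heaviest mip: {heaviest}")
print(f"mips 5+6: {100 * share_mip_5_6:.1f}% of all meshlets")
print(f"objects in mip 5: {100 * obj_share_mip5:.1f}%")
```

The "all" column sums to 20423, slightly below the 20.6k quoted for virtualSM; I have not tried to explain the gap. Mips 5 and 6 together account for roughly 42% of the meshlets, and most of that is objects rather than landscape.
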

Try commented 2 weeks ago

vsm: epipolar fog in progress

fog is back to reasonable timings: 1.51ms -> 0.57ms. However, this comes at the cost of some flicker that needs to be worked on.

Try commented 1 week ago

L'Hiver. Nice details on individual rocks, ~30% gpu load - similar to vanilla game.

[image]