diasurgical / devilutionX

Diablo build for modern operating systems

Bake lighting into dungeon CELs at load time #6631

Closed glebm closed 1 year ago

glebm commented 1 year ago

Original Diablo pre-renders up to 128 frames / 1 MiB of dungeon CELs with baked lighting on load.

In the original, this cache is built by MakeSpeedCels. The baked-lighting CELs are stored in pSpeedCels and are indexed via SpeedFrameTbl. CELs that appear more often on the map have priority when baking (SortTiles). For the baked CELs, the 0x8000 flag is set (level_cel_block |= 0x8000).
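A minimal sketch of how the 0x8000 flag could be set and tested. Only the flag value comes from the original; the helper names and the 0x0FFF index mask are assumptions made for illustration and are not taken from the original code.

```cpp
#include <cstdint>

// Flag value mentioned above: marks a level_cel_block as referring to a baked CEL.
constexpr uint16_t BakedFlag = 0x8000;
// Assumption for this sketch: the low 12 bits carry the index of the baked frame.
constexpr uint16_t BakedIndexMask = 0x0FFF;

// Hypothetical helper: true if the block refers to a baked-lighting CEL.
constexpr bool IsBaked(uint16_t levelCelBlock)
{
	return (levelCelBlock & BakedFlag) != 0;
}

// Hypothetical helper: equivalent of `level_cel_block |= 0x8000`.
constexpr uint16_t MarkBaked(uint16_t levelCelBlock)
{
	return static_cast<uint16_t>(levelCelBlock | BakedFlag);
}

// Hypothetical helper: extracts the index bits under the mask assumed above.
constexpr uint16_t BakedIndex(uint16_t levelCelBlock)
{
	return static_cast<uint16_t>(levelCelBlock & BakedIndexMask);
}
```

The renderer would check IsBaked first and fall back to the regular lit-at-draw-time path when the flag is not set.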

Baked lighting cache size should probably be a compile-time option (completely disabled when 0).
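One possible shape for the compile-time option, as a rough sketch. The macro name BAKED_LIGHTING_CACHE_BYTES, its 1 MiB default (mirroring the original's limit), and the function name are all hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical build-time knob, e.g. -DBAKED_LIGHTING_CACHE_BYTES=0 to disable.
#ifndef BAKED_LIGHTING_CACHE_BYTES
#define BAKED_LIGHTING_CACHE_BYTES (1024 * 1024)
#endif

constexpr size_t BakedLightingCacheBytes = BAKED_LIGHTING_CACHE_BYTES;
constexpr bool BakedLightingEnabled = BakedLightingCacheBytes != 0;

// Reserves storage for the baked CELs, or returns an empty buffer when disabled.
std::vector<uint8_t> AllocateBakedLightingCache()
{
	std::vector<uint8_t> cache;
	// Constant condition: the compiler can drop the reserve call when the size is 0.
	if (BakedLightingEnabled)
		cache.reserve(BakedLightingCacheBytes);
	return cache;
}
```

With the size set to 0, callers can guard the baking pass on BakedLightingEnabled so that none of it ends up in the binary.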

We can do better than the original in a few aspects:

  1. SortTiles was a bubble sort (4 million iterations); we can just use std::sort or std::priority_queue.
  2. Rather than limiting on both RAM and 128 frames, we can limit on RAM alone and allocate SpeedFrameTbl dynamically.
  3. We can avoid storing SpeedFrameTbl entries for fully lit and fully dark tiles.
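Points 1 and 2 could look roughly like the sketch below: sort frames by usage with std::sort, then take them greedily until a byte budget runs out, with no fixed frame count. TileUsage, its fields, and SelectFramesToBake are hypothetical names, and the per-frame byte cost is assumed to be known up front.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical record describing one dungeon CEL frame.
struct TileUsage {
	uint16_t frame;    // CEL frame number
	uint32_t count;    // how many times the frame appears on the map
	size_t bakedBytes; // bytes needed to store the baked copies of this frame
};

// Picks frames to bake: most frequently used first, limited only by the byte budget.
std::vector<uint16_t> SelectFramesToBake(std::vector<TileUsage> usage, size_t budgetBytes)
{
	std::sort(usage.begin(), usage.end(), [](const TileUsage &a, const TileUsage &b) {
		if (a.count != b.count)
			return a.count > b.count; // higher usage first
		return a.frame < b.frame;     // deterministic tie-break
	});

	std::vector<uint16_t> selected;
	size_t used = 0;
	for (const TileUsage &tile : usage) {
		if (tile.count == 0)
			break; // sorted by count, so every remaining frame is unused
		if (tile.bakedBytes > budgetBytes - used)
			continue; // does not fit; a smaller frame later on still might
		used += tile.bakedBytes;
		selected.push_back(tile.frame);
	}
	return selected;
}
```

Point 3 would fit in as a filter before the sort: frames that are only ever drawn fully lit or fully dark would simply not be added to the usage list.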

These optimizations were removed from DevilutionX in https://github.com/diasurgical/devilutionX/pull/361. This may explain the performance difference between DevilutionX and the original on older CPUs (such as the Pentium running Windows 98 that we've been measuring). When the optimizations were removed in DevX, the FPS only dropped from 90 to 76 on @AJenbo's Raspberry Pi (https://github.com/diasurgical/devilutionX/pull/361#issuecomment-544253749), but the impact could be much larger on a Pentium.

AJenbo commented 1 year ago

Performance has improved over time as well:

DevilutionX 1.0.0: 830fps
DevilutionX 1.1.0: 102fps
DevilutionX 1.2.0: 770fps
DevilutionX 1.3.0: 920fps
DevilutionX 1.4.0: 920fps
DevilutionX 1.4.1: 1230fps
DevilutionX 1.5.0: 1320fps
DevilutionX 1.5.1: 1320fps
DevilutionX 1.6.0-dev: 1320fps

Of course, there are a lot of variables between each version, so it's hard to say exactly what the reason for the differences is, but at least they are going in the right direction :)

Also, these numbers are not comparable with the other number I have posted, as this is a different setup: a different level, 1920x480, Linux, windows.

#361 got merged for 1.0.0.

glebm commented 1 year ago

Baked lighting resulted in a performance drop, likely because the baked-lighting data doesn't fit in the CPU cache. Closing.