MrAlaux / Nugget-Doom

Nugget Doom is a fork of Woof! with additional features.
GNU General Public License v2.0

(Actually merged) Support non-power-of-2 resolution multipliers #77

Closed MrAlaux closed 9 months ago

MrAlaux commented 10 months ago

No plans to merge this yet, just want to generate some autobuilds.

liPillON commented 10 months ago

oh this is gonna be good :)

FYI, when I asked for 3x feasibility in Woof a while ago, Fabian warned about possible issues

MrAlaux commented 10 months ago

> FYI, when I asked for 3x feasibility in Woof a while ago, Fabian warned about possible issues

Well, I was possibly going to ask you to try this out later, but if you'd like to do so now and check whether stuff like that SSG issue is present, feel free.

liPillON commented 10 months ago

I tested (quickly) various permutations of the resolution multiplier, smooth pixel scaling, view/weapon bobbing, blocky fuzz, swirling liquids.

TBH I did not notice any issue with sprite rendering. The only thing that I would consider a rendering bug is this thin line:

![nugg0000](https://github.com/MrAlaux/Nugget-Doom/assets/48625273/511d892c-f07c-4cb7-975a-9fa649f9175f)

It shows up only with odd multipliers and, of course, is noticeable only with freelook enabled and "short skies stretching" disabled, so this might not be that big of a deal.

MrAlaux commented 10 months ago

Sounds like #60.

MrAlaux commented 10 months ago

By the way, could you run some performance tests to compare between master and this branch? For this to work, many bit shifts had to be changed to multiplications or divisions, which in theory aren't as fast, but from a quick test it doesn't seem to have made a difference.
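
As a rough illustration of the change described above (not the actual Nugget Doom code, which is C), the sketch below contrasts shift-based scaling, which can only express power-of-2 multipliers, with multiplication and division, which work for any integer multiplier. The function names are hypothetical; the base height of 200 rows is the original Doom framebuffer height. Note that in C, right-shifting a negative value is implementation-defined and typically rounds toward negative infinity, while `/` truncates toward zero, so such a replacement may need care wherever negative coordinates occur.

```python
BASE_HEIGHT = 200  # original Doom framebuffer height (320x200)

def scale_by_shift(value, shift):
    """Power-of-2 scaling: multiplies by 2**shift using a left shift."""
    return value << shift

def scale_by_multiplier(value, mult):
    """General scaling: works for any positive integer multiplier."""
    return value * mult

def unscale_by_multiplier(value, mult):
    """Inverse mapping: integer division takes the place of a right shift."""
    return value // mult

# 2x, 4x and 8x can be expressed either way.
for shift in (1, 2, 3):
    assert scale_by_shift(BASE_HEIGHT, shift) == scale_by_multiplier(BASE_HEIGHT, 1 << shift)

# 3x, 5x, 6x, 7x and 9x have no shift equivalent, so they need a multiplication.
heights = {mult: scale_by_multiplier(BASE_HEIGHT, mult) for mult in range(2, 10)}
print(heights)
```

The resulting heights (400, 600, ..., 1800) correspond to the "400p" through "1800p" labels used in the benchmarks later in this thread.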

liPillON commented 10 months ago

> Sounds like #60.

I guess? I'm not using custom fov values, though... And the line is not there when using 2x/4x/6x/8x renderer multipliers.

> By the way, could you run some performance tests to compare between master and this branch? For this to work, many bit shifts had to be changed to multiplications or divisions, which in theory aren't as fast, but from a quick test it doesn't seem to have made a difference.

I'll see what I can do in the next few days.. I'll probably compare the latest master build vs this PR's artifacts, using resolutions available in both (400p, 800p, 1600p)

MrAlaux commented 10 months ago

For the record, I just took a look at weapon sprites with odd multipliers, and I noticed that they're slightly off. Not to the point of out-of-place strips of sprites being drawn, but the base sprite is aligned a bit differently than with even multipliers and the flash sprite is just a bit misaligned.

liPillON commented 10 months ago

The first results are in! Interestingly, the higher the resolution, the smaller the performance gap between builds is.

EVITERNITY

https://dsdarchive.com/files/demos/eviternity/44536/evit26-658.zip

| resolution | master (fps) | pr77 (fps) |
|---|---|---|
| 400p | 315.5 | 309.3 |
| 600p | n/a | 238.1 |
| 800p | 168.5 | 167.0 |
| 1000p | n/a | 137.3 |
| 1200p | n/a | 104.2 |
| 1600p | 54.5 | 54.0 |
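
To put numbers on the gap between builds, here is a small sketch that computes the absolute and relative difference for the three resolutions available in both builds. The values are copied from the table above; the helper name `gap` is only illustrative.

```python
# (resolution, master fps, pr77 fps) for the rows that have both builds
rows = [
    ("400p", 315.5, 309.3),
    ("800p", 168.5, 167.0),
    ("1600p", 54.5, 54.0),
]

def gap(master_fps, branch_fps):
    """Return (absolute gap in fps, relative gap in percent of master)."""
    absolute = master_fps - branch_fps
    relative = 100.0 * absolute / master_fps
    return absolute, relative

gaps = {res: gap(master, branch) for res, master, branch in rows}

for res, (absolute, relative) in gaps.items():
    print(f"{res}: {absolute:.1f} fps slower ({relative:.2f}% of master)")
```

With these figures the absolute gap shrinks from about 6.2 fps at 400p to 0.5 fps at 1600p, while the relative gap stays between roughly 0.9% and 2% of the master score.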

Test methodology:

Laptop specs:

liPillON commented 10 months ago

Out of curiosity, I also benchmarked DSDA 0.27.3 using the same demo. At 1280x800, the software renderer scores 199fps and the opengl renderer scores 88fps.

MrAlaux commented 10 months ago

> The first results are in!

Well, that's an unwanted performance loss, but I'd hope that the finer control over resolution justifies it. Think you could try some of the new settings? Particularly 3X and 5X.

> Out of curiosity, I also benchmarked DSDA 0.27.3 using the same demo. At 1280x800, the software renderer scores 199fps and the opengl renderer scores 88fps.

DSDA's Software turned out faster than Hardware? Is that right?

liPillON commented 10 months ago

> Well, that's an unwanted performance loss, but I'd hope that the finer control over resolution justifies it. Think you could try some of the new settings? Particularly 3X and 5X.

Done, table updated above.

> DSDA's Software turned out faster than Hardware? Is that right?

Yeah, confirmed. On this map and with my hardware, at least. I guess it has to do with geometry culling? The starting position in MAP26 in particular lets the player see pretty far into the distance, and there the performance gap between software and opengl is really noticeable. It also happens in GZDoom (I use the GLES backend on this hardware), with the software and hardware renderers scoring pretty similarly to DSDA.

liPillON commented 10 months ago

I ran another round of tests, using a different demo. Note that I've used the same build as before, so the most recent commits in this PR do not affect the results.

DEMONASTERY

https://dsdarchive.com/files/demos/demonastery/48438/demonastery-1326.zip

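| resolution | master (fps) | pr77 (fps) | dsda sw (fps) | dsda hw (fps) |
|---|---|---|---|---|
| 400p | 482.2 | 481.2 | 517.0 | 89.3 |
| 800p | 194.9 | 192.9 | 199.1 | 89.3 |
| 1600p | 60.1 | 56.6 | 89.8 | 89.3 |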

The results are in line with previous benchmarks.

I included dsda opengl results just for reference, but I won't run any more timedemos on it. It's clear that something fishy is going on, and this is not the place to investigate it.

liPillON commented 10 months ago

ALIEN VENDETTA

https://dsdarchive.com/files/demos/av/28588/av20-1357.zip

| resolution | master (fps) | pr77 (fps) | dsda sw (fps) |
|---|---|---|---|
| 400p | 729.4 | 693.4 | 821.1 |
| 800p | 213.9 | 210.6 | 312.8 |
| 1600p | 56.3 | 56.3 | 97.1 |

This will be the last round of timedemos, as I won't have much spare time in the coming days.

MrAlaux commented 10 months ago

@liPillON, think you could run some tests on this latest commit whenever you have some time to spare? Don't worry, I understand that you're busy ATM; take your time.

liPillON commented 10 months ago

Here we go... Once again, I tested only common resolutions between builds. Let me know if you are interested in any particular multiplier available thanks to this PR.

TESTED BUILDS

ALIEN VENDETTA

https://dsdarchive.com/files/demos/av/28588/av20-1357.zip

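| multiplier | v.res | master (fps) | pr77 (fps) |
|---|---|---|---|
| 2x | 400p | 580.5 | 580.6 |
| 3x | 600p | n/a | 365.6 |
| 4x | 800p | 227.3 | 225.9 |
| 5x | 1000p | n/a | 170.5 |
| 6x | 1200p | n/a | 120.7 |
| 7x | 1400p | n/a | 98.8 |
| 8x | 1600p | 59.9 | 59.8 |
| 9x | 1800p | n/a | 62.0 |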

DEMONASTERY

https://dsdarchive.com/files/demos/demonastery/48438/demonastery-1326.zip

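| multiplier | v.res | master (fps) | pr77 (fps) |
|---|---|---|---|
| 2x | 400p | 444.1 | 437.4 |
| 3x | 600p | n/a | 312.7 |
| 4x | 800p | 205.4 | 202.1 |
| 5x | 1000p | n/a | 159.2 |
| 6x | 1200p | n/a | 119.3 |
| 7x | 1400p | n/a | 96.3 |
| 8x | 1600p | 60.2 | 60.1 |
| 9x | 1800p | n/a | 62.4 |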

EVITERNITY

https://dsdarchive.com/files/demos/eviternity/44536/evit26-658.zip

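| multiplier | v.res | master (fps) | pr77 (fps) |
|---|---|---|---|
| 2x | 400p | 317.7 | 318.5 |
| 3x | 600p | n/a | 245.1 |
| 4x | 800p | 170.6 | 169.4 |
| 5x | 1000p | n/a | 138.1 |
| 6x | 1200p | n/a | 104.8 |
| 7x | 1400p | n/a | 87.0 |
| 8x | 1600p | 55.9 | 55.5 |
| 9x | 1800p | n/a | 57.5 |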

liPillON commented 10 months ago

I ended up leaving my machine running through the night, benchmarking all the remaining resolutions. The tables in the previous comment have been updated. I can't explain why 9x performed better than 8x.. Thermals maybe?

MrAlaux commented 10 months ago

What caught my eye were some inconsistencies between master branch benchmarks from the batches before and after 9X resolution support. The biggest discrepancy I can observe is that the Alien Vendetta 400p benchmark reported 729.4 FPS in the first batch and 580.5 FPS in the later batch, but the results vary and are not always in favor of the first batch. In any case, they're still just tests run on master.

Anyways, all batches combined seem to indicate an overall minimal decrease in performance between master and this branch. It's an excusable sacrifice IMO, so I guess it's just a matter of testing the branch a bit more to ensure it didn't introduce any more bugs before I merge it.

liPillON commented 10 months ago

Yeah, sorry, I forgot to mention that the previous 400p tests were performed with widescreen OFF to make the results comparable with dsda (where resolutions below 480p are rendered with a stretched 4:3 aspect ratio).

So we should discard the older 400p results, since the engine had to render fewer pixels.

liPillON commented 10 months ago

In general I noticed that the lower the resolution and the "simpler" the map geometry is, the wider the gap between worst and best scores becomes (e.g. the latest AV.WAD benchmarks at 400p for the master branch fluctuated between 547.7 and 644.5 fps).

MrAlaux commented 10 months ago

> In general I noticed that, the lower the resolution and the "simpler" the map geometry is, the wider the gap between worst/best scores becomes (eg: the latest AV.WAD at 400p benchmarks for the master branch fluctuated between 547.7 and 644.5 fps)

So, as the game requires less and less processing power, it's more likely that its performance will skyrocket occasionally?

liPillON commented 10 months ago

Yeah this is what I observed on my machine.

MrAlaux commented 10 months ago

Alright, got it.

Anyways, thanks for the benchmarks!

liPillON commented 10 months ago

> In general I noticed that, the lower the resolution and the "simpler" the map geometry is, the wider the gap between worst/best scores becomes (eg: the latest AV.WAD at 400p benchmarks for the master branch fluctuated between 547.7 and 644.5 fps)

> So, as the game requires less and less processing power, it's more likely that its performance will skyrocket occasionally?

For reference, here's the full dataset for the master branch's results:

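| 400p | avg | run1 | run2 | run3 | run4 | run5 |
|---|---|---|---|---|---|---|
| AV | 580.5 | 547.7 | 644.5 | 562.4 | 599.8 | 579.2 |
| DMNSTRY | 444.1 | 450.5 | 447.5 | 442.9 | 441.8 | 434.3 |
| EVTRNTY | 317.7 | 321.1 | 318.2 | 317.2 | 317.6 | 315.1 |

| 800p | avg | run1 | run2 | run3 | run4 | run5 |
|---|---|---|---|---|---|---|
| AV | 227.3 | 226.6 | 226.7 | 227.1 | 229.3 | 228.1 |
| DMNSTRY | 205.4 | 204.7 | 206.8 | 205.0 | 206.4 | 203.9 |
| EVTRNTY | 170.6 | 170.8 | 170.6 | 170.3 | 171.6 | 168.4 |

| 1600p | avg | run1 | run2 | run3 | run4 | run5 |
|---|---|---|---|---|---|---|
| AV | 59.9 | 60.4 | 59.2 | 59.9 | 60.8 | 59.5 |
| DMNSTRY | 60.2 | 60.3 | 59.9 | 60.8 | 60.3 | 60.0 |
| EVTRNTY | 55.9 | 56.1 | 56.2 | 55.5 | 56.0 | 55.4 |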
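
The run-to-run fluctuation can be quantified from this dataset. The sketch below uses the 400p rows and computes, per demo, a trimmed mean (dropping the best and worst run) and the spread between the worst and best run. The trimmed mean is an assumption about how the "avg" column was produced: the thread does not state the method, but the avg values happen to match it, whereas a plain mean of all five runs does not. Function names are illustrative.

```python
# Five timedemo runs per demo at 400p on master, in fps
runs_400p = {
    "AV": [547.7, 644.5, 562.4, 599.8, 579.2],
    "DMNSTRY": [450.5, 447.5, 442.9, 441.8, 434.3],
    "EVTRNTY": [321.1, 318.2, 317.2, 317.6, 315.1],
}

def trimmed_mean(values):
    """Mean of the runs left after discarding the single lowest and highest."""
    middle = sorted(values)[1:-1]
    return sum(middle) / len(middle)

def spread_percent(values):
    """Gap between best and worst run, as a percentage of the trimmed mean."""
    return 100.0 * (max(values) - min(values)) / trimmed_mean(values)

summary = {
    demo: (round(trimmed_mean(runs), 1), round(spread_percent(runs), 1))
    for demo, runs in runs_400p.items()
}

for demo, (avg, spread) in summary.items():
    print(f"{demo}: avg {avg} fps, spread {spread}%")
```

On these numbers the spread is roughly 17% for AV but under 4% for the other two demos, which is consistent with the observation that the lightest workload fluctuates the most.
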

MrAlaux commented 9 months ago

Merged from GitHub Desktop: https://github.com/MrAlaux/Nugget-Doom/commit/1e16a58b7df287cdf825e4432c882e87fd7f775d