viti95 / FastDoom

Doom port for DOS, optimized to be as fast as possible!

Add uncapped FPS option and interpolate animation state between gametics #176

Closed. dougvj closed this 3 weeks ago

dougvj commented 7 months ago

Changes needed to accomplish this:

This saves the following values between two gametics:

Values are only saved when uncappedFPS is enabled.

Known bugs:

This will allow work to proceed to handle #140 with custom refresh rates that better accommodate modern displays

In addition, in my previous PR, I had intended to include dbgcfg.h which was a file for setting debug output options. I include it here in this PR.

@viti95 Happy to address any questions or issues

dougvj commented 7 months ago

OK I had a few last minute changes, but I think it's in good shape now.

viti95 commented 7 months ago

Looks really promising. It's a big change, so I'll test it in detail and post any additional issues here. I wonder if using a faster timer interrupt will be slower on 386/486 CPUs.

dougvj commented 7 months ago

Yeah needs some testing, the lost cycles could be impactful.

We should probably change the timer interrupt when the option changes since the high resolution is only needed for frame interpolation.

viti95 commented 7 months ago

Here is a quick review, I've found the following issues:

I'll upload some videos to show some of these issues better. I also still have to try this on a 386/486 and see if performance is the same.

dougvj commented 7 months ago

Thanks for the feedback, I will probably work on this on and off this week and will keep you posted.

I admit I totally forgot to check timedemos, and I know the interpolation logic is incomplete there. I will work on that.

There are a couple of issues I don't know how to solve, mostly this one:

Mode Y and VESA Direct modes have tearing issues (2D elements) if framerate > native display refresh rate. This error also happened with the previous fake uncapped framerate method.

This is caused by a frame being presented mid-render, before the 2D elements are drawn. I think the only solution is very complex rendering logic that avoids drawing over the 2D elements of the previous frame. Probably not worth it, but it may be interesting to add an option to cap the framerate at a particular value.

I am also unclear on how the AI works or would be affected by this, since in theory gametics and the position values are the same, but I could also have overlooked something.

dougvj commented 7 months ago

Re:

Demo desyncs (capped and uncapped, this is an important one)

This appears to exist in master: if I run Demo 1 and then start a game, the game is desynced. Is this what you're referring to?

dougvj commented 7 months ago

VSync seems to be working but framerate is not as fluid as it should be

I am getting frame pacing issues when set to 70 Hz. If I force a refresh rate in dosbox-x that matches my monitor's native refresh rate, it looks pretty good. Maybe you're experiencing something similar?

I will try it out on real hardware with a real CRT later and confirm it looks smooth.

viti95 commented 7 months ago

For now, all the testing I've done is natively on my HP Compaq NX7300; it's a Core2 Duo, so a bit overpowered. Desync happens with the -timedemo command line parameter, and also with in-game demos. I have yet to try the in-game benchmark tool. I'll check this in detail; for now I've just compared this branch with master.

The VSync issue also happens on this laptop, maybe because the screen has a 60 Hz refresh rate. For example, FDOOM13H runs silky smooth without VSync at ~300 fps; when VSync is enabled the framerate drops to 60 fps as expected, but there is a lot of frame pacing stutter. The same happens in VESA modes.

As for the Mode Y / Direct mode 2D issues, yep, I also think that is not easy to fix. With VSync enabled the issue disappears, so maybe we can just accept this problem as a technical limitation.

dougvj commented 7 months ago

OK, yeah, I did a deeper dive on this and I think I figured out the issue. In TryRunTics we try to render interpolated frames, but in doing so we overshoot the gametic interval. The next set then only renders, say, one frame and then issues a gametic. These frames are occasionally aliased to the same interpolation interval. This aliasing isn't noticeable at high frame rates.

I rewrote the logic that handles this so that we don't rebase the interpolation weight every gametic and also handle some of the timing in D_Display() (since it knows when the interpolation is overshot). It appears to be smoother for me. Want to check it out?
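A minimal sketch of what an interpolation weight with overshoot handling might look like. This is not the code from this PR: `interp_weight` and `SUBTICS_PER_GAMETIC` are hypothetical names, and the value 16 assumes a 560 Hz timer (the figure mentioned later in this thread) divided by the 35 Hz gametic rate. The weight is a 16.16 fixed-point fraction of the current gametic interval, clamped so that a frame rendered late never extrapolates past the newest game state.

```c
#include <assert.h>
#include <stdint.h>

typedef int32_t fixed_t;             /* 16.16 fixed point, as used in Doom */
#define FRACBITS 16
#define FRACUNIT (1 << FRACBITS)

/* Assumption: 560 Hz timer / 35 Hz gametics = 16 timer ticks per gametic. */
#define SUBTICS_PER_GAMETIC 16u

/* Hypothetical helper: fraction of the current gametic interval that has
 * elapsed, in the range [0, FRACUNIT]. */
static fixed_t interp_weight(uint32_t now_subtics, uint32_t last_gametic_subtics)
{
    /* Unsigned subtraction stays correct when the counter wraps around. */
    uint32_t elapsed = now_subtics - last_gametic_subtics;
    fixed_t weight;

    if (elapsed >= SUBTICS_PER_GAMETIC)
        weight = FRACUNIT;           /* overshot the interval: clamp */
    else
        weight = (fixed_t)((elapsed * FRACUNIT) / SUBTICS_PER_GAMETIC);

    assert(weight >= 0 && weight <= FRACUNIT);
    return weight;
}
```

Clamping is only one way to deal with overshoot; the rewrite described above instead avoids rebasing the weight every gametic, which this sketch does not attempt to reproduce.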

In addition, I also dynamically set the interval timer based on whether uncapped mode is enabled. I don't think the higher resolution timer has much of a performance impact even on older machines, but this way we can say for sure that there isn't going to be one.
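As an illustration of switching the timer rate with the option, here is a hedged sketch of the divisor arithmetic for the PC interval timer (PIT), whose input clock is 1,193,182 Hz. The function names and the 35 Hz / 560 Hz rates are assumptions for the example, not the PR's actual code. On real DOS hardware the divisor would then be written to the PIT (command byte 0x36 to port 0x43, followed by the low and high bytes to port 0x40); that part is left out so the sketch compiles on any host.

```c
#include <assert.h>
#include <stdint.h>

#define PIT_CLOCK_HZ     1193182UL   /* standard PC PIT input clock */
#define RATE_CAPPED_HZ   35u         /* assumption: one interrupt per gametic */
#define RATE_UNCAPPED_HZ 560u        /* assumption: 16 interrupts per gametic */

/* Hypothetical helper: pick the timer rate from the uncapped FPS setting. */
static unsigned timer_rate_hz(int uncapped)
{
    return uncapped ? RATE_UNCAPPED_HZ : RATE_CAPPED_HZ;
}

/* Hypothetical helper: 16-bit PIT reload value for the requested rate,
 * rounded to the nearest integer divisor. */
static uint16_t pit_divisor(unsigned rate_hz)
{
    unsigned long div;

    assert(rate_hz > 0);
    div = (PIT_CLOCK_HZ + rate_hz / 2) / rate_hz;

    if (div == 0)
        div = 1;                     /* rate above the PIT clock: fastest setting */
    if (div > 65535UL)
        return 0;                    /* the PIT treats 0 as 65536 (about 18.2 Hz) */
    return (uint16_t)div;
}
```

With these assumptions the capped mode keeps the interrupt load at 35 per second, and the higher rate is only paid for while interpolation is actually in use.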

I also investigated the timedemo desync, which for some reason I didn't notice when running from the menu. It turns out I can't modify the structs sector_t and mobj_t because their size and layout are fixed for the WADs. I will have to malloc the interpolation values separately.

dougvj commented 7 months ago

OK, actually I was wrong: the size and layout are not fixed, and the structures are loaded separately as I had previously assumed. However, something about the size of these structures modifies the AI behavior, which desyncs the demos.

viti95 commented 7 months ago

Maybe there is some issue in P_CheckSight (p_sight.c). I did some hand-made optimizations there because the compiler did a very bad job optimizing the code (lines 265-269). If the size of sector_t changes, this code stops working properly. Try reverting these lines and check whether it behaves properly.

Also I can test any change you're testing, I will be a bit busy these days but will try to have some spare time for this.

viti95 commented 7 months ago

I've tested the new D_Display code and wow, now it's really smooth. Also VSync problems have been fixed. Good job!

viti95 commented 7 months ago

Also in-game demo playback stuttering seems fixed

dougvj commented 7 months ago

Ah ha! Your hint was very helpful. I had actually thought to grep for hardcoded sizeof(sector_t) values but didn't make the connection with the Div. I decided to go ahead and pad sector_t to 128 bytes for fast division, which does waste some memory. I will leave it up to you to decide whether to keep it or put in a more memory-optimized division.

I was not satisfied with the rendering logic, so of course I spent all evening rewriting it. My big concern was the interaction with the capped mode, so I separated the logic of the two. This means we have two TryRunTics and so forth, but we no longer handle interpolation logic in D_Display. It should be almost the same; can you confirm it still looks smooth to you?

One of the points of confusion for me is the input buffer that is filled by NetUpdate. With my refactor I had trouble with the buffer falling behind the game. For now, I solved it by just polling the input on every gametic; I can look into using the same NetUpdate logic in the future.
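To illustrate why a power-of-two struct size helps here, a small sketch follows. It uses a stand-in type, `demo_sector_t`, with hypothetical fields rather than the real sector_t, and it does not reproduce the hand-optimized code in p_sight.c. The idea is that turning a pointer into an array index requires dividing a byte offset by the struct size, and with a 128-byte struct that division becomes a shift by 7.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SECTOR_SHIFT 7
#define SECTOR_SIZE  (1u << SECTOR_SHIFT)   /* 128 bytes */

/* Stand-in for sector_t: the fields are hypothetical, only the padding
 * technique matters. */
typedef struct {
    int32_t floorheight;
    int32_t ceilingheight;
    int32_t prev_floorheight;               /* extra interpolation state */
    int32_t prev_ceilingheight;
    uint8_t pad[SECTOR_SIZE - 4 * sizeof(int32_t)];
} demo_sector_t;

/* Fail the build if a later field change breaks the size assumption. */
static_assert(sizeof(demo_sector_t) == SECTOR_SIZE,
              "demo_sector_t must stay exactly 128 bytes");

/* Index of s within the array starting at base, using a shift instead of
 * a division by the struct size. */
static size_t sector_index(const demo_sector_t *base, const demo_sector_t *s)
{
    return (size_t)((const char *)s - (const char *)base) >> SECTOR_SHIFT;
}
```

A compile-time size check like the one above is one way to catch the kind of silent breakage described in this thread, where adding fields changed the struct size under code that assumed a fixed value.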

There is a known bug/regression, the start viewz position during the wipe effect is not correct.

I also benchmarked the impact of uncapped vs capped (that is, with the frame interpolation logic enabled) and I got ~175fps with uncapped vs ~178 "capped" with fdoomvbr.

I also found the issue with the elevator/door rendering, the visiplanes needed to have their height values interpolated, so doors and lifts should be smooth now.
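For illustration, interpolating a height between two gametics could look roughly like the following. This is a sketch under assumptions, not the PR's implementation: `lerp_fixed` is a hypothetical helper, heights are taken to be 16.16 fixed-point values as elsewhere in Doom, and the weight is a fraction in [0, FRACUNIT].

```c
#include <assert.h>
#include <stdint.h>

typedef int32_t fixed_t;             /* 16.16 fixed point, as used in Doom */
#define FRACBITS 16
#define FRACUNIT (1 << FRACBITS)

/* Hypothetical helper: blend the value saved at the previous gametic with
 * the current one. weight == 0 gives prev, weight == FRACUNIT gives cur. */
static fixed_t lerp_fixed(fixed_t prev, fixed_t cur, fixed_t weight)
{
    int64_t delta;

    assert(weight >= 0 && weight <= FRACUNIT);

    /* Widen to 64 bits so delta * weight cannot overflow. */
    delta = (int64_t)cur - (int64_t)prev;
    return (fixed_t)(prev + (delta * weight) / FRACUNIT);
}
```

A door moving from height 0 to 8 units would be drawn at 4 units when the frame falls halfway between the two gametics, instead of jumping once per tic.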

viti95 commented 7 months ago

I've tested the latest updates, and I can confirm elevators/lifts are now fine. Framerate is stable and frame pacing seems fine. Demos also play correctly now, which is great. The downside is that mouse movement no longer works properly with uncapped framerate (it feels like movement is skipped). There is another small bug I found: when the level starts, the player is located at Z position 0 for one frame. Damn, it's late and I missed that you already mentioned this in your comment 😅

viti95 commented 7 months ago

About the sector_t size, I don't really know how much memory is allocated for sectors per level. We can increase the memory usage if the game runs faster and remains playable on a 386 with 4 MB of RAM (without crashes). I've been trying to change how memory is allocated, to make better use of the zone memory and use all available memory, but it's not finished yet.

dougvj commented 7 months ago

Thanks again for testing. I will fix the viewz problem. The mouse input issue isn't surprising; I need to rework the input buffer handling to better match the capped case. I will keep working at it on and off and let you know when I have an update.

dougvj commented 7 months ago

OK I think I got the start viewz and the input buffer handling worked out. Would you like to test some more?

viti95 commented 7 months ago

Yesterday I had a bit of time to try but the code wouldn't compile:

file r_main.obj(/home/viti95/fastdoom/FASTDOOM/r_main.c): undefined symbol I_Printf_

dougvj commented 7 months ago

Whoops, I left some debug statements in; I thought I had gotten all of those. Should be good now.

viti95 commented 7 months ago

I've done some testing on my 486, and I think the 35 fps framerate mode is now broken: movement is very slow when the framerate is lower than 35 fps. It feels like the game runs in slow motion.

Also, in uncapped mode the frames seem to jump back and forth when the framerate is lower than 35 fps.

dougvj commented 7 months ago

Hmm, I must have a regression in the < 35 fps case. I'll investigate.

dougvj commented 7 months ago

OK, so the problem seems to be that maketic and gametic get out of sync in the uncapped case. My attempts to fix that broke the < 35 fps case, where maketic and gametic are supposed to be out of sync because the gametics can't keep up. I'll give it some thought and try to come up with a fix.

Thanks for testing, I need to get my 486 fired up again.

dougvj commented 4 months ago

Sorry, I've been really busy the past few months. I will try to get back to this; I feel like I am really close.

viti95 commented 4 months ago

Don't worry, I've also been quite busy lately. And yeah, I also think this is very close to being fully functional.

dougvj commented 1 month ago

I rebased against master with some fixes, and it seems to work fine under emulation. I want to spend some time testing with real hardware.

viti95 commented 1 month ago

I've done some testing on dosbox-x, but I'll try on real hardware a bit later today to check everything is OK. For now I've found that the new implementation I made for FPS calculation (average of the last 16 entries, a bit faster than the older method) doesn't work right with uncapped FPS (it maxes out at 35 fps).

@tigrouind has implemented better methods https://github.com/viti95/FastDoom/discussions/204 but they are still not merged into the master tree; maybe it's a good idea to test them on this PR.

EDIT: I forgot to say that there are two includes (#include "i_log.h") that break the build without debug:

g_game.c: line 71
m_bench.c: line 58

EDIT 2: I forgot to mention that the wrong FPS calculation only happens with VSync enabled; without VSync it's fine.
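For readers unfamiliar with the approach, an average over the last 16 entries can be kept with a small ring buffer and a running sum, as in the sketch below. This is an illustration under assumptions and not the FastDoom implementation: the type and function names are hypothetical, and frame durations are assumed to be measured in timer ticks at a known rate.

```c
#include <assert.h>
#include <stdint.h>

#define FPS_SAMPLES 16u              /* power of two, so wrap-around is a mask */

/* Hypothetical state; must be zero-initialized before first use. */
typedef struct {
    uint32_t samples[FPS_SAMPLES];   /* last frame durations, in timer ticks */
    uint32_t sum;                    /* running sum of the stored durations */
    unsigned next;                   /* slot that will be overwritten next */
    unsigned count;                  /* number of valid samples, up to 16 */
} fps_avg_t;

/* Record the duration of one rendered frame. */
static void fps_add(fps_avg_t *f, uint32_t frame_ticks)
{
    assert(f != 0);
    f->sum -= f->samples[f->next];   /* drop the oldest entry (0 while filling) */
    f->samples[f->next] = frame_ticks;
    f->sum += frame_ticks;
    f->next = (f->next + 1u) & (FPS_SAMPLES - 1u);
    if (f->count < FPS_SAMPLES)
        f->count++;
}

/* Average frames per second over the stored samples. */
static uint32_t fps_get(const fps_avg_t *f, uint32_t ticks_per_second)
{
    assert(f != 0);
    if (f->sum == 0)
        return 0;                    /* nothing measured yet */
    return (f->count * ticks_per_second) / f->sum;
}
```

Note that the result can never be finer than the clock used to time frames allows, so a counter like this depends on the resolution of whatever timer feeds `frame_ticks`.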

dougvj commented 1 month ago

Thanks for the feedback, I totally forgot to test release, derp.

I will look into the FPS counter, and maybe also try to rework the tick rate logic to avoid hitting the task structure at 560 Hz.

viti95 commented 1 month ago

I'm thinking of doing a test release so we can get feedback from users. I'm pretty sure people on Vogons and DoomWorld would want to test this new feature.

I've done some testing on both my Intel 386SX@33 and Cyrix 486@80, as well as on a Core2 Duo laptop. On the 386SX, it's basically unusable (which was totally expected). However, the 486 can manage an average of 60 fps on low detail, and it feels really smooth. I still need to test it on faster machines like the Pentium or K5. On the Core2, I'm able to get over 60 fps at 1280x800, which is truly amazing.

viti95 commented 1 month ago

Some benchmarking on my 486, Ultimate Doom, DEMO3, screen size 9:

High detail:
Base: 50.057 fps
This branch: 49.780 fps (-0.5%)

Potato detail:
Base: 126.953 fps
This branch: 124.843 fps (-1.6%)

Basically, the loss of performance is within the margin of error.

dougvj commented 1 month ago

Thanks for taking care of that, my week has been hectic.

viti95 commented 4 weeks ago

Some bugs I've found in uncapped mode: