fabiangreffrath / crispy-doom

Crispy Doom is a limit-removing enhanced-resolution Doom source port based on Chocolate Doom.
https://fabiangreffrath.github.io/crispy-homepage
GNU General Public License v2.0
810 stars 131 forks source link

Crispy Doom crashes after long periods of playing #85

Closed plumsinus closed 7 years ago

plumsinus commented 8 years ago

Finally posting this here, sorry I haven't really got to testing it any more yet. Here's what I know:

Is there a way to play demos faster in Crispy Doom, like you can in PrBoom+? My thoughts are that I would record a demo in PrB+ on a complex level, and is long enough to reliably crash in Crispy, as a test case. It would be nice if it didn't take 45 mins to test each time though.

fabiangreffrath commented 8 years ago

Thank you for taking the time to report this!

Is there a way to play demos faster in Crispy Doom, like you can in PrBoom+?

If you are going to report a longer multi-level demo, you can use the "-warp" parameter together with "-playdemo" in Crispy to fast-forward the demo up to a specific map. However, I doubt that a demo recorded that way (i.e. ended by a crash) and played back that way will help to reproduce the actual crash at all. :/

plumsinus commented 8 years ago

OK, here's a -longtics demo recorded with PrBoom+ that crashes Crispy Doom for me after about 30s, consistently at the same point. http://www.mediafire.com/download/ogdgmd6lunx4l82/xarath15_plm04.lmp

For MAP04 of this wad: http://www.mediafire.com/download/8780cdwi653vi7i/xarath15.zip

I can play the map for much longer than that when not playing back a demo (20 minutes?), but it does crash on me eventually. As far as I can tell the map does nothing weird that should cause a crash like that - missing sidedefs, broken nodes, etc. So I don't think I'm conflating multiple problems.

stderr.txt is unhelpful in both cases: I_SDL_InitSound: use_libsamplerate=1, but libsamplerate support not compiled in. P_SetupLevel: map04 (maps/xarath15.wad), Skill Hard, Time 0:00, Doom map format, normal BSP nodes.

fabiangreffrath commented 8 years ago

Hey, thank you very much for the WAD and the demo. You see the crash when that one chaingunner shoots at you, right? It seems that the demo status bar at the bottom is drawn beyond the limits of the frame buffer. See how fast it expands to full length even at the beginning of the demo? I have to investigate that!

fabiangreffrath commented 8 years ago

I think I got it: You recorded a demo with long tics and one byte of one of these has been identified as an early DEMOMARKER. Thus, the length of the demo lump has been miscalculated and that progress bar exceeded beyond the frame buffer.

plumsinus commented 8 years ago

I'll test when the new nightly build is out, but I'm not sure that's the fix for the original problem. Crashes occur even when not recording demos, the demo was just a way to try to have a reproducible error, instead of playing through a map and "hoping" for a crash.

In the demo, the crash occurs just as I shoot this sergeant: doom00

fabiangreffrath commented 8 years ago

OK, it crashed for me at a different place. Anyway, see how the demo progress bar at the bottom is already full?

But you are right, this bug is only triggered when playing back demos. It does not explain the random crashes you experienced during playing. :/

plumsinus commented 8 years ago

Well, the demo plays back without crashing, so the progress bar bug is fixed. Still need to investigate the crashes further though.

Danfun64 commented 8 years ago

Does Chocolate Doom have this issue?

plumsinus commented 8 years ago

So far the only levels I have had crashes in will not work in Chocolate Doom because the level geometry is too complex.

I have tried more testing but I still don't have much to go on. I think it is somehow related to saving and loading, since I have played a few crash-prone maps using IDDQD and never saving and did not get a crash. I played NUTS.wad with lots of saves and didn't get a crash then either, so I don't think it's an issue with lots of things. But I still have no very reliable way of causing the crash, and nothing quick either, so there isn't much to go on.

If anyone else wants to have a try, MAP26 of NOVA 2 (https://www.doomworld.com/idgames/levels/doom2/Ports/megawads/nova2) seems to crash fairly often, particularly in the large arena after the fake exit.

plumsinus commented 8 years ago

OK, crashes during MAP29 of D2Reload (https://www.doomworld.com/idgames/levels/doom2/megawads/d2reload). This is supposed to be vanilla-compatible, so I should be able to test it in Chocolate Doom.

fabiangreffrath commented 8 years ago

You don't happen to be playing on Nightmare or with -fast, do you?

plumsinus commented 8 years ago

Nope.

plumsinus commented 8 years ago

Latest update: I played a 1 hr demo of D2RELOAD MAP29 in Crispy Doom, and it worked fine, so I'm quite confident the problem has something to do with saving or reloading.

Chocolate Doom didn't crash in that map, with saving, using the latest nightly build.

Then I turned all the Crispness features off and played in low resolution and it didn't crash Crispy Doom either. So it seems it's some Crispy Doom addition that is causing the problem, and I can probably identify which one, eventually.

fabiangreffrath commented 8 years ago

Hm, and stdout/stderr don't reveal anything useful in all these crashes?

plumsinus commented 8 years ago

After upgrading to last night's autobuild from http://latest.chocolate-doom.org/, I am unable to reproduce this bug. Prior to that I was using an autobuild from before the 70 FPS option was available.

Crispy Doom 3.2 will still crash for me, but only with uncapped framerate enabled. (The current nightly I tested did not crash even with this on. Nonetheless I have started using 70 FPS mode.)

Magic solutions are never ideal, but hopefully whatever caused the problem will not come up again in the future. If it does then I will go about trying to build it locally and running through a debugger. For now, I am optimistically closing this bug.

plumsinus commented 8 years ago

So of course as soon as you release Crispy 3.3 I start getting crashes again -_-

Do I need to do anything special when building Crispy Doom here to be able to run it through a debugger?

fabiangreffrath commented 8 years ago

Ha, that's Murphy's Law I guess!

If you are using autotools, make sure to run the configure script with the right CFLAGS set:

CFLAGS="-g -O0" ./configure

The first one enables creation of debug symbols, the second one disables optimization.

If you are using e.g. MSVC, sorry no idea!

plumsinus commented 8 years ago

OK, will do. I was using autotools before. Kind of a lot going on around here so I might not get to it for a few days.

fabiangreffrath commented 8 years ago

Any news on this?

plumsinus commented 8 years ago

Haven't got a crash since I unclosed this bug! So I'm not really sure what to say.

fabiangreffrath commented 8 years ago

Let's play safe and just leave it open for a while. ;)

fabiangreffrath commented 8 years ago

Have you ever since experienced crashes like this?

fabiangreffrath commented 7 years ago

So, this bug is already open for a year and hasn't seen any response since then. Any news?

JNechaevsky commented 7 years ago

I think I have some ideas of testing... I can't play all night long of course, but I can run Crispy for all night with screen recorder + OSD timer. Cases to test (probably will require few days or nights):

Fabian, I need your approval, should I try this?

fabiangreffrath commented 7 years ago

Fabian, I need your approval, should I try this?

You really don't need, you are free to do or not do whatever you wish. ;)

Honestly, this bug is very old and I haven't heard any complaints since then. I might have been some short-lived problem that was fixed by some seemingly unrelated commit. Anyway, it's never wrong to be sure! Of all the possible test cases you proposed, this one sounds like the most promising to me:

TNT nightmare speedrun, placed into PWAD as DEMO1, DEMO2and DEMO3 and loaded as usual PWAD, so it will be infinite demo sequence.

If you could run this for some hours overnight, I think that would make a pretty tough stress test. And, if you do, please check the amount of memory that the crispy-doom process allocates at the beginning and the end of the test run. Thank you very much already!

JNechaevsky commented 7 years ago

Nothing can't to do with myself, it's all because of five years of corporate emailing. Doh.

Recording setup is up and ready, but my idea failed. If Nightmare TNT speedrun (link) stored as three DEMO* lumps, it desyncs in the end of MAP02. Really have no idea why. I've also was trying to start as:

crispy-doom.exe -file demos.wad -longtics

...but still got same desync.

Additionally I have tried this:

crispy-doom.exe -playdemo demo.lmp demo.lmp demo.lmp demo.lmp

But it does not help either for my case - game simply quits after finishing first demo.

So... Am I doing something wrong? Or something must be done in the code for longtics+DEMO*? Just in case you would like to try to play packed demos in PWAD, here it is: https://yadi.sk/d/_bTnIidq3JxErq

JNechaevsky commented 7 years ago

Here's how this desync looks (video with speedup): https://youtu.be/aHXaMFFihsM While trying to investigate why this is happening at all, I have noticed very strange situation. There is a... Chance of NO desync, if new game was started/ended before demo sequence. It may works, it may not.

Readme for demo saying that it was recorded in Competition Doom v2.0.3.2, but AFAIR from the code, Crispy is friendly with CnDoom.

I really don't want to write any incompetent bushwa, but this is all I can understand here. Fabian, your turn, please.

fabiangreffrath commented 7 years ago

So, we have a desync in a TNT demo. Per se, that's a catastrophy, BUT:

Have you tried both -iwad tnt.wad and -iwad doom2.wad -file tnt.wad?

Does it desync if you run -playdemo demo.lmp or -timedemo demo.lmp?

If the answer is "no" to the latter question, I am pretty sure this is the notorious bug of the Revenant's rockets being homing or not depending on the value of gametic (which is different if the demo is started from the command line or as part of the in-game demo loop).

JNechaevsky commented 7 years ago

I was trying only first and third ways. Okay, checked again, here's the all results just to be 100% clear (demos.wad - pwad with three packed Nightmare TNT demos):

fabiangreffrath commented 7 years ago

crispy-doom.exe -iwad tnt.wad -playdemo 30tn8336.lmp Working fine, no desync on MAP02. crispy-doom.exe -iwad tnt.wad -timedemo 30tn8336.lmp Working fine, no desync on MAP02. Actually, no any desyncs in whole run.

Phew, thanks for checking! Then I am pretty sure the desync is due to the Revenant firing homing/non-homing missiles near the end. But I already have an idea how to fix this...

fabiangreffrath commented 7 years ago

Strange, it desyncs if I run crispy-doom -iwad tnt.wad -file demos.wad -timedemo demo1.

fabiangreffrath commented 7 years ago

crispy-doom.exe -iwad tnt.wad -timedemo 30tn8336.lmp

What version were you testing with? This desyncs for me in MAP03.

JNechaevsky commented 7 years ago

Crispy - yesterday's latest development build. But may it be TNT.WAD itself issue?

I haven't mentioned, but for first tries I was having desync in MAP10. After reading video description on YouTube (link), I have noticed this:

I don't use any modified WAD, for TNT I use the TNT.WAD, which you can now buy on Steam.

Immediately after this, I have reinstalled Steam version Final Doom and replaced my current TNT iwad (yeah, sizes was a little bit different, but I was sure that was possible aftermath of some of my resource editing), and that solves MAP10 desync problem.

Steam version TNT.WAD is 18,195,736 bytes. Hmm.. May this be related to final/altfinal stuff?

Today's dev build crispy-doom-20170609-win32.zip is also working fine with Steam version as crispy-doom.exe -iwad tnt.wad -timedemo 30tn8336.lmp

P.S. Please check your e-mail.

fabiangreffrath commented 7 years ago

Confirmed, even Choco desyncs with the "usual" Vanilla TNT.WAD and plays fine with the one from Steam. Not sure what desyncs I have seen in MAP03 today, though...

JNechaevsky commented 7 years ago

Good morning, and good news!

My test is done. Sadly, while recording I was having a two freezes of the screen record, but that's seems to be a bug of recording software, or simply external 2.5" HDD's case controller isn't strong enough for such a contiguous writing. But since Crispy was running from this HDD too, it's probably a bug of screen recorder. Any way, for next time I will try to use external 3.5" HDD.

Video of 9:35:17 time of demo sequence: https://youtu.be/zel-zHenh_0 Edit: YouTube rendering done. Just click at any place in the timeline to see that everything is working perfect.

It was playing at 30 fps with sound, but sound was not recorded for preventing huge size of video. Video file size is 3.6 GB only, it's MPEG-4 with high compression ratio.

Uptime

Screenshot taken just before the end of the record. Overall uptime is aprox. 16 hours, and no desyncs nor bugs were noticed in the end!

Memory consumption

Something requires an attention here. Despite of all of possible resources were precached while demo sequence, just before record ending I have noticed that Crispy still takes RAM. Surprisingly, it was not happened in intermission screen and loading levels, it was happening somewhere in the middle of the level. But what can it be? I can't imagine. It is not critical (I hope), but maybe we should investigate it deeper.

From another hand, consuming additional ±10 megabytes of RAM for 14 hours (let first two hours be called as "overall precaching") is nothing. Modern computers are not potatoes, and barely this requires attention at all.

fabiangreffrath commented 7 years ago

Wow, what a massive run! Thank you very much for taking so much time!

I think the results are mostly relaxing if it wasn't for this one:

START: 16,2 MB END: 46,1 MB

Since the default memory zone allocated at engine start-up is 32 MB, it would have to increase this at least once. Either that, or one of the out-of-zone memory allocations that Crispy performs leaks. Maybe even some stupid string handling, it's really hard to tell... You don't happen to still have the stdout.txt/stderr.txt files from this run lying around somewhere?

JNechaevsky commented 7 years ago

My pleasure!

Here it is, both std's, created on 13.06.2017 in 09:51: https://yadi.sk/d/zy_PeCyF3K5LTW Nothing unusual in there, though, but 0:00:00 of P_SetupLevel in stderr.txt is attracting attention.

Thinking more about memory leaking, I think I have a plan. I need to start, for example, Doom II demo sequence and run it 2-3 times. Next, I will need to start a record of game window + task manager, to see a moment when memory consumption is increasing. All optional features must be disabled, just like while TNT speedrun. Maybe this will give at least something.

JNechaevsky commented 7 years ago

Thinking even more... Maybe I was wrong. After checking sixth or seventh standard TNT demo sequences, I have noticed that memory consumption changes between aprox. 29 <> 31 MBs, but don't go higher. Mostly it happens on loading new level, and rarely while playing a level.

I've also tried custom map with 299 Megaspheres, trying to catch a moment of unobvious increasing of RAM - nothing. Everything is correct in different situations: pickup speed, hurting, ending game / starting game. The only thing that attracts attention: consumption was increasing while starting new game in another skill level, but again, whole speed run was in Nightmare.

fabiangreffrath commented 7 years ago

Despise this bug being open for nearly two years now, nobody has aver been able to reliably reproduce the crashes experienced by @plumsinus . I am thus closing this, feel free to reopen once something more reproducible happens.