ZDoom / Raze

Build engine port backed by GZDoom tech. Currently supports Duke Nukem 3D, Blood, Shadow Warrior, Redneck Rampage and Powerslave/Exhumed.

[BUG] [SW] Freezing when pressing fire #563

Closed dhwz closed 1 year ago

dhwz commented 2 years ago

Raze version

2a905e802621cc538b38e1c48c014f883d102f6c

Which game are you running with Raze?

Shadow Warrior

What Operating System are you using?

Other

If Other OS, please describe

351ELEC Linux/ARM64

Relevant hardware info

RK3326, Mali-G31

Have you checked that no other similar issue already exists?

A clear and concise description of what the bug is.

I've built Raze from the latest sources, and Shadow Warrior always freezes after pressing the fire button once or twice (with the sword as the selected weapon). It also happens in Wanton Destruction and Twin Dragon.

Steps to reproduce the behaviour.

  1. Start a new game
  2. Press fire (with sword as weapon)
  3. Game freezes, Raze process can only be killed.

Your configuration

No response

Provide a Log

I didn't get an error log. I really want to provide more information if you tell me how we can debug this.

mjr4077au commented 2 years ago

@dhwz I can't replicate this on Windows or Linux. Are you using Raze Touch? It might be best to report it to their team.

dhwz commented 2 years ago

@mjr4077au What is Raze Touch? I'm building Raze directly from your latest sources for our OS (https://github.com/351ELEC/351ELEC). That's why I asked how I could possibly debug it, I didn't get any log messages?

mjr4077au commented 2 years ago

> @mjr4077au What is Raze Touch? I'm building Raze directly from your latest sources for our OS (https://github.com/351ELEC/351ELEC). That's why I asked how I could possibly debug it, I didn't get any log messages?

Raze Touch is here: http://opentouchgaming.com/raze-touch/. Apologies, I thought you were using this, I didn't know you had your own fork as well.

I know it's a shit answer but it does seem like something isolated to your fork. I can't trigger this from master on two different operating systems. Is it a full on freeze? No actual crash or anything?

dhwz commented 2 years ago

Yes it's just freezing the rendering, it's not crashing. I've to kill it to stop the process. Maybe it's related only to the GLES2 rendering. I've to try what @emileb was writing here (https://github.com/coelckers/gzdoom/issues/1485) maybe it will just luckily fix the issue.

emileb commented 2 years ago

I'll update Raze Touch and check if I see anything similar

mjr4077au commented 2 years ago

> I'll update Raze Touch and check if I see anything similar

Just seeing if you saw any issues in your fork?

mjr4077au commented 2 years ago

@emileb and/or @dhwz, any updates from anyone regarding this one?

emileb commented 2 years ago

I updated Raze Touch and I didn't have the same problem. I guess it would be helpful if @dhwz updated to latest code and tried again?

dhwz commented 2 years ago

@emileb I already updated to the latest commit two days ago, still the same issue. Any recommendations on how to debug? gdb won't help unless it really crashes?

dhwz commented 2 years ago

@mjr4077au @emileb don't know if that helps, but here is the backtrace after freezing and manually killing the process. Test was based on latest commit from yesterday 01deb13694caad2cff45b392a17e533240393e2f.

Thread 1 "raze" received signal SIGTRAP, Trace/breakpoint trap.
0x000000556a4c41f8 in ShadowWarrior::pSpawnSprite(ShadowWarrior::PLAYERstruct*, ShadowWarrior::PANEL_STATEstruct*, unsigned char, double, double) ()
(gdb) bt
#0  0x000000556a4c41f8 in ShadowWarrior::pSpawnSprite(ShadowWarrior::PLAYERstruct*, ShadowWarrior::PANEL_STATEstruct*, unsigned char, double, double) ()
#1  0x000000556a4c434c in ShadowWarrior::SpawnSwordBlur(ShadowWarrior::PANEL_SPRITEstruct*) ()
#2  0x000000556a4c44d4 in ShadowWarrior::pSwordSlide(ShadowWarrior::PANEL_SPRITEstruct*) ()
#3  0x000000556a569f40 in ShadowWarrior::domovethings() ()
#4  0x000000556a56a3b4 in ShadowWarrior::GameInterface::Ticker() ()
#5  0x000000556a0745bc in TryRunTics() ()
#6  0x000000556a07467c in MainLoop() ()
#7  0x000000556a07b094 in RunGame() ()
#8  0x000000556a07bef0 in GameMain() ()
#9  0x0000005569f90110 in main ()
(gdb)

mjr4077au commented 2 years ago

For testing, can you try setting cl_nomeleeblur to 0? We added this in over a year ago on some request and it would be perfect for testing this issue out. It's not a permanent fix but might get you out of the woods

dhwz commented 2 years ago

Nope, same result. cl_nomeleeblur was also already set to false in the gzdoom.ini

@mjr4077au Oh wait but the other way around seems to do the trick setting it to 1, I've to test a bit so it's not just coincidence.

Yes, great that did it! 👍

emileb commented 2 years ago

Interesting thanks for the backtrace, it's not anything to do with GLES. What compiler and versions are you using? Is it compiled with arm64 or 32bit?

dhwz commented 2 years ago

arm64 with gcc 10.3.0

emileb commented 2 years ago

OK thanks, I'm using Clang for compilation. I asked because Clang can produce some instructions which presume alignment of structures and 64bit members, and I need to compile with "-arm-assume-misaligned-load-store" to avoid some traps.

I think to debug this we need to get the exact line number the SIGTRAP occurs, if that is possible to get?
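
On the alignment point above, a minimal sketch of the usual portable way to read a 64-bit value from an address that may not be 8-byte aligned, so the compiler cannot assume alignment. `LoadU64` is a hypothetical helper for illustration only; it is not a function in Raze, and nothing in this thread confirms that misalignment is the cause here.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper (not part of Raze): read a 64-bit value from an
// address that may be misaligned. Dereferencing a misaligned
// std::uint64_t* is undefined behaviour; copying the bytes with
// std::memcpy is well-defined for any alignment.
inline std::uint64_t LoadU64(const void* p)
{
    std::uint64_t value;
    std::memcpy(&value, p, sizeof value);
    return value;
}
```

Compilers typically lower the `memcpy` to a single load where the target allows it, so this usually costs nothing on platforms that tolerate unaligned access.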

mjr4077au commented 2 years ago

> Nope, same result. cl_nomeleeblur was also already set to false in the gzdoom.ini
>
> @mjr4077au Oh wait but the other way around seems to do the trick setting it to 1, I've to test a bit so it's not just coincidence.
>
> Yes, great that did it! 👍

Thanks it was midnight when I sent that and clearly my brain cells were lacking 😅

I'll let you come back with the info Emile has asked for and we'll see if we can get to a true fix.

mjr4077au commented 2 years ago

@dhwz just seeing whether you've been able to get the line that the SIGTRAP occurs on?

dhwz commented 2 years ago

@mjr4077au Yes took some time as I was busy. Here are the results of the gdb with debug build.

0x00000055868d4c60 in ShadowWarrior::InsertPanelSprite (pp=0x55882cb3c0 <ShadowWarrior::Player>, psp=0x55a5d82510) at /home/dragon/351ELEC/build.351ELEC-RG351MP.aarch64/raze-47001e6bab0229561109ff74fa65802863509fab/source/games/sw/src/panel.cpp:6442
6442    /home/dragon/351ELEC/build.351ELEC-RG351MP.aarch64/raze-47001e6bab0229561109ff74fa65802863509fab/source/games/sw/src/panel.cpp: No such file or directory.
(gdb) bt
#0  0x00000055868d4c60 in ShadowWarrior::InsertPanelSprite (pp=0x55882cb3c0 <ShadowWarrior::Player>, psp=0x55a5d82510) at /home/dragon/351ELEC/build.351ELEC-RG351MP.aarch64/raze-47001e6bab0229561109ff74fa65802863509fab/source/games/sw/src/panel.cpp:6442
#1  0x00000055868d4d8c in ShadowWarrior::pSpawnSprite (pp=0x55882cb3c0 <ShadowWarrior::Player>, state=state@entry=0x0, priority=priority@entry=64 '@', x=302.703125, y=200) at /home/dragon/351ELEC/build.351ELEC-RG351MP.aarch64/raze-47001e6bab0229561109ff74fa65802863509fab/source/games/sw/src/panel.cpp:6465
#2  0x00000055868d4e9c in ShadowWarrior::SpawnSwordBlur (psp=0x55a02f3360) at /home/dragon/351ELEC/build.351ELEC-RG351MP.aarch64/raze-47001e6bab0229561109ff74fa65802863509fab/source/games/sw/src/panel.cpp:798
#3  0x00000055868d500c in ShadowWarrior::pSwordSlide (psp=0x55a02f3360) at /home/dragon/351ELEC/build.351ELEC-RG351MP.aarch64/raze-47001e6bab0229561109ff74fa65802863509fab/source/games/sw/src/panel.cpp:1035
#4  0x00000055869801bc in ShadowWarrior::pSpriteControl (pp=<optimized out>) at /home/dragon/351ELEC/build.351ELEC-RG351MP.aarch64/raze-47001e6bab0229561109ff74fa65802863509fab/source/games/sw/src/panel.cpp:6887
#5  ShadowWarrior::domovethings () at /home/dragon/351ELEC/build.351ELEC-RG351MP.aarch64/raze-47001e6bab0229561109ff74fa65802863509fab/source/games/sw/src/player.cpp:7199
#6  0x0000005586980664 in ShadowWarrior::GameInterface::Ticker (this=<optimized out>) at /home/dragon/351ELEC/build.351ELEC-RG351MP.aarch64/raze-47001e6bab0229561109ff74fa65802863509fab/source/games/sw/src/game.cpp:606
#7  0x0000005586455a44 in GameTicker () at /home/dragon/351ELEC/build.351ELEC-RG351MP.aarch64/raze-47001e6bab0229561109ff74fa65802863509fab/source/core/mainloop.cpp:367
#8  TryRunTics () at /home/dragon/351ELEC/build.351ELEC-RG351MP.aarch64/raze-47001e6bab0229561109ff74fa65802863509fab/source/core/mainloop.cpp:652
#9  0x0000005586455b0c in MainLoop () at /home/dragon/351ELEC/build.351ELEC-RG351MP.aarch64/raze-47001e6bab0229561109ff74fa65802863509fab/source/core/mainloop.cpp:710
#10 0x000000558645cddc in RunGame () at /home/dragon/351ELEC/build.351ELEC-RG351MP.aarch64/raze-47001e6bab0229561109ff74fa65802863509fab/source/core/gamecontrol.cpp:1090
#11 0x000000558645d930 in GameMain () at /home/dragon/351ELEC/build.351ELEC-RG351MP.aarch64/raze-47001e6bab0229561109ff74fa65802863509fab/source/core/gamecontrol.cpp:569
#12 0x00000055863749b0 in main (argc=17, argv=0x7ff2bfc628) at /home/dragon/351ELEC/build.351ELEC-RG351MP.aarch64/raze-47001e6bab0229561109ff74fa65802863509fab/source/common/platform/posix/sdl/i_main.cpp:194
(gdb) frame
#0  0x00000055868d4c60 in ShadowWarrior::InsertPanelSprite (pp=0x55882cb3c0 <ShadowWarrior::Player>, psp=0x55a5d82510) at /home/dragon/351ELEC/build.351ELEC-RG351MP.aarch64/raze-47001e6bab0229561109ff74fa65802863509fab/source/games/sw/src/panel.cpp:6442
6442    in /home/dragon/351ELEC/build.351ELEC-RG351MP.aarch64/raze-47001e6bab0229561109ff74fa65802863509fab/source/games/sw/src/panel.cpp

mjr4077au commented 2 years ago

Thanks for this. I'm personally not too sure what's going on here. It's possible the way SW is allocating RAM for the PANEL_SPRITEp typedef is at odds with your system, but it's otherwise working code. I'll discuss with Graf, but I appreciate you sending in that stacktrace.

coelckers commented 2 years ago

At this point all I can suggest is to add debug output to InsertPanelSprite to see where it hangs inside that function. This function works on linked lists that do type punning in a questionable manner, and those are not the most stable constructs out there. It may be that some compilers cannot deal with such code, but we need more info.
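
A minimal sketch of what such debug output could look like, assuming the hang is a list walk that never reaches its end. The `Node` struct and `InsertByPriority` function are hypothetical stand-ins written for illustration; they are not the real Raze structures or the actual body of InsertPanelSprite. The idea is to cap the number of steps and log when the cap is hit, which turns a silent freeze into a visible message.

```cpp
#include <cstdio>

// Hypothetical stand-in for a list element; NOT the real PANEL_SPRITE.
struct Node {
    Node* next;
    Node* prev;
    int priority;
};

// Insert `n` into a circular doubly linked list with sentinel `head`,
// kept sorted by ascending priority. Returns false and logs to stderr
// if the walk takes more than `maxSteps` steps, which would point to a
// corrupted list (for example, a cycle that never returns to `head`).
bool InsertByPriority(Node* head, Node* n, int maxSteps = 1000)
{
    int steps = 0;
    Node* cur = head->next;
    while (cur != head && cur->priority <= n->priority) {
        if (++steps > maxSteps) {
            std::fprintf(stderr,
                         "InsertByPriority: no list end after %d steps, cur=%p\n",
                         steps, static_cast<void*>(cur));
            return false;
        }
        cur = cur->next;
    }

    // Link `n` in directly before `cur`.
    n->next = cur;
    n->prev = cur->prev;
    cur->prev->next = n;
    cur->prev = n;

    std::fprintf(stderr, "InsertByPriority: inserted priority %d after %d steps\n",
                 n->priority, steps);
    return true;
}
```

If the capped walk fires in a real build, printing the pointer values for the list head and the first few nodes would be the next thing to compare between a working and a failing compiler.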

emileb commented 2 years ago

Yes, I cannot see what would cause this from the code. I can try building for ARM with GCC, but it won't be the same version. Google has switched to Clang now, but I may be able to get an older GCC version, which hopefully has the same issue if it is compiler related.

mjr4077au commented 2 years ago

Just seeing if there's any updates from anyone's end on this? There's a branch called develop which has some work that's not ready for master yet, but there's been a tonne of changes from array index accesses to pointers. It could, by chance, happen to be resolved there. Might be worth testing out?