dgcor / DGEngine

An implementation of the Diablo 1 game engine

Crash when walking in town #16

Closed mewmew closed 6 years ago

mewmew commented 6 years ago

Compiled the latest revision 6268b901739e3d389acfa92a6a346c2b6291cbba and that worked well. Tried running the game, and it works perfectly in the menus, plays sounds, and loads the game world. However, taking a few steps in any direction, whether walking up towards Griswold or down, results in a crash.

Let me know if there is some further information that can assist troubleshooting.

Happy to see the TMX additions in this latest version! Would love to play around with them a bit.

Cheers! /u

u@x220 ~/D/d/DGEngine> coredumpctl gdb 27406
           PID: 27406 (DGEngine)
           UID: 1000 (u)
           GID: 100 (users)
        Signal: 11 (SEGV)
     Timestamp: Sun 2018-01-07 23:16:25 CET (32s ago)
  Command Line: ./DGEngine ./gamefilesd
    Executable: /home/u/Desktop/diablo/DGEngine/DGEngine
 Control Group: /user.slice/user-1000.slice/session-c1.scope
          Unit: session-c1.scope
         Slice: user-1000.slice
       Session: c1
     Owner UID: 1000 (u)
       Boot ID: b470d7a5add24cfba5fc1629e3eae483
    Machine ID: 28a33732c4064f1791505da65cf095d5
      Hostname: x220
       Storage: /var/lib/systemd/coredump/core.DGEngine.1000.b470d7a5add24cfba5fc1629e3eae483.27406.1515363385000000.lz4
       Message: Process 27406 (DGEngine) of user 1000 dumped core.

                Stack trace of thread 27406:
                #0  0x0000561d206cdf7d n/a (/home/u/Desktop/diablo/DGEngine/DGEngine)
                #1  0x0000561d206da208 n/a (/home/u/Desktop/diablo/DGEngine/DGEngine)
                #2  0x0000561d206da37d n/a (/home/u/Desktop/diablo/DGEngine/DGEngine)
                #3  0x0000561d206da57c n/a (/home/u/Desktop/diablo/DGEngine/DGEngine)
                #4  0x0000561d206d562a n/a (/home/u/Desktop/diablo/DGEngine/DGEngine)
                #5  0x0000561d206d48bd n/a (/home/u/Desktop/diablo/DGEngine/DGEngine)
                #6  0x0000561d2070cee4 n/a (/home/u/Desktop/diablo/DGEngine/DGEngine)
                #7  0x0000561d20798fc9 n/a (/home/u/Desktop/diablo/DGEngine/DGEngine)
                #8  0x0000561d2079968e n/a (/home/u/Desktop/diablo/DGEngine/DGEngine)
                #9  0x0000561d206519e4 n/a (/home/u/Desktop/diablo/DGEngine/DGEngine)
                #10 0x0000561d2064ebae n/a (/home/u/Desktop/diablo/DGEngine/DGEngine)
                #11 0x0000561d2064db9f n/a (/home/u/Desktop/diablo/DGEngine/DGEngine)
                #12 0x0000561d2062e097 n/a (/home/u/Desktop/diablo/DGEngine/DGEngine)
                #13 0x00007fccc0d48f4a __libc_start_main (libc.so.6)
                #14 0x0000561d2062dbda n/a (/home/u/Desktop/diablo/DGEngine/DGEngine)

GNU gdb (GDB) 8.0.1
Copyright (C) 2017 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-pc-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /home/u/Desktop/diablo/DGEngine/DGEngine...(no debugging symbols found)...done.
[New LWP 27406]
[New LWP 27412]
[New LWP 27413]
[New LWP 27459]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".
Core was generated by `./DGEngine ./gamefilesd'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x0000561d206cdf7d in LevelCell::Passable() const ()
[Current thread is 1 (Thread 0x7fccc5160880 (LWP 27406))]
(gdb) bt
#0  0x0000561d206cdf7d in LevelCell::Passable() const ()
#1  0x0000561d206da208 in MapSearchNode::IsPassable(short, short) const ()
#2  0x0000561d206da37d in MapSearchNode::addSuccessor(AStarSearch<MapSearchNode>*, short, short, short, short) ()
#3  0x0000561d206da57c in MapSearchNode::GetSuccessors(AStarSearch<MapSearchNode>*, MapSearchNode*) ()
#4  0x0000561d206d562a in AStarSearch<MapSearchNode>::SearchStep() ()
#5  0x0000561d206d48bd in LevelMap::getPath(PairXY<unsigned short> const&, PairXY<unsigned short> const&) const ()
#6  0x0000561d2070cee4 in ActPlayerMoveToClick::execute(Game&) ()
#7  0x0000561d20798fc9 in ActIfCondition::ifCondition(Game&, bool) ()
#8  0x0000561d2079968e in ActIfCondition::execute(Game&) ()
#9  0x0000561d206519e4 in EventManager::update(Game&) ()
#10 0x0000561d2064ebae in Game::updateEvents() ()
#11 0x0000561d2064db9f in Game::play() ()
#12 0x0000561d2062e097 in main ()

ghost commented 6 years ago

I committed a fix now.

You can use this guide to test TMX maps: wiki

If you want to load the town level using a TMX JSON file, replace the following lines in level/town/level.json:

    "map": [
      { "file": "levels/towndata/sector1s.dun", "position": [46, 46] },
      { "file": "levels/towndata/sector2s.dun", "position": [46, 0] },
      { "file": "levels/towndata/sector3s.dun", "position": [0, 46] },
      { "file": "levels/towndata/sector4s.dun", "position": [0, 0] }
    ],

with

    "map": { "file": "level/town/town.json", "position": [0, 0] },

and copy the town.json file from the provided zip file to level/town/

mewmew commented 6 years ago

Thanks for the quick response!

The walking issue seems to have been fixed, but now I'm experiencing freezes at different times (normally after one minute or so). The player animation also went transparent (became invisible) for a while and then came back; one second after it became visible again, the game froze for a while. A bit later in the game, it crashed with the following backtrace:

u@x220 ~/D/d/DGEngine> coredumpctl gdb 5409
           PID: 5409 (DGEngine)
           UID: 1000 (u)
           GID: 100 (users)
        Signal: 11 (SEGV)
     Timestamp: Tue 2018-01-09 13:07:25 CET (8min ago)
  Command Line: ./DGEngine ./gamefilesd
    Executable: /home/u/Desktop/diablo/DGEngine/DGEngine
 Control Group: /user.slice/user-1000.slice/session-c1.scope
          Unit: session-c1.scope
         Slice: user-1000.slice
       Session: c1
     Owner UID: 1000 (u)
       Boot ID: 9616d05774594cb8a3727eda67cb7e51
    Machine ID: 28a33732c4064f1791505da65cf095d5
      Hostname: x220
       Storage: /var/lib/systemd/coredump/core.DGEngine.1000.9616d05774594cb8a3727eda67cb7e51.5409.1515499645000000.lz4
       Message: Process 5409 (DGEngine) of user 1000 dumped core.

                Stack trace of thread 5409:
                #0  0x0000559c33e9af7d n/a (/home/u/Desktop/diablo/DGEngine/DGEngine)
                #1  0x0000559c33ea7208 n/a (/home/u/Desktop/diablo/DGEngine/DGEngine)
                #2  0x0000559c33ea737d n/a (/home/u/Desktop/diablo/DGEngine/DGEngine)
                #3  0x0000559c33ea757c n/a (/home/u/Desktop/diablo/DGEngine/DGEngine)
                #4  0x0000559c33ea262a n/a (/home/u/Desktop/diablo/DGEngine/DGEngine)
                #5  0x0000559c33ea18bd n/a (/home/u/Desktop/diablo/DGEngine/DGEngine)
                #6  0x0000559c33ed9ee4 n/a (/home/u/Desktop/diablo/DGEngine/DGEngine)
                #7  0x0000559c33f65fc9 n/a (/home/u/Desktop/diablo/DGEngine/DGEngine)
                #8  0x0000559c33f6668e n/a (/home/u/Desktop/diablo/DGEngine/DGEngine)
                #9  0x0000559c33e1e9e4 n/a (/home/u/Desktop/diablo/DGEngine/DGEngine)
                #10 0x0000559c33e1bbae n/a (/home/u/Desktop/diablo/DGEngine/DGEngine)
                #11 0x0000559c33e1ab9f n/a (/home/u/Desktop/diablo/DGEngine/DGEngine)
                #12 0x0000559c33dfb097 n/a (/home/u/Desktop/diablo/DGEngine/DGEngine)
                #13 0x00007f0a90770f4a __libc_start_main (libc.so.6)
                #14 0x0000559c33dfabda n/a (/home/u/Desktop/diablo/DGEngine/DGEngine)

GNU gdb (GDB) 8.0.1
Reading symbols from /home/u/Desktop/diablo/DGEngine/DGEngine...(no debugging symbols found)...done.
[New LWP 5409]
[New LWP 5460]
[New LWP 5416]
[New LWP 5415]
Core was generated by `./DGEngine ./gamefilesd'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x0000559c33e9af7d in ?? ()
[Current thread is 1 (LWP 5409)]
(gdb) bt
#0  0x0000559c33e9af7d in ?? ()
#1  0x0000559c36515618 in ?? ()
#2  0x0000559c3c24cdf0 in ?? ()
#3  0x0000559c33dfabb0 in ?? ()
#4  0x0000559c37a805c0 in ?? ()
#5  0x0000559c37a805d0 in ?? ()
#6  0x0000559c3c24cdf8 in ?? ()
#7  0x0000559c37a805c0 in ?? ()
#8  0x864d739f49495f00 in ?? ()
#9  0x00007ffe1dfe14e0 in ?? ()
#10 0x0000559c33ea7208 in ?? ()
#11 0x0000003f1dfe0047 in ?? ()
#12 0x0000559c3b4ca510 in ?? ()
#13 0x0000559c36515618 in ?? ()
#14 0x0000559c33ea003f in ?? ()
#15 0x0000559c33e92c7d in ?? ()
#16 0x864d739f49495f00 in ?? ()
#17 0x0000559c3b4ca610 in ?? ()
#18 0x0000559c3b347e88 in ?? ()
#19 0x00007ffe1dfe1540 in ?? ()
#20 0x0000559c33ea737d in ?? ()
#21 0x0000ffff1dfeffff in ?? ()
#22 0x0000003f33ea0047 in ?? ()
#23 0x00007ffe1dfe16b0 in ?? ()
#24 0x0000559c3b4ca510 in ?? ()
#25 0xffffffff1dfe1540 in ?? ()
#26 0x0000000000000004 in ?? ()
#27 0x0000559c36515618 in ?? ()
#28 0x00007ffe0045003f in ?? ()
#29 0x0000000000000004 in ?? ()
#30 0x864d739f49495f00 in ?? ()
#31 0x00007ffe1dfe1580 in ?? ()
#32 0x0000559c33ea757c in ?? ()
#33 0x0000000000000000 in ?? ()

mewmew commented 6 years ago

I recompiled DGEngine with debug info and ran it directly from gdb to get a backtrace. However, bt gives me "No stack" after the crash. Note that this crash happened in town; I had simply been walking around for a while (about 1 minute) above Griswold's shop.

[u@x220 DGEngine]$ gdb ./DGEngine
GNU gdb (GDB) 8.0.1
Reading symbols from ./DGEngine...done.
(gdb) run ./gamefilesd
Starting program: /home/u/Desktop/diablo/DGEngine/DGEngine ./gamefilesd
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".
[New Thread 0x7fffdb689700 (LWP 19764)]
[Thread 0x7fffdb689700 (LWP 19764) exited]
[New Thread 0x7fffdb689700 (LWP 19766)]
[Thread 0x7fffdb689700 (LWP 19766) exited]
[New Thread 0x7fffdb689700 (LWP 19767)]
[New Thread 0x7fffdae88700 (LWP 19768)]
[New Thread 0x7fffda3dd700 (LWP 19769)]
[Thread 0x7fffda3dd700 (LWP 19769) exited]
[New Thread 0x7fffda3dd700 (LWP 19770)]
[Thread 0x7fffda3dd700 (LWP 19770) exited]
[New Thread 0x7fffda3dd700 (LWP 19771)]
i965: Failed to submit batchbuffer: Input/output error
AL lib: (EE) alc_cleanup: 1 device not closed
[Thread 0x7fffda3dd700 (LWP 19771) exited]
[Thread 0x7fffdae88700 (LWP 19768) exited]
[Thread 0x7ffff7f79880 (LWP 19760) exited]
[Inferior 1 (process 19760) exited with code 01]
(gdb) bt
No stack.
(gdb) disassemble 
No frame selected.
(gdb) info registers 
The program has no registers now.

Any idea how I could try to provide some more useful info for troubleshooting?

ghost commented 6 years ago

I committed another fix (and reverted the previous one).

It was a tricky one. The previous fix (checking for nullptr) shouldn't have been needed because that pointer is never null. The shared_ptr was being destroyed where I applied the new fix, so I had to change the code to force the returned shared_ptr into an lvalue.

I'll have to read about this particular situation, as either gcc or msvc should have given me a warning on those lines.
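
For readers unfamiliar with this class of bug, here is a minimal sketch of how a temporary shared_ptr can leave a dangling pointer, and how keeping the result in a named variable (an lvalue) avoids it. This is not DGEngine's actual code: `Cell`, `makeCell`, and both helper functions are hypothetical names, and the sketch assumes the returned shared_ptr is the only owner of the object. Compilers generally do not warn about this pattern, since taking `.get()` from a temporary is legal; only later dereferencing is the error.

```cpp
#include <cassert>
#include <memory>

// Hypothetical type; counts live instances so lifetime can be observed
// without ever dereferencing a dangling pointer.
struct Cell {
    static int alive;
    Cell() { ++alive; }
    ~Cell() { --alive; }
    bool passable() const { return true; }
};
int Cell::alive = 0;

// Hypothetical factory: the returned shared_ptr is the only owner.
std::shared_ptr<Cell> makeCell() { return std::make_shared<Cell>(); }

// Problem pattern: the temporary shared_ptr is destroyed at the end of the
// full expression, so the Cell is already gone when the next line runs.
int aliveAfterTemporary() {
    const Cell* raw = makeCell().get();
    (void)raw;           // raw dangles here; dereferencing it would be UB
    return Cell::alive;  // number of Cells still alive at this point
}

// Fix pattern: bind the returned shared_ptr to a named variable so it owns
// the Cell for as long as the raw pointer is in use.
int aliveWhileNamed() {
    std::shared_ptr<Cell> cell = makeCell();
    const Cell* raw = cell.get();
    assert(raw->passable());  // safe: cell keeps the object alive
    return Cell::alive;
}
```

Calling `aliveAfterTemporary()` reports that no `Cell` is alive by the time the raw pointer could be used, while `aliveWhileNamed()` reports one live `Cell` for the duration of the function.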

mewmew commented 6 years ago

Now the player graphics work (at rev 041a6f637e0b640beb932ac2b6fb723db102aa03); they no longer disappear when walking closer to the Church. However, after walking for 2 or 3 minutes the game freezes up, then again after another minute or so, and eventually it crashes. I tried to run it from gdb to catch a backtrace, but could not, as gdb reports "No stack".

(gdb) run ./gamefilesd
Starting program: /home/u/Desktop/diablo/DGEngine/DGEngine ./gamefilesd
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".
[New Thread 0x7fffdb689700 (LWP 3492)]
[Thread 0x7fffdb689700 (LWP 3492) exited]
[New Thread 0x7fffdb689700 (LWP 3494)]
[Thread 0x7fffdb689700 (LWP 3494) exited]
[New Thread 0x7fffdb689700 (LWP 3495)]
[New Thread 0x7fffd6e88700 (LWP 3496)]
[New Thread 0x7fffd63dd700 (LWP 3497)]
[Thread 0x7fffd63dd700 (LWP 3497) exited]
[New Thread 0x7fffd63dd700 (LWP 3511)]
[Thread 0x7fffd63dd700 (LWP 3511) exited]
[New Thread 0x7fffd63dd700 (LWP 3566)]
i965: Failed to submit batchbuffer: Input/output error
AL lib: (EE) alc_cleanup: 1 device not closed
[Thread 0x7fffd63dd700 (LWP 3566) exited]
[Thread 0x7fffd6e88700 (LWP 3496) exited]
[Thread 0x7ffff7f79880 (LWP 3475) exited]
[Inferior 1 (process 3475) exited with code 01]
(gdb) bt
No stack.
(gdb) info registers 
The program has no registers now.

If you know of a way in which I can assist you with more information to troubleshoot just let me know.

And thanks for working on what seems to be a rather tricky issue.

Cheers! /u

ghost commented 6 years ago

I'll try and find out why this happens. When I test, I don't get any crash.

Can you give me your version numbers of: SFML, gcc?

mewmew commented 6 years ago

> Can you give me your version numbers of: SFML, gcc?

Sure.

u@x220 ~> pacman -Q sfml
sfml 2.4.2-4
u@x220 ~> gcc --version
gcc (GCC) 7.2.1 20171224

ghost commented 6 years ago

Latest commit should fix the crash.

mewmew commented 6 years ago

After walking around for 20-30 seconds, the window goes black and the game freezes (with music still playing), then comes back after 10 or so seconds. This happened about 3 times before the window went black again (with music playing) and stayed that way until the game crashed. This happened on rev f1b0c41de88fab49199c8140ba8e586e5cd494f6 with a new game character, and without interacting with items or NPCs, simply walking towards the Church. The first freeze was just above Griswold, the second on the path to the Church, and the third freeze and the crash happened at the same place on the path, having stayed there for about 30 seconds since the previous freeze.

The crash fails to produce a core dump for some reason; this has been the case since the crashes started to appear about three or four revisions ago.

ghost commented 6 years ago

When testing on Linux I used to get the crash (also on Windows, if using clang), even on the first commit (I had ignored it originally because I thought it was something else at the time), but I don't get any now (I pick up items, go to dungeons, identify items, stay in town for a few minutes, etc).

I'll have to test further.

Maybe you can try valgrind to see if it tells you something more.

Also, how are you loading the game files? Are you using the original Diablo DIABDAT.MPQ? If you also have the extracted folder in the root, rename it or remove it to force the engine to use the MPQ file. It may be that some file isn't loaded properly.

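Other tests you can do (if you have time):

  • use clang and see if you get the same crash or a compilation error.
  • use SFML from trunk or an older 2.4.0 version.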

mewmew commented 6 years ago

> When testing on Linux I used to get the crash (also on Windows, if using clang), even on the first commit (I had ignored it originally because I thought it was something else at the time), but I don't get any now (I pick up items, go to dungeons, identify items, stay in town for a few minutes, etc).

Glad to hear it's working on your end now on Linux.

> I'll have to test further.
>
> Maybe you can try valgrind to see if it tells you something more.

I'll give it a shot.

> Also, how are you loading the game files? Are you using the original Diablo DIABDAT.MPQ? If you also have the extracted folder in the root, rename it or remove it to force the engine to use the MPQ file. It may be that some file isn't loaded properly.

I've installed your version of PhysicsFS with the following PKGBUILD for Arch Linux, and I have removed the unpacked directory while testing, as there were issues in the past with lowercase filenames. So I'm using DIABDAT.MPQ.

u@x220 ~/D/d/DGEngine> sha1sum DIABDAT.MPQ 
5cfd971abb25602731fef0c9b43eb7d7447f296e  DIABDAT.MPQ

> Other tests you can do (if you have time):
>
>   • use clang and see if you get the same crash or a compilation error.
>   • use SFML from trunk or an older 2.4.0 version.

Tried building DGEngine now using Clang (3.9.1). Will try later on 5.0.1.

It still crashes when built with Clang 3.9.1.

[u@x220 DGEngine]$ ./DGEngine ./gamefilesd
i965: Failed to submit batchbuffer: Input/output error
AL lib: (EE) alc_cleanup: 1 device not closed

The errors above have been printed on each crash, and no core dump has been produced.

I wonder if it may be related to "Failed to submit batchbuffer: Input/output error"? At first I didn't think either of the two errors above had anything to do with the cause of the crash, but were rather a result of it, e.g. the device not being closed.

mewmew commented 6 years ago

I tried running DGEngine through valgrind, but it seems to be a Heisenbug, as the crash could not be reproduced under valgrind. In the two prior runs without valgrind, the game crashed only seconds after entering Tristram with a new character.

Log at: https://pastebin.com/raw/f81A77zQ

ghost commented 6 years ago

Looking at the log above and searching for "Failed to submit batchbuffer: Input/output error" online, I think the problem is with either SFML or the OpenGL driver you're using, as most of those invalid writes are in draw calls (which are not part of DGEngine but SFML). Searching for the error reveals many topics related to drivers and X server.

I don't have any problems, but I run it in a Linux VM with a software OpenGL renderer.

Try running it with a software OpenGL renderer, if you can. Also try a different version of SFML (>= 2.4.0).

Edit: I pushed some changes to DGEngine and physfs to plug some leaks reported in the log above. The only leaks reported by valgrind now are related to SFML.

mewmew commented 6 years ago

Software rendering did the trick! Thanks a lot for your patience in tracking down the cause of this issue!

I guess my GPU drivers are broken or something. In either case, DGEngine runs perfectly with software rendering enabled.

$ export LIBGL_ALWAYS_SOFTWARE=1
$ ./DGEngine ./gamefilesd
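
The same workaround could in principle be applied from C++ instead of the shell. The following is only a sketch under stated assumptions: `requestSoftwareGL` and `softwareGLRequested` are hypothetical helpers, not part of DGEngine or SFML, and the variable is assumed to be set before any OpenGL context is created (for example in a small launcher that then starts the game), since Mesa is expected to read it when the driver is loaded.

```cpp
#include <stdlib.h>  // setenv (POSIX), getenv
#include <string>

// Hypothetical helper: sets LIBGL_ALWAYS_SOFTWARE=1 for this process and
// any child processes it starts afterwards. Returns true on success.
bool requestSoftwareGL() {
    return setenv("LIBGL_ALWAYS_SOFTWARE", "1", /*overwrite=*/1) == 0;
}

// Hypothetical helper: reports whether the variable is currently set to "1".
bool softwareGLRequested() {
    const char* value = getenv("LIBGL_ALWAYS_SOFTWARE");
    return value != nullptr && std::string(value) == "1";
}
```

`setenv` is a POSIX call, so this sketch applies to Linux and similar systems only; exporting the variable in the shell as shown above remains the simplest option.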

Closing this issue as it is not caused by DGEngine, but rather by GPU drivers.

Cheers, /u