cmangos / issues

This repository is used as a centralized point for all issues regarding CMaNGOS.

Core Segfault when entering mage portal with playerbots #2126

Closed Mr-Deadbeat closed 4 years ago

Mr-Deadbeat commented 4 years ago

🐛 Bugreport

Game Crash when using portal with playerbots in party

Expected behavior

click and off we go

Version & Environment

OS Ubuntu 18.04.4 64bit Server

Client Version: "2.4.3" (TBC)

CMaNGOS Repo & Commit Hash:

TBC

Database Repo & Commit Hash:

TBC

Operating System:

Linux Flavor

Steps to reproduce

Create a portal with playerbots in the party, then click the portal.

Crashlog

    (gdb) info threads
      Id   Target Id         Frame
      1    Thread 0x7ffff7fe1740 (LWP 22387) "mangosd" 0x00007ffff7bc7c60 in __GI___nanosleep (requested_time=0x7fffffffe030, remaining=0x7fffffffe030) at ../sysdeps/unix/sysv/linux/nanosleep.c:28
      2    Thread 0x7ffff5b42700 (LWP 22391) "mangosd" 0x00007ffff7bc7c60 in __GI___nanosleep (requested_time=0x7ffff5b41d10, remaining=0x7ffff5b41d10) at ../sysdeps/unix/sysv/linux/nanosleep.c:28
      3    Thread 0x7ffff5341700 (LWP 22392) "mangosd" 0x00007ffff7bc7c60 in __GI___nanosleep (requested_time=0x7ffff5340d10, remaining=0x7ffff5340d10) at ../sysdeps/unix/sysv/linux/nanosleep.c:28
      4    Thread 0x7ffff4b40700 (LWP 22393) "mangosd" 0x00007ffff7bc7c60 in __GI___nanosleep (requested_time=0x7ffff4b3fd10, remaining=0x7ffff4b3fd10) at ../sysdeps/unix/sysv/linux/nanosleep.c:28
      8    Thread 0x7fffdd803700 (LWP 22397) "mangosd" 0x00007ffff7bc39f3 in futex_wait_cancelable (private=, expected=0, futex_word=0x55556b8f9270) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
      9    Thread 0x7fffe3fff700 (LWP 22398) "mangosd" 0x00007ffff7bc39f3 in futex_wait_cancelable (private=, expected=0, futex_word=0x55556b8f9270) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
      10   Thread 0x7fffecb94700 (LWP 22399) "mangosd" 0x00007ffff7bc39f3 in futex_wait_cancelable (private=, expected=0, futex_word=0x55556b8f9270) at ../sysdeps/unix/sysv/linux/futex-internal.h:88

    (gdb) bt
    #0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
    #1  0x00007ffff6226801 in __GI_abort () at abort.c:79
    #2  0x00007ffff621639a in __assert_fail_base (
        fmt=0x7ffff639d7d8 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n",
        assertion=assertion@entry=0x55555696e175 "STRINGIZE(m_currMap) && 0",
        file=file@entry=0x55555696e0e0 "/home/gameadm/source/mangosbc/src/game/Entities/Object.h", line=line@entry=874,
        function=function@entry=0x555556970260 <WorldObject::GetMap() const::__PRETTY_FUNCTION__> "Map* WorldObject::GetMap() const") at assert.c:92
    #3  0x00007ffff6216412 in __GI___assert_fail (
        assertion=0x55555696e175 "STRINGIZE(m_currMap) && 0",
        file=0x55555696e0e0 "/home/gameadm/source/mangosbc/src/game/Entities/Object.h", line=874,
        function=0x555556970260 <WorldObject::GetMap() const::__PRETTY_FUNCTION__> "Map* WorldObject::GetMap() const") at assert.c:101
    #4  0x000055555603e35e in WorldObject::GetMap (this=0x7fffd723f300)
        at /home/gameadm/source/mangosbc/src/game/Entities/Object.h:874
    #5  0x00005555563de00e in PlayerbotMgr::HandleMasterIncomingPacket (
        this=0x7fffc000adf0, packet=...)
        at /home/gameadm/source/mangosbc/src/game/PlayerBot/Base/PlayerbotMgr.cpp:378
    #6  0x000055555642b828 in WorldSession::Update (this=0x7fffb00028f0, diff=66,
        updater=...)
        at /home/gameadm/source/mangosbc/src/game/Server/WorldSession.cpp:300
    #7  0x00005555565029c1 in World::UpdateSessions (this=0x5555571172b0, diff=66)
        at /home/gameadm/source/mangosbc/src/game/World/World.cpp:1968
    #8  0x0000555556500afa in World::Update (this=0x5555571172b0, diff=66)
        at /home/gameadm/source/mangosbc/src/game/World/World.cpp:1472
    #9  0x0000555555fd3b89 in WorldRunnable::run (this=0x55556a0d36a0)
        at /home/gameadm/source/mangosbc/src/mangosd/WorldRunnable.cpp:58
    #10 0x000055555600c1cd in MaNGOS::Thread::ThreadTask (param=0x55556a0d36a0)
        at /home/gameadm/source/mangosbc/src/shared/Threading.cpp:84
    #11 0x000055555600c759 in std::__invoke_impl<void, void (*)(void*), void*> (
        __f=@0x55556a0d5d90: 0x55555600c1a2 <MaNGOS::Thread::ThreadTask(void*)>,
        __args#0=@0x55556a0d5d88: 0x55556a0d36a0)
        at /usr/include/c++/7/bits/invoke.h:60
    #12 0x000055555600c44a in std::__invoke<void (*)(void*), void*> (
        __fn=@0x55556a0d5d90: 0x55555600c1a2 <MaNGOS::Thread::ThreadTask(void*)>,
        __args#0=@0x55556a0d5d88: 0x55556a0d36a0)
        at /usr/include/c++/7/bits/invoke.h:95
    #13 0x000055555600ca0f in std::thread::_Invoker<std::tuple<void (*)(void*), void*> >::_M_invoke<0ul, 1ul> (this=0x55556a0d5d88)
        at /usr/include/c++/7/thread:234
    #14 0x000055555600c9b0 in std::thread::_Invoker<std::tuple<void (*)(void*), void*> >::operator() (this=0x55556a0d5d88) at /usr/include/c++/7/thread:243
    #15 0x000055555600c980 in std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (*)(void*), void*> > >::_M_run (this=0x55556a0d5d80) at /usr/include/c++/7/thread:186
    #16 0x00007ffff6c4a66f in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
    #17 0x00007ffff7bbd6db in start_thread (arg=0x7fffce458700)
        at pthread_create.c:463
    #18 0x00007ffff630788f in clone ()
        at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

berserkingyadis commented 4 years ago

Could not reproduce on the latest core and DB on Windows 10.

My characters:

  1. human mage (portal maker)
  2. human warrior
  3. human rogue

Took portals from Elwynn Forest to the Exodar and from the Exodar to Darnassus without a crash.

Please describe more concretely how you reproduced the error.

https://github.com/cmangos/mangos-tbc/commit/4e5e7bf688ba3ccdd7026f30b3cb42fdb4a4a70b https://github.com/cmangos/tbc-db/commit/82e3d53c038f015d2a86ee280a4e3600ff248dac

System: Windows10

cala commented 4 years ago

Thanks for the report. We had this issue long ago, but I fixed it and have not had any problem since. Could you describe the steps to reproduce the issue more precisely? Who was casting the portal? A player? A bot? Were they in the same party?

Mr-Deadbeat commented 4 years ago

I played as the mage (Gnome) and had 4 bots in the party: Human Warrior, Human Paladin, Human Priest and Gnome Warlock. I cast the portal and right-clicked it, and the server segfaulted. Both times it happened in Tanaris, just outside the city walls of Gadgetzan.

also this happened on OS Ubuntu 18.04.4 Server

game client was run on Windows 10

The group portal was cast to Ironforge.

I forgot to mention the warlock's pet at the time, which was a Felhunter.

hope this clarifies things :)

Mr-Deadbeat commented 4 years ago

Another update: this crash seems to happen only in the Tanaris area. I tried the same race and group combination from Elwynn Forest to Ironforge with no segfault. I did it twice to make sure, with no issues either time.

Another update: this only seems to happen with cross-continent portals. I tried Auberdine to Ironforge and the result was a segfault. Portals within the Eastern Kingdoms do not produce segfaults, at least not from the locations I have tried (Blasted Lands and Elwynn Forest). My mage doesn't have the Darnassus portal yet, so I can't test travel from the Eastern Kingdoms to Kalimdor.

berserkingyadis commented 4 years ago

Reproduced the crash on Arch Linux.

A portal from Auberdine/Darnassus to Ironforge, or vice versa, crashed the core. Same error as @Mr-Deadbeat.

https://github.com/cmangos/mangos-tbc/commit/4e5e7bf688ba3ccdd7026f30b3cb42fdb4a4a70b https://github.com/cmangos/tbc-db/commit/82e3d53c038f015d2a86ee280a4e3600ff248dac

berserkingyadis commented 4 years ago

Appears to be a Linux thing?

cala commented 4 years ago

Confirmed on all cores.

The crash happens when the player uses the portal GO and is teleported. CMSG_GAMEOBJ_USE is sent, and the Playerbot code then checks whether the bot's map differs from the player's map in order to teleport the bot to the master. The crash happens when checking the player's new map.

This code was previously working (in late 2019 at least) and now it raises an error when checking the map of the player if it has changed (no error if the player is teleported to the same map).

There was no change to Playerbot code in this part for months if not years, so I assume the error comes from a recent change in the way players are teleported or change map.

To be investigated...
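
The failure mode described above can be modelled in a few lines. This is only a sketch under assumptions: `ModelPlayer`, `BeginFarTeleport`, `FinishTeleport`, `FindMap` and `WouldAssertInGetMap` are hypothetical names, not CMaNGOS API, and the idea that a cross-map teleport leaves the map pointer null until the transfer completes is an assumption that matches the symptoms reported in this thread, not something verified against the core.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for the real classes; illustrative only.
struct Map { std::uint32_t id; };

class ModelPlayer
{
public:
    explicit ModelPlayer(Map* map) : m_currMap(map) {}

    // Same-map teleport: the map pointer is never cleared.
    void TeleportWithinMap() { m_beingTeleported = true; }

    // Cross-map teleport: ASSUMED to detach the player from the old map
    // until the transfer to the destination map completes.
    void BeginFarTeleport() { m_beingTeleported = true; m_currMap = nullptr; }

    void FinishTeleport(Map* destination)
    {
        m_currMap = destination;
        m_beingTeleported = false;
    }

    bool IsBeingTeleported() const { return m_beingTeleported; }

    // Non-asserting accessor (hypothetical) so the state can be inspected safely.
    Map* FindMap() const { return m_currMap; }

    // Asserting accessor, in the spirit of the GetMap() seen in the backtrace.
    Map* GetMap() const { assert(m_currMap); return m_currMap; }

private:
    Map* m_currMap;
    bool m_beingTeleported = false;
};

// True if calling GetMap() on this player right now would trip the assertion.
bool WouldAssertInGetMap(const ModelPlayer& player)
{
    return player.FindMap() == nullptr;
}
```

In this model a same-map teleport never makes `GetMap()` dangerous, while a cross-map teleport does until `FinishTeleport` runs, which would match the reports that only cross-continent portals crash.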

cala commented 4 years ago

Invoking @killerwife: does this ring a bell to you?

I quickly scrolled through Git history but found nothing that stands out at first glance.

cyberium commented 4 years ago

That code is now unsafe due to the multithreading update.

Some parts have to be handled a bit differently.

Can you please try replacing the following code, found at line 378 in src/game/PlayerBot/Base/PlayerbotMgr.cpp:

                if (bot->GetMap() != m_master->GetMap())
                    return;

                GameObject* obj = m_master->GetMap()->GetGameObject(objGUID);

with

                Map* masterMap = m_master->GetMap();
                if (!masterMap || bot->GetMap() != masterMap || m_master->IsBeingTeleported())
                    return;

                GameObject* obj = masterMap->GetGameObject(objGUID);

Probably more of that kind will be found in the future.

cala commented 4 years ago

Sorry. You are right, I forgot to mention that indeed the segfault occurs at line

if (bot->GetMap() != m_master->GetMap())

in src/game/PlayerBot/Base/PlayerbotMgr.cpp, more precisely when evaluating m_master->GetMap(). bot->GetMap() still returns the correct result as far as my tests went yesterday evening.

I'll test your code suggestion today or tonight, depending on available free time.

> Probably more of that kind will be found in the future.

I hope not... 😕

cala commented 4 years ago

@cyberium: tested just now, and the result is unfortunately the same. Evaluating m_master->GetMap() calls

Map* WorldObject::GetMap() const { MANGOS_ASSERT(m_currMap); return m_currMap; }

At the time m_master->GetMap() is evaluated, m_currMap is null for the player, which causes the assertion in GetMap() to fail. 😕

I'm afraid we need another check here.
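
To illustrate why a null check on the result of `GetMap()` cannot help, and what kind of additional check might, here is a minimal sketch. It is not the project's actual fix. `ModelPlayer` and `CanUseMasterGameObject` are hypothetical names, and the sketch assumes that the teleport flag is already set whenever the map pointer is null, which has not been verified against the core.

```cpp
#include <cassert>

struct Map {};

// Hypothetical minimal model of a player; not the real Player class.
struct ModelPlayer
{
    Map* currMap;
    bool beingTeleported;

    bool IsBeingTeleported() const { return beingTeleported; }

    // Asserting accessor: aborts before returning when currMap is null,
    // so a caller never gets the chance to test the result for null.
    Map* GetMap() const { assert(currMap); return currMap; }
};

// The teleport flags are tested first. Because of the early return,
// GetMap() is not reached while either map pointer may still be null.
bool CanUseMasterGameObject(const ModelPlayer& bot, const ModelPlayer& master)
{
    if (master.IsBeingTeleported() || bot.IsBeingTeleported())
        return false;

    return bot.GetMap() == master.GetMap();
}
```

The key point is the ordering: any guard has to run before the asserting accessor is called, or the lookup has to go through an accessor that does not assert.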

cala commented 4 years ago

Closed in https://github.com/cmangos/mangos-classic/commit/6c2c6ca1954bd55efae1b45c35b2f42f7ed8e5d2. It will be ported soon to TBC and WotLK. Thanks @cyberium!

wanglinb741 commented 3 weeks ago

[Uploading 910d5944mangosd.exe[30-9_13-18-54].txt…]()

wanglinb741 commented 3 weeks ago

WorldObject::GetMap(): m_currMap