otland / forgottenserver

A free and open-source MMORPG server emulator written in C++
https://otland.net
GNU General Public License v2.0
1.58k stars 1.05k forks source link

An old TFS crash unsolved #2682

Open caioabranches2009 opened 4 years ago

caioabranches2009 commented 4 years ago

Hello, recently I got a crash on my TFS related to item destructor.

The referenced line is:

item->setParent(nullptr);

My teorie is that item is being removed from memory but without updating inventory vector from players.cpp.

So, player destructor will try to use a pointer for a item on inventory vector, and crashes server (read memory access violation).

If confirmed, it's present on every TFS. Maybe an old unknow bug or a very rare bug.

Thread 2 (Thread 0x7ffff5a51700 (LWP 31311)):
#0  0x00007fffce3ca680 in ?? ()
No symbol table info available.
#1  0x000000000059419e in Player::~Player (this=0x7fffcddc69e0) at /home/otserv1064/tfs/src/player.cpp:60
        item = 0x7fffce3ca690
        __for_range = @0x7fffcddc6fe8: {0x0, 0x7fffce6a7070, 0x0, 0x7fffcfa39c70, 0x0, 0x7fffce916dc0, 0x7fffce3ca690, 0x0, 0x0, 0x0, 0x7fffccfd5880}
        __for_begin = 0x7fffcddc7018
        __for_end = 0x7fffcddc7040
#2  0x00000000005943a2 in Player::~Player (this=0x7fffcddc69e0) at /home/otserv1064/tfs/src/player.cpp:71
No locals.
#3  0x000000000065009a in Creature::decrementReferenceCounter (this=0x7fffcddc69e0) at /home/otserv1064/tfs/src/creature.h:449
No locals.
#4  0x0000000000604f41 in Game::cleanup (this=0xa40ce0 <g_game>) at /home/otserv1064/tfs/src/game.cpp:3849
        creature = 0x7fffcddc69e0
        __for_range = Python Exception <class 'RuntimeError'> Type is not a template.: 
@0xa411a8: Python Exception <class 'RuntimeError'> Type is not a template.: 
{<No data fields>}
        __for_begin = 
        __for_end = 
#5  0x000000000060a212 in Game::checkCreatures (this=0xa40ce0 <g_game>, index=1) at /home/otserv1064/tfs/src/game.cpp:3199
        __func__ = "checkCreatures"
        checkCreatureList = @0xa410c8: {<No data fields>}
        it = Python Exception <class 'ValueError'> Cannot find type std::_List_iterator::_Node: 

        end = Python Exception <class 'ValueError'> Cannot find type std::_List_iterator::_Node: 

#6  0x000000000060b2ba in operator() (this=0x7fffce6b92d0, __object=0xa40ce0 <g_game>, __args#0=@0x7fffce6b92e0: 1) at /usr/include/c++/5/functional:600
No locals.
#7  0x000000000060f9bf in __call (this=0x7fffce6b92d0, __args=<unknown type in /home/otserv1064/tfs/tfs, CU 0x21b4b4, DIE 0x234c81>) at /usr/include/c++/5/functional:1074
No locals.
#8  0x0000000000615784 in std::_Bind<std::_Mem_fn<void (Game::*)(unsigned long)> (Game*, unsigned long)>::operator()<, void>() [clone .lto_priv.1926] (this=0x7fffce6b92d0) at /usr/include/c++/5/functional:1133
No locals.
#9  0x00000000005f6839 in std::_Function_handler<void (), std::_Bind<std::_Mem_fn<void (Game::*)(unsigned long)> (Game*, unsigned long)> >::_M_invoke(std::_Any_data const&) (__functor=...) at /usr/include/c++/5/functional:1871
No locals.
#10 0x0000000000570ffc in std::function::operator() (this=0x7fffceb4c968) at /usr/include/c++/5/functional:2267
No locals.
#11 0x000000000057101a in Task::operator() (this=0x7fffceb4c910) at /home/otserv1064/tfs/src/tasks.h:41
No locals.
#12 0x000000000058219b in Dispatcher::threadMain (this=0xa377c0 <g_dispatcher>) at /home/otserv1064/tfs/src/tasks.cpp:71
        task = 0x7fffceb4c910
        taskLockUnique = {_M_device = 0xa377d8 <g_dispatcher+24>, _M_owns = false}
        time_point = {__d = {__r = 1571575320674003775}}
#13 0x000000000058538f in std::_Mem_fn_base::operator() (this=0xabc6d0, __object=0xa377c0 <g_dispatcher>) at /usr/include/c++/5/functional:600
No locals.
#14 0x00000000005896e5 in std::_Bind_simple::_M_invoke (this=0xabc6c8) at /usr/include/c++/5/functional:1531
No locals.
#15 0x0000000000589709 in std::_Bind_simple::operator() (this=0xabc6c8) at /usr/include/c++/5/functional:1520
No locals.
#16 0x000000000058972c in std::thread::_Impl::_M_run (this=0xabc6b0) at /usr/include/c++/5/thread:115
No locals.
#17 0x00007ffff6449c80 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
No symbol table info available.
#18 0x00007ffff671a6ba in start_thread (arg=0x7ffff5a51700) at pthread_create.c:333
        __res = <optimized out>
        pd = 0x7ffff5a51700
        now = <optimized out>
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140737314625280, -8600867084247919482, 0, 140737488347775, 140737314625984, 0, 8600889845969949830, 8600881430831535238}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimized out>
        pagesize_m1 = <optimized out>
        sp = <optimized out>
        freesize = <optimized out>
        __PRETTY_FUNCTION__ = "start_thread"
#19 0x00007ffff7b1441d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
No locals.
nekiro commented 4 years ago

I guess you don't know how to reproduce that right?

Leo32onGIT commented 4 years ago

Related to this maybe #2238

Zbizu commented 3 years ago

@Leo32onGIT indeed, the log shows a ::cleanup mentioned in the issue you linked. I have no idea what could be done to reproduce this, but the fix for that problem is in that post you linked.

caioabranches2009 commented 2 years ago

@Leo32onGIT indeed, the log shows a ::cleanup mentioned in the issue you linked. I have no idea what could be done to reproduce this, but the fix for that problem is in that post you linked.

there's no fix for it actually, it's openned to find a solution on tfs 1.2 repo still happening, and a tip from another user that's facing it is related to item remove function using Game::internalRemoveItem, more specifically on training weapons that use remove and move functions, like spears and small stones

on my logs, last crash i got that is really a item used on right hand with move or remove option

gesior commented 2 years ago

How to reproduce [I get this procedure from guy on Discord, not tested]:

Some idea: https://github.com/otland/forgottenserver/blob/master/src/game.cpp#L3815-L3817

                creature->onThink(EVENT_CREATURE_THINK_INTERVAL);
                creature->onAttacking(EVENT_CREATURE_THINK_INTERVAL);
                creature->executeConditions(EVENT_CREATURE_THINK_INTERVAL);

It first does onThink, which kick idle player and 'remove' him. Then it does 'attacking' - even, if player is already marked as removed. I did not check how it affects item movement system. For some reason moving Spear from slot to ground may change Item which is in slot and for some reason Player (which is 'parent' of that Item) may not get information about it, as Player* is already marked as 'removed' (when you check it with isRemoved()).

This crash probably happens only on OTSes that has distance weapons dropping on floor. Like some downgrade 7.x OTSes where Spear drops on floor, when you throw it and you got to pick it up again.

EDIT: It looks like replacing order of functions fix issue (first 'attack', then 'think'):

                creature->onAttacking(EVENT_CREATURE_THINK_INTERVAL);
                creature->onThink(EVENT_CREATURE_THINK_INTERVAL);
caioabranches2009 commented 2 years ago

Awesome. @gesior. My server was a 7x downgraded, and the crash I had the item was on right hand slot. So, it makes sense. What's the next steps?

Tested trying to reproduce but did not crash:

a) Attacked a player and spear dropped on ground; b) Got spear again in hands d) Waited kick for iddle time.

EDIT:

Tested again and it crashes with the same backtrace of this thread.

EDIT 2:

Tested with this possible fix and it worked, no more crashes.

gesior commented 2 years ago

@DSpeichert Can you change labels to 'confirmed' or 'bug'? We got 2 confirmations of that bug and reproduction scenario.

I also posted fix in post above. I'm not sure, if it's fine for everyone. Maybe we should add some extra isRemoved() checks in executeXxx functions.

caioabranches2009 commented 2 years ago
#0  0x0000000000000025 in ?? ()
No symbol table info available.
#1  0x00005555556841ed in MoveEvents::onItemMove (this=0x7fffe4047c90, item=0x7fffd671ac50, tile=0x7fff74157710, isAdd=true) at /home/tfs/src/movement.cpp:365
        thing = 0x7fffce60e9d0
        tileItem = 0x7fff741576e0
        i = 1
        j = 17
        eventType1 = MOVE_EVENT_ADD_ITEM
        eventType2 = MOVE_EVENT_ADD_ITEM_ITEMTILE
        ret = 1
        moveEvent = 0x0
#2  0x000055555566e95b in Tile::postAddNotification(Thing*, Cylinder const*, int, cylinderlink_t) (this=0x7fff74157710, thing=0x7fffd671ac50, oldParent=0x0, index=2, link=LINK_OWNER) at /home/tfs/src/tile.cpp:1328
        list = {_M_h = <incomplete type>}
        creature = 0x0
        item = 0x7fffd671ac50
#3  0x00005555557d7857 in Game::internalAddItem(Cylinder*, Item*, int, unsigned int, bool, unsigned int&) (this=0x555555bb3aa0 <g_game>, toCylinder=0x7fff74157710, item=0x7fffd671ac50, index=-1, flags=1, test=false, remainderCount=@0x7ffff28d0454: 0) at /home/tfs/src/game.cpp:1241
        itemIndex = 2
        destCylinder = 0x7fff74157710
        toItem = 0x7fff52a10730
        ret = RETURNVALUE_NOERROR
        maxQueryCount = 1
#4  0x00005555557d78e0 in Game::internalAddItem(Cylinder*, Item*, int, unsigned int, bool) (this=0x555555bb3aa0 <g_game>, toCylinder=0x7fff74157710, item=0x7fffd671ac50, index=-1, flags=1, test=false) at /home/tfs/src/game.cpp:1168
        remainderCount = 0
#5  0x00005555557d7a38 in Game::combatGetTypeInfo(CombatType_t, Creature*, TextColor_t&, unsigned char&) (this=0x555555bb3aa0 <g_game>, combatType=COMBAT_PHYSICALDAMAGE, target=0x7fff3bd56170, color=@0x7ffff28d04ea: TEXTCOLOR_RED, effect=@0x7ffff28d04eb: 1 '\001') at /home/tfs/src/game.cpp:3443
        splash = 0x7fffd671ac50
#6  0x00005555557df03e in Game::combatChangeHealth(Creature*, Creature*, CombatDamage&) (this=0x555555bb3aa0 <g_game>, attacker=0x7fff4ca17c30, target=0x7fff3bd56170, damage=...) at /home/tfs/src/game.cpp:3595
        attackerPlayer = 0x7fff4ca17c30
        healthChange = 40
        list = {_M_h = {_M_buckets = 0x7fff12e7b380, _M_bucket_count = 3, _M_before_begin = {_M_nxt = 0x7fffdad24010}, _M_element_count = 2, _M_rehash_policy = {_M_max_load_factor = 1, _M_next_resize = 3}, _M_single_bucket = 0x0}}
        color = TEXTCOLOR_RED
        realDamage = 7
        targetHealth = 7
        hitEffect = 1 '\001'
        targetPos = @0x7fff3bd56178: {x = 33299, y = 32291, z = 9 '\t'}
#7  0x000055555585a4ff in Combat::CombatHealthFunc(Creature*, Creature*, CombatParams const&, CombatDamage*) (caster=0x7fff4ca17c30, target=0x7fff3bd56170, params=..., data=0x7ffff28d0890) at /home/tfs/src/combat.cpp:486
        damage = {type = COMBAT_PHYSICALDAMAGE, value = 7, min = 0, max = 0}
#8  0x000055555585b212 in Combat::doCombatHealth(Creature*, Creature*, CombatDamage&, CombatParams const&) (caster=0x7fff4ca17c30, target=0x7fff3bd56170, damage=..., params=...) at /home/tfs/src/combat.cpp:827
        canCombat = true
#9  0x00005555557a4748 in Weapon::internalUseWeapon(Player*, Item*, Creature*, int) const (this=0x7fffe4104990, player=0x7fff4ca17c30, item=0x7fff36707090, target=0x7fff3bd56170, damageModifier=100) at /home/tfs/src/weapons.cpp:377
        damage = {type = COMBAT_PHYSICALDAMAGE, value = -80, min = 0, max = 0}
#10 0x00005555557a5505 in WeaponMelee::useWeapon(Player*, Item*, Creature*) const (this=0x7fffe4104990, player=0x7fff4ca17c30, item=0x7fff36707090, target=0x7fff3bd56170) at /home/tfs/src/weapons.cpp:518
        damageModifier = 100
#11 0x00005555556f1fdb in Player::doAttacking(unsigned int) (this=0x7fff4ca17c30) at /home/tfs/src/player.cpp:3106
        result = false
        tool = 0x7fff36707090
        weapon = 0x7fffe4104990
        __func__ = "doAttacking"
#12 0x000055555581faa1 in Creature::onAttacking(unsigned int) (this=0x7fff4ca17c30, interval=1000) at /home/tfs/src/creature.cpp:176
No locals.
#13 0x00005555557df733 in Game::checkCreatures(unsigned long) (this=0x555555bb3aa0 <g_game>, index=3) at /home/tfs/src/game.cpp:3280
        creature = 0x7fff4ca17c30
        __func__ = "checkCreatures"
        checkCreatureList = @0x555555bb3ec0: {<No data fields>}
        it = {_M_node = 0x7fff407ba440}
        end = {_M_node = 0x555555bb3ec0 <g_game+1056>}
#14 0x00005555557adda4 in std::__invoke_impl (__f=@0x7fffda9b4160: (void (Game::*)(Game * const, unsigned long)) 0x5555557df4b8 <Game::checkCreatures(unsigned long)>, __t=@0x7fffda9b4178: 0x555555bb3aa0 <g_game>, __args#0=@0x7fffda9b4170: 3) at /usr/include/c++/7/bits/invoke.h:73
No locals.
#15 0x00005555557ae4e9 in std::__invoke (__fn=@0x7fffda9b4160: (void (Game::*)(Game * const, unsigned long)) 0x5555557df4b8 <Game::checkCreatures(unsigned long)>, __args#0=@0x7fffda9b4178: 0x555555bb3aa0 <g_game>, __args#1=@0x7fffda9b4170: 3) at /usr/include/c++/7/bits/invoke.h:95
No locals.
#16 0x00005555557af97e in std::_Bind::__call (this=0x7fffda9b4160, __args=...) at /usr/include/c++/7/functional:467
No locals.
#17 0x00005555557b6761 in std::_Bind<void (Game::*(Game*, unsigned long))(unsigned long)>::operator()<, void>() (this=0x7fffda9b4160) at /usr/include/c++/7/functional:551
No locals.
#18 0x00005555557c0b8e in std::_Function_handler<void (), std::_Bind<void (Game::*(Game*, unsigned long))(unsigned long)> >::_M_invoke(std::_Any_data const&) (__functor=...) at /usr/include/c++/7/bits/std_function.h:316
No locals.
#19 0x0000555555662846 in operator() (this=0x7fffce2f2d38) at /usr/include/c++/7/bits/std_function.h:706
No locals.
#20 0x0000555555662864 in operator() (this=0x7fffce2f2ce0) at /home/tfs/src/tasks.h:41
No locals.
#21 0x0000555555670934 in Dispatcher::threadMain() (this=0x555555ba9a20 <g_dispatcher>) at /home/tfs/src/tasks.cpp:71
        task = 0x7fffce2f2ce0
        taskLockUnique = {_M_device = 0x555555ba9a38 <g_dispatcher+24>, _M_owns = false}
        time_point = {__d = {__r = 1648000752194941877}}
#22 0x0000555555694141 in std::__invoke_impl (__f=@0x555555bc9320: (void (Dispatcher::*)(Dispatcher * const)) 0x5555556707a0 <Dispatcher::threadMain()>, __t=@0x555555bc9318: 0x555555ba9a20 <g_dispatcher>) at /usr/include/c++/7/bits/invoke.h:73
No locals.
#23 0x000055555569440a in std::__invoke (__fn=@0x555555bc9320: (void (Dispatcher::*)(Dispatcher * const)) 0x5555556707a0 <Dispatcher::threadMain()>, __args#0=@0x555555bc9318: 0x555555ba9a20 <g_dispatcher>) at /usr/include/c++/7/bits/invoke.h:95
No locals.
#24 0x000055555569446f in std::thread::_Invoker::_M_invoke (this=0x555555bc9318) at /usr/include/c++/7/thread:234
No locals.
#25 0x00005555556944a2 in std::thread::_Invoker::operator() (this=0x555555bc9318) at /usr/include/c++/7/thread:243
No locals.
#26 0x00005555556944d8 in std::thread::_State_impl::_M_run (this=0x555555bc9310) at /usr/include/c++/7/thread:186
No locals.
#27 0x00007ffff60e76df in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
No symbol table info available.
#28 0x00007ffff63ba6db in start_thread (arg=0x7ffff28d1700) at pthread_create.c:463
        pd = 0x7ffff28d1700
        now = <optimized out>
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140737262720768, -7239375647147944522, 140737262718848, 0, 93824999002896, 140737488346064, 7239385967380010422, 7239393752809608630}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimized out>
#29 0x00007ffff7b0361f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
No locals.

@gesior I don't know if it can be related, but I remember this other crash that seens to be the same procedure of this one. If we follow the call stack:

a) check creatures runs; b) did an attack to a possible mob or player; c) mob or player died; d) internal add item to tile (maybe spear?) e) onItemMove crash.

The difference that it crashed on movements.cpp. What do you think?

gesior commented 2 years ago
The difference that it crashed on movements.cpp. What do you think?

Looks like not related. Maybe there is some movements script in Lua that remove item [or 2 of them on given tile]. Item added is 'splash' (blood). Code adds item with FLAG_NOLIMIT and expect if will be added - does not check 'return value':

            if (splash) {
                internalAddItem(target->getTile(), splash, INDEX_WHEREEVER, FLAG_NOLIMIT);
                startDecay(splash);
            }

Maybe it fails, if item is removed by movements Lua script.

caioabranches2009 commented 2 years ago
The difference that it crashed on movements.cpp. What do you think?

Looks like not related. Maybe there is some movements script in Lua that remove item [or 2 of them on given tile]. Item added is 'splash' (blood). Code adds item with FLAG_NOLIMIT and expect if will be added - does not check 'return value':

          if (splash) {
              internalAddItem(target->getTile(), splash, INDEX_WHEREEVER, FLAG_NOLIMIT);
              startDecay(splash);
          }

Maybe it fails, if item is removed by movements Lua script.

There are no movements setup on this position, the only that may be related is trap.

Is there any hotfix, maybe?

Should I open a new thread for this?

gesior commented 2 years ago

There are no movements setup on this position, the only that may be related is trap.

Is there any hotfix, maybe?

Should I open a new thread for this?

#1  0x00005555556841ed in MoveEvents::onItemMove (this=0x7fffe4047c90, item=0x7fffd671ac50, tile=0x7fff74157710, isAdd=true) at /home/tfs/src/movement.cpp:365
        thing = 0x7fffce60e9d0
        tileItem = 0x7fff741576e0
        i = 1
        j = 17

@caioabranches2009 There were 17 items on that tile. It crashed on i = 1 position, so it wasn't ground. If you still got that binary file tfs and core. You can run gdb and check item ID of item at RAM address tileItem = 0x7fff741576e0. If you don't know how, message me on Discord: Gesior.pl#3208