mattgodbolt / xania

Xania MUD source
Other
58 stars 13 forks source link

Crash seen in prod in affect_modify #218

Closed mattgodbolt closed 3 years ago

mattgodbolt commented 4 years ago

Stack trace:

Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00005571e81580f9 in affect_modify(Char*, AFFECT_DATA const&, bool) [clone .part.0] ()
(gdb) bt
#0  0x00005571e81580f9 in affect_modify(Char*, AFFECT_DATA const&, bool) [clone .part.0] ()
#1  0x00005571e81582f1 in affect_remove_obj(OBJ_DATA*, AFFECT_DATA const&) ()
#2  0x00005571e819c4ab in obj_update() ()
#3  0x00005571e819c7c1 in update_handler() ()
#4  0x00005571e8135801 in game_loop_unix(Fd) ()
#5  0x00005571e812c311 in main ()

One character online but wasn't doing anything at the time (CC @pierrewery )

mattgodbolt commented 4 years ago
=> 0x00005571e81580f9 <+41>:    cmpw   $0x10,0x100(%rbx)
(gdb) p $rbx
$1 = 7070758576040190017

707075... is not a pointer. it's garbage with a bit of ASCII-looking stuff: "ppu\x85v\x04..."

rbx is the 0x70th offset pointer (14th 8b offset) in the rdi (Char ),which is the AFFECT_DATA first_ in the AffectList. That makes me think the Char* has been deleted and recycled already by this point.

mattgodbolt commented 4 years ago

Further debugging hampered by idiot me not including debug symbols. 22d4650767d5800f8674bb3134f5c5629c32da06 rectifies

mattgodbolt commented 4 years ago

argh

(pdb) p /x $rbx 
0x62206176616c2041

which is ascii though not anything recognisable.

>>> "".join([chr(int(x, 16)) for x in re.findall('..', s)])
'b aval A'
mattgodbolt commented 4 years ago

of course, little endian, it's backwards:

(gdb) p (char *)($rdi + 0x70)
$6 = 0x5571e8cdb280 "A lava beast spews from place to place.\n\r"
mattgodbolt commented 4 years ago

so: the actual address where we'd expect the pointer of the first elem of the char's AffectList is not a pointer: it's an actual ASCII string. Pretty compelling evidence the Char has gone and has been replaced by something else now, but some object has a pointer to it still as a carried_by

mattgodbolt commented 4 years ago

Other junk around the Char: "(gdb) p (char*)($rdi + 48) $15 = 0x5571e8cdb240 "A little imp runs along being annoying.\n\r" "

mattgodbolt commented 4 years ago

@pierrewery do you remember what area you were in? I wonder if the mobs you were killing have decaying objects?

mattgodbolt commented 4 years ago

The character was in Mega City. Will check that out when I have time.

mattgodbolt commented 4 years ago

"In case it is any more help, I was in the Mahmount Allesh Megway room for yesterday's crash."

mattgodbolt commented 3 years ago

Happened again with same char - two big fights then an afk in an area.

654 /home/runner/work/xania/xania/src/handler.cpp: No such file or directory.
(gdb) bt
#0  get_eq_char (iWear=16, ch=ch@entry=0x5617195c6ef0) at /home/runner/work/xania/xania/src/handler.cpp:654
#1  affect_modify (ch=ch@entry=0x5617195c6ef0, fAdd=false, af=...) at /home/runner/work/xania/xania/src/handler.cpp:383
#2  0x0000561718151921 in affect_modify (fAdd=false, af=..., ch=0x5617195c6ef0) at /home/runner/work/xania/xania/src/Char.hpp:128
#3  affect_remove_obj (obj=0x7f2d8dd00be0 <main_arena+96>, af=...) at /home/runner/work/xania/xania/src/handler.cpp:429
#4  0x000056171819754b in <lambda(auto:31&)>::operator()<AFFECT_DATA> (__closure=<optimized out>, __closure=<optimized out>, af=...) at /usr/include/c++/9/bits/hashtable.h:748
#5  AffectList::modification_safe_for_each<obj_update()::<lambda(auto:31&)> > (this=0x7f2d8dd00c20 <main_arena+160>, func=...) at /home/runner/work/xania/xania/src/AffectList.hpp:38
#6  obj_update () at /home/runner/work/xania/xania/src/update.cpp:668
#7  0x0000561718197861 in update_handler () at /home/runner/work/xania/xania/src/update.cpp:908
mattgodbolt commented 3 years ago
     The corpse of the Draconian is lying here.
     The splattered brains of the Draconian are lying here.
     The corpse of a bodyguard is lying here.

was on the ground when the mud died

mattgodbolt commented 3 years ago

he'd killed the 2 guys in 2207, then went down and slept and afk'd he also autolooted he had also done 'create food', collected the mushroom and eaten it a minute or two before afking

mattgodbolt commented 3 years ago

the "remove set" has '65' in it

(gdb) info locals
removed_this_tick_with_msg = std::unordered_set with 1 element = {[0] = 65}

and

(gdb) p af
$5 = (const AFFECT_DATA &) @0x56171958cd60: {next = 0x40, type = 65, level = 0, duration = 0, location = 425166288, modifier = 22039, bitvector = 2379222032}

the affect at frame #2 also has a type of 65, and a bogus-looking next. This seems like a totally bogus AFFECT_DATA on a totally bogus object.

mattgodbolt commented 3 years ago

Hypotehsis: it's a special case of #182 : in update.cpp

    for (obj = object_list; obj != nullptr; obj = obj_next) {

standard "cope with obj being deleted` BUT

if (obj->item_type == ITEM_CORPSE_PC && obj->contains) { /* save the contents */

if we delete a corpse....and it has items....

                if (obj->in_obj) /* in another object */
                    obj_to_obj(t_obj, obj->in_obj);

                if (obj->carried_by) /* carried */
                    obj_to_char(t_obj, obj->carried_by);

                if (obj->in_room == nullptr) /* destroy it */
                    extract_obj(t_obj);

all manner of other manipulations to the obj_list IFFF one of those is obj_next BOOM I think it will probably rely on the exact order of the global list though corpse->item on corpse that should be destroyed

mattgodbolt commented 3 years ago

Ah criticaly this is only for ITEM_CORPSE_PC :|

mattgodbolt commented 3 years ago

But...if a corpse has an item that itself contains things, during extract_obj we recursively extract its contents, which has the same effect (that is, it could delete something that obj_next is pointing at.

mattgodbolt commented 3 years ago

Can't see how this is happening though: tried using create food mushrooms that expired, springs that expired, the draconian guard picked up the mushroom I noticed... argh

mattgodbolt commented 3 years ago

Reprod! Repro case:

Screenshot from 2020-09-21 18-39-47

mattgodbolt commented 3 years ago

no need to get the coins

mattgodbolt commented 3 years ago

oh....that's it. so simple.

Ok! So it is #182 after all! yay!

mattgodbolt commented 3 years ago

Confirmed the WIP cf87a7c does indeed fix this issue.