LandSandBoat / server

:sailboat: LandSandBoat - a server emulator for Final Fantasy XI
https://landsandboat.github.io/server/
GNU General Public License v3.0

Crashes resulting from Trust Magic & Enmity / Notoriety #1849

Closed CatsEyeXI closed 2 years ago

CatsEyeXI commented 2 years ago

Branch affected by issue

base

I think that the existing issue regarding the soft crashes should be limited to the server not terminating properly, and that I should start a new issue for the series of trust crashes we've been having.

image image image

CatsEyeXI commented 2 years ago

image

CatsEyeXI commented 2 years ago

Hey there, if this has anyone's attention, it would be comforting to know. We're really going through it at the moment, and it's extremely stressful 😓 This was opened 3 days ago and there hasn't been a peep from anyone.

zach2good commented 2 years ago

I'm on holiday, everyone else is doing their own thing. We'll get to it when we get to it.

CatsEyeXI commented 2 years ago

> I'm on holiday, everyone else is doing their own thing. We'll get to it when we get to it.

thank you zach! enjoy your holiday!

CatsEyeXI commented 2 years ago

image

```
    [Inline Frame] xi_map.exe!std::_Tree<std::_Tset_traits<CBattleEntity *,std::less<CBattleEntity *>,std::allocator<CBattleEntity *>,0>>::_Find_lower_bound(CBattleEntity * const &) Line 1592 C++
    [Inline Frame] xi_map.exe!std::_Tree<std::_Tset_traits<CBattleEntity *,std::less<CBattleEntity *>,std::allocator<CBattleEntity *>,0>>::_Find(CBattleEntity * const &) Line 1346 C++
    [Inline Frame] xi_map.exe!std::_Tree<std::_Tset_traits<CBattleEntity *,std::less<CBattleEntity *>,std::allocator<CBattleEntity *>,0>>::find(CBattleEntity * const &) Line 1356  C++
>   xi_map.exe!CNotorietyContainer::remove(CBattleEntity * entity) Line 58  C++
    xi_map.exe!CEnmityContainer::Clear(unsigned int EntityID) Line 72   C++
    xi_map.exe!CEnmityContainer::~CEnmityContainer() Line 50    C++
    xi_map.exe!CMobEntity::~CMobEntity() Line 155   C++
    [External Code]
    xi_map.exe!CZoneEntities::ZoneServer(std::chrono::time_point<std::chrono::system_clock,std::chrono::duration<__int64,std::ratio<1,10000000>>> tick, bool check_regions) Line 1486   C++
    xi_map.exe!CZone::ZoneServer(std::chrono::time_point<std::chrono::system_clock,std::chrono::duration<__int64,std::ratio<1,10000000>>> tick, bool check_regions) Line 871    C++
    xi_map.exe!zone_server_region(std::chrono::time_point<std::chrono::system_clock,std::chrono::duration<__int64,std::ratio<1,10000000>>> tick, CTaskMgr::CTask * PTask) Line 107  C++
    xi_map.exe!CTaskMgr::DoTimer(std::chrono::time_point<std::chrono::system_clock,std::chrono::duration<__int64,std::ratio<1,10000000>>> tick) Line 122    C++
    xi_map.exe!main(int argc, char * * argv) Line 278   C++
```

CatsEyeXI commented 2 years ago

image

```
>   [Inline Frame] xi_map.exe!std::_Tree<std::_Tset_traits<CBattleEntity *,std::less<CBattleEntity *>,std::allocator<CBattleEntity *>,0>>::_Find_lower_bound(CBattleEntity * const &) Line 1592 C++
    [Inline Frame] xi_map.exe!std::_Tree<std::_Tset_traits<CBattleEntity *,std::less<CBattleEntity *>,std::allocator<CBattleEntity *>,0>>::_Find(CBattleEntity * const &) Line 1346 C++
    [Inline Frame] xi_map.exe!std::_Tree<std::_Tset_traits<CBattleEntity *,std::less<CBattleEntity *>,std::allocator<CBattleEntity *>,0>>::find(CBattleEntity * const &) Line 1356  C++
    xi_map.exe!CNotorietyContainer::remove(CBattleEntity * entity) Line 58  C++
    xi_map.exe!CEnmityContainer::Clear(unsigned int EntityID) Line 72   C++
    xi_map.exe!CEnmityContainer::~CEnmityContainer() Line 50    C++
    xi_map.exe!CMobEntity::~CMobEntity() Line 155   C++
    [External Code]
    xi_map.exe!CZoneEntities::ZoneServer(std::chrono::time_point<std::chrono::system_clock,std::chrono::duration<__int64,std::ratio<1,10000000>>> tick, bool check_regions) Line 1485   C++
    xi_map.exe!CZone::ZoneServer(std::chrono::time_point<std::chrono::system_clock,std::chrono::duration<__int64,std::ratio<1,10000000>>> tick, bool check_regions) Line 871    C++
    xi_map.exe!zone_server_region(std::chrono::time_point<std::chrono::system_clock,std::chrono::duration<__int64,std::ratio<1,10000000>>> tick, CTaskMgr::CTask * PTask) Line 110  C++
    xi_map.exe!CTaskMgr::DoTimer(std::chrono::time_point<std::chrono::system_clock,std::chrono::duration<__int64,std::ratio<1,10000000>>> tick) Line 122    C++
    xi_map.exe!main(int argc, char * * argv) Line 278   C++
    [External Code]
```

CatsEyeXI commented 2 years ago

Hmm they look exactly the same
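
Both call stacks end the same way: `CMobEntity::~CMobEntity()` runs `CEnmityContainer::Clear`, which calls `CNotorietyContainer::remove` and faults inside a `std::set<CBattleEntity *>` lookup. One possible reading (an assumption, nothing in this thread confirms it) is that the mob's enmity list still points at an entity, such as a released trust, that was freed earlier. The sketch below uses hypothetical `Mob` and `Target` types, not the real server classes, to show two-way bookkeeping in which each side unlinks itself on destruction, so neither is left holding a dangling pointer:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <set>

// Hypothetical stand-ins for illustration only; NOT the LandSandBoat classes.
struct Mob;

struct Target
{
    std::set<Mob*> notoriety; // mobs that currently hold enmity on this target
    ~Target();                // defined below, once Mob is a complete type
};

struct Mob
{
    std::set<Target*> enmity; // targets this mob currently hates

    void addEnmity(Target& target)
    {
        enmity.insert(&target);
        target.notoriety.insert(this);
    }

    ~Mob()
    {
        // Same shape as the stack above: tearing down the mob touches every
        // target's notoriety set. Safe only if each pointer is still alive.
        for (Target* target : enmity)
        {
            target->notoriety.erase(this);
        }
    }
};

Target::~Target()
{
    // Unlink from every mob before the memory goes away, so that ~Mob()
    // never dereferences a freed target.
    for (Mob* mob : notoriety)
    {
        mob->enmity.erase(this);
    }
}

// Destroys the target first, then the mob, and reports how many enmity
// entries the mob still held just before its own destruction.
std::size_t enmityLeftAfterTargetDespawn()
{
    auto mob   = std::make_unique<Mob>();
    auto trust = std::make_unique<Target>();
    mob->addEnmity(*trust);

    trust.reset(); // target released while the mob is still alive
    const std::size_t left = mob->enmity.size();
    mob.reset();   // nothing stale to visit
    return left;
}
```

If `Target::~Target()` skipped the unlink step, `mob.reset()` would walk a set containing a freed pointer, which is the kind of access that would fault where the debugger stopped.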

RAIST5150 commented 2 years ago

A pattern seems to be forming around Shantotto II on xiweb? There are a couple of accounts of instability cropping up once players got their 5th trust and started adding her to the mix.

Koru doesn't seem to play nice with SMN either. That has been limited to client crashes though, not server crashes.

Not sure if it's related, but I had the client crash when trying to use Diabolos's magical blood pacts too (physical ones are OK). His animations are all mixed up though (he glitches out during blood pact animations), so it may not be related.

CatsEyeXI commented 2 years ago

> A pattern seems to be forming around Shantotto II on xiweb? There are a couple of accounts of instability cropping up once players got their 5th trust and started adding her to the mix.
>
> Koru doesn't seem to play nice with SMN either. That has been limited to client crashes though, not server crashes.
>
> Not sure if it's related, but I had the client crash when trying to use Diabolos's magical blood pacts too (physical ones are OK). His animations are all mixed up though (he glitches out during blood pact animations), so it may not be related.

have you saved any of the call stacks?

zach2good commented 2 years ago

While not the fix itself, this work by Winter (https://github.com/LandSandBoat/server/pull/1954) is the stepping stone to finding the root cause and fixing the issue, rather than slapping a quick null-check on something and pushing the crash into a different area of code.
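
To illustrate why a quick null-check tends to move this kind of crash rather than remove it: a raw pointer keeps its old, non-null address after the object it pointed to is destroyed, so `if (ptr != nullptr)` passes and the bad access just happens somewhere else. A minimal sketch with a hypothetical `Entity` type follows. `std::weak_ptr` appears only to show what a lifetime-aware check looks like; it is not a claim about how the server manages entities or about what the linked PR does.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>

// Hypothetical entity type for illustration only.
struct Entity
{
    int id;
};

struct Probe
{
    bool nullCheckPasses; // what `ptr != nullptr` would conclude
    bool weakSaysAlive;   // what a lifetime-aware handle reports
};

Probe probeAfterRelease()
{
    auto owner = std::make_shared<Entity>(Entity{ 42 });

    Entity*               raw  = owner.get(); // non-owning raw pointer
    std::weak_ptr<Entity> weak = owner;       // non-owning, lifetime-aware

    // Record the address as an integer so the dead pointer itself is never
    // inspected after the object has been destroyed.
    const auto address = reinterpret_cast<std::uintptr_t>(raw);

    owner.reset(); // entity destroyed; `raw` is not updated by this

    // The stored address is still non-zero, so a null-check would pass even
    // though dereferencing `raw` now would be undefined behaviour.
    return Probe{ address != 0, !weak.expired() };
}
```

Here `probeAfterRelease()` reports that the null-check still passes while the `weak_ptr` correctly says the entity is gone, which is why the root cause (who frees the entity, and who still holds the pointer) matters more than the check.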

WinterSolstice8 commented 2 years ago

Cannot reproduce with the aforementioned branch; Valgrind doesn't even complain. I've tried the various methods I've seen around:

* 2nd account that's not the leader homepoint with Apururu up

* Summon Apururu, attack bat, unsummon Apururu, resummon Apururu, attack bat

* Generally just use Shantotto II

I know it's a lot to ask, but I think we need a more reliable method to reproduce this without us bashing our heads against the wall. Maybe there's just something fundamentally different here that we don't have. Any hints would be nice too, especially if one method appears more reliable than others.

CatsEyeXI commented 2 years ago

> Cannot reproduce with the aforementioned branch; Valgrind doesn't even complain. I've tried the various methods I've seen around:
>
> * 2nd account that's not the leader homepoint with Apururu up
>
> * Summon Apururu, attack bat, unsummon Apururu, resummon Apururu, attack bat
>
> * Generally just use Shantotto II
>
> I know it's a lot to ask, but I think we need a more reliable method to reproduce this without us bashing our heads against the wall. Maybe there's just something fundamentally different here that we don't have. Any hints would be nice too, especially if one method appears more reliable than others.

I've tried to include any and all testimonials from my players along with the call stacks. I've not personally encountered this myself (I am left with very little time to actually play). May I suggest you join our Discord so you can view #server-crash-report? My players are asked to report immediately what specifically they were doing when a crash happens, and we have crashes about twice a day. Maybe if we team up on this we can cover more ground and get you the information you're looking for. I'm happy to do whatever is necessary to get to the bottom of this.

Upon a crash, I can call out the player associated and we can interrogate them as needed?

TeoTwawki commented 2 years ago

Player testimony very seldom yields actionable info. Most people don't realize that what they saw on their screen was not what crashed things: by the time you see something, it has already happened server-side, so what they saw came before the crash. A better pattern is to find the entity in the crash dump, and if it's a player you can talk to them, sure, but the function called at the time of the crash is almost always going to give you a clearer picture than what the player knows. I don't mean to say it's useless, just take it with a grain of salt and know that chances are they can't tell you much that helps in most cases.

CatsEyeXI commented 2 years ago

> Player testimony very seldom yields actionable info. Most people don't realize that what they saw on their screen was not what crashed things: by the time you see something, it has already happened server-side, so what they saw came before the crash. A better pattern is to find the entity in the crash dump, and if it's a player you can talk to them, sure, but the function called at the time of the crash is almost always going to give you a clearer picture than what the player knows. I don't mean to say it's useless, just take it with a grain of salt and know that chances are they can't tell you much that helps in most cases.

Yeah, I totally get that, I'm just not sure what additional information I can offer. I don't understand the inner workings as well as you guys do, so maybe there's an opportunity to ask the 'right' questions and dig deeper with the actual culprit than I know how to.

CatsEyeXI commented 2 years ago

This one looks a little different... Not sure if it's the same issue though... image

CatsEyeXI commented 2 years ago

These crashes seem to have been replaced with the isDead() / isAlive() crashes (as documented in another issue), so closing this one.