LandSandBoat / server

:sailboat: LandSandBoat - a server emulator for Final Fantasy XI
https://landsandboat.github.io/server/
GNU General Public License v3.0
294 stars 589 forks source link

🐛 Strange getName behavior for some NPCs #4879

Closed MowFord closed 9 months ago

MowFord commented 9 months ago

I affirm:

OS / platform the server is running (if known)

Branch affected by issue

base

Steps to reproduce

simple reproducing of the cosmetic issue (notice the missing remainder of the print output for npcid 17363346):

image

but here's the similar thing happening in ShowDebug (added code to the lua getLocalVar function):

uint32 CLuaBaseEntity::getLocalVar(std::string const& var)
{
    ShowDebug(fmt::format("LocalVar: {} ({})", m_PBaseEntity->getName(), m_PBaseEntity->id).c_str());
    return m_PBaseEntity->GetLocalVar(var.c_str());
}

image

I'm guessing the issue is related to an @ symbol in the npc name and calling c_str()?

This issue report isn't just to report that weird printing issue, though. I'm unable to reproduce the original issue that spawned this rabbit hole, but I'll past the crash text:

image

image

Not sure if ShowTrace calls different underlying functions than ShowDebug and maybe that's why it crashes vs just doesn't display the rest of the text. But would appreciate some guidance here

github-actions[bot] commented 9 months ago

✨ Thanks for the report! ✨

This is a friendly automated reminder that the maintainers won't look at your report until you've properly completed all of the checkboxes in the pre-filled template.

CatsEyeXI commented 9 months ago

full crash dump here: https://drive.google.com/file/d/1LHUCvfGOASdnXaSJRNzpSz1UteSJwa20/view?usp=sharing

MowFord commented 9 months ago

That crash dump is from cexi build, but the OP screenshots showing weird print behavior is from a clean lsb/base compile

zach2good commented 9 months ago

Adding some research: For some reason, the stored name has some arbitrary bytes on the end, mostly NULL (\0), but one of whatever \b is

image

EDIT: It's loaded and completely the correct size/length during startup, so it gets mutated somewhere

EDIT2: Truncation has already happened once we're down inside Lua print:

image
zach2good commented 9 months ago

A very hands-off test case: server.lua:

xi.server.onServerStart = function()
    xi.events.handler.checkSeasonalEvents()

    local npc = GetNPCByID(17363346)
    print(npc)
    print(npc:getName())
    print(string.format('SOME STUFF - NPC: %s - OTHER STUFF HERE', npc:getName()))
end
[12/19/23 16:35:41:536][map][lua] CLuaBaseEntity(TYPE_NPC | 17363346 | @3z0 (lua_print:200)
[12/19/23 16:35:41:536][map][lua] @3z0 (lua_print:200)
[12/19/23 16:35:41:536][map][lua] SOME STUFF - NPC: @3z0 (lua_print:200)
MowFord commented 9 months ago

interesting so it's loading from the db fine, but something is mutating the NPC's name value later down the pipeline?

MowFord commented 9 months ago

Definitely something with @ in the npc name though, changed 17363346 to _3z0 and it works fine

cocosolos commented 9 months ago

So I followed 17363346 through the whole startup routine, and the name changes during luautils::OnNpcSpawn. Looking at the onSpawn for this NPC, it calls setElevator, which resizes the name. Not sure why it does that, but that seems to be the issue.

cocosolos commented 9 months ago

And for some reason insertElevator changes a character to 8, which is where that \b is coming from.

TeoTwawki commented 9 months ago

I was thinking there might have been some voodoo somewhere to ensure the name was valid for a filename because windows limits valid characters we can use, but @ is a valid character so they wouldn't have made it into an underscore back in dsp. But maybe elevators had some jank reason for what you've found and its just mistaking or assuming all the funky named NPCs get the same treatment. (doors, platforms, objects like the glowing floor to enter Ark Angels fights)

TeoTwawki commented 9 months ago

happens inside updateToEntireZone() also

cocosolos commented 9 months ago

This reminds me of the non printable named NPCs like 16793983. I'm guessing the client uses the name for some kind of data storage since they're not visible names. Is this actually causing any issues other than weird prints? If it's just that then maybe adding c_str where we're expecting a printable output, assuming none of these names are more than 7 chars long.

TeoTwawki commented 9 months ago

This jank is over 12yrs old! https://github.com/LandSandBoat/server/blame/e9a28c3b7f9e1ce0e7e05a5e4111aeecc8aed8dc/src/map/transport.cpp#L121

It predated the history of the dsp repo I think. Blame can't even show a commit hash, which is a 1st.

TeoTwawki commented 9 months ago

This reminds me of the non printable named NPCs like 16793983. I'm guessing the client uses the name for some kind of data storage since they're not visible names. Is this actually causing any issues other than weird prints? If it's just that then maybe adding c_str where we're expecting a printable output, assuming none of these names are more than 7 chars long.

if referring to like the airships where they get a binary value in the name field, we technically don't handle them perfect and its the model on a diff index than normal NPCs get theirs in (diff dat). Since they don't even really have a name, the fields aren't as expected and the quickest fix was to make the client get the data the client was expecting in our outgoing packet by sending the right bytes there. somewhere way downloads we'll want to actually check the index byte and make use of it which which kind of model it is, ignore the name field, and stuff that value in the model where it belongs. I might die of old age 1st - there's just no reason to mess with it anytime soon.

zach2good commented 9 months ago

Presumably, just before we send out the data involving these NPCs, we can make a copy of the name, do this resize and addition of the 8 data, then send it out? Then we don't mess with the underlying name?

Good sleuthing @cocosolos!

On Wed, Dec 20, 2023 at 4:51 AM TeoTwawki @.***> wrote:

This reminds me of the non printable named NPCs like 16793983. I'm guessing the client uses the name for some kind of data storage since they're not visible names. Is this actually causing any issues other than weird prints? If it's just that then maybe adding c_str where we're expecting a printable output, assuming none of these names are more than 7 chars long.

if referring to like the airships where they get a binary value in the name field, we technically don't handle them perfect and its the model on a diff index than normal NPCs get theirs in (diff dat). Since they don't even really have a name, the fields aren't as expected and the quickest fix was to make the client get the data the client was expecting in our outgoing packet by sending the right bytes there. somewhere way downloads we'll want to actually check the index byte and make use of it which which kind of model it is, ignore the name field, and stuff that value in the model where it belongs. I might die of old age 1st - there's just no reason to mess with it anytime soon.

— Reply to this email directly, view it on GitHub https://github.com/LandSandBoat/server/issues/4879#issuecomment-1863841065, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAKTJIM3T6DXNIU5HRYKYCDYKJVD5AVCNFSM6AAAAABA3MALYOVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTQNRTHA2DCMBWGU . You are receiving this because you commented.Message ID: @.***>

zach2good commented 9 months ago

Do we have a way of telling if an entity has been set as an elevator? without asking the zone's elevators struct? In these cases we could look for the presence of "@" in the name, unless there are more cases where this happens? Since this is on NPCs I'm not opposed to a m_isElevator member in CNPCEntity

CatsEyeXI commented 9 months ago

This reminds me of the non printable named NPCs like 16793983. I'm guessing the client uses the name for some kind of data storage since they're not visible names. Is this actually causing any issues other than weird prints? If it's just that then maybe adding c_str where we're expecting a printable output, assuming none of these names are more than 7 chars long.

We ended up here due to repeat crashes, dump is attached further up. Whether or not this was the actual reason for the crash, or a red herring, I cant say for sure.

zach2good commented 9 months ago

Found the exact cause of the crash:

#include <fmt/core.h>
#include <fmt/args.h>
#include <fmt/printf.h>

#include <iostream>
#include <string>
#include <vector>

int main()
{
    // The ShowX loggers are wrappers around fmt::sprintf
    // ShowTrace(fmt::format("CZoneEntities::ZoneServer: NPC: {} ({})", PNpc->getName(), PNpc->id).c_str());

    // PNpc->getName() returns "@3z0%JP)\b"
    //                              ^ -- NOTE THE % CHARACTER

    // This will crash, becasue printf sees the % in the string and tries to look up an argument for it.
    // But there isn't one!

    // Equivalent to this (will crash):
    // fmt::printf("CZoneEntities::ZoneServer: NPC: @3z0%JP)\b");

    // Removing the % stops the crash
    //fmt::printf("CZoneEntities::ZoneServer: NPC: @3z0JP)\b");

    // Usint fmt::format/fmt::print stops the % causing a crash
    fmt::print("CZoneEntities::ZoneServer: NPC: @3z0%JP)\b");

    return 0;
}

The additional string memory isn't initialised to anything during the resize so it could be garbage. If one of those new garbage characters is '%' then it'll cause our old-style logging to crash since it's printf-based. Using fmt::fmt will stop these crashes happening.

If we git rid of this resize on the name member of the NPC the crash will go away.

Swapping out printf-style core logging for fmt-based can happen another time, but at least now we know about this sneaky foot-gun

zach2good commented 9 months ago

A quick one-liner to fix this without having to refactor anything would be zero-out all that new string memory and then copy the original name into it, then there wont be any garbage data in there

CatsEyeXI commented 9 months ago

appreciate you zach! (and everyone)

zach2good commented 9 months ago

Maybe make those changes in here and break out the case for elevators?

entity_update.cpp


    // Set look data
    switch (PEntity->look.size)
    {
        case MODEL_STANDARD:
        case MODEL_UNK_5:
        case MODEL_AUTOMATON:
        {
            ref<uint32>(0x30) = ::ref<uint32>(&PEntity->look, 0);
        }
        break;
        case MODEL_EQUIPPED:
        case MODEL_CHOCOBO:
        {
            this->setSize(0x48);
            std::memcpy(data + 0x30, &PEntity->look, sizeof(look_t));
        }
        break;
        case MODEL_DOOR:
        case MODEL_ELEVATOR:
        case MODEL_SHIP:
        {
            this->setSize(0x48);
            ref<uint16>(0x30) = PEntity->look.size;
            std::memcpy(data + 0x34, PEntity->getName().c_str(), (PEntity->getName().size() > 12 ? 12 : PEntity->getName().size()));
        }
        break;
    }
zach2good commented 9 months ago

Definitely something with @ in the npc name though, changed 17363346 to _3z0 and it works fine

@MowFord Unfortunately, this was just the entropy of the universe not serving you a % in the garbage data that was generated for this NPC. Or it didn't get marked as an elevator, so it turned off the bit of code that generates the garbage 🤷

TeoTwawki commented 9 months ago

Do we have a way of telling if an entity has been set as an elevator?

model ID

0x0300000000000000000000000000000000000000 = elevator platform 0x0200000000000000000000000000000000000000 = doors, including invisible elevator doors. also any moving parts such as winches, pulleys, levers, the drawbridge in bastok... 0x0400000000000000000000000000000000000000 = airships, boats, etc.

Its that single byte in front. the rest gets unused because the client is pulling from bytes we have the name field at instead, which was why we have to name airships the way we do in the database.

(and 0x00 is mob/npc which only uses face/race bytes, and 0x01 is player type models which uses all the bytes)

zach2good commented 9 months ago

I'm quite busy today, I'll test this later:

zach2good commented 9 months ago
diff --git a/src/map/lua/lua_baseentity.cpp b/src/map/lua/lua_baseentity.cpp
index 1e5d10cc59..ee421fd3fc 100644
--- a/src/map/lua/lua_baseentity.cpp
+++ b/src/map/lua/lua_baseentity.cpp
@@ -2088,7 +2088,6 @@ void CLuaBaseEntity::setElevator(uint8 id, uint32 lowerDoor, uint32 upperDoor, u

     elevator.zoneID = m_PBaseEntity->loc.zone->GetID();

-    elevator.Elevator->name.resize(10);
     CTransportHandler::getInstance()->insertElevator(elevator);
 }

diff --git a/src/map/packets/entity_update.cpp b/src/map/packets/entity_update.cpp
index 38256e6183..85812d8809 100644
--- a/src/map/packets/entity_update.cpp
+++ b/src/map/packets/entity_update.cpp
@@ -222,7 +222,6 @@ void CEntityUpdatePacket::updateWith(CBaseEntity* PEntity, ENTITYUPDATE type, ui
         }
         break;
         case MODEL_DOOR:
-        case MODEL_ELEVATOR:
         case MODEL_SHIP:
         {
             this->setSize(0x48);
@@ -230,6 +229,19 @@ void CEntityUpdatePacket::updateWith(CBaseEntity* PEntity, ENTITYUPDATE type, ui
             std::memcpy(data + 0x34, PEntity->getName().c_str(), (PEntity->getName().size() > 12 ? 12 : PEntity->getName().size()));
         }
         break;
+        case MODEL_ELEVATOR:
+        {
+            this->setSize(0x48);
+            ref<uint16>(0x30) = PEntity->look.size;
+
+            // Elevator name is 10 bytes long, with the name copied into the start, with the 8th byte being 0x08
+            std::string str(10, '\0');
+            std::memcpy(str.data(), PEntity->getName().c_str(), PEntity->getName().size());
+            str[8] = 8;
+
+            std::memcpy(data + 0x34, str.c_str(), str.size());
+        }
+        break;
     }

     // Slightly bigger packet to encompass both name and model on first spawn, and only for dynamic entities.
diff --git a/src/map/transport.cpp b/src/map/transport.cpp
index 079b99d16d..f4b4f04ee0 100644
--- a/src/map/transport.cpp
+++ b/src/map/transport.cpp
@@ -453,7 +453,6 @@ void CTransportHandler::insertElevator(Elevator_t elevator)
     // Have permanent elevators wait until their next cycle to begin moving
     uint32 VanaTime            = CVanaTime::getInstance()->getDate();
     elevator.lastTrigger       = VanaTime - (VanaTime % elevator.interval) + elevator.interval;
-    elevator.Elevator->name[8] = 8;

     // Initialize the elevator into the correct state based on
     // its animation value in the database.
zach2good commented 9 months ago

There's also this section which shows that suspicious behaviour:

void CLuaBaseEntity::updateToEntireZone(uint8 statusID, uint8 animation, sol::object const& matchTime)
{
    if (m_PBaseEntity->objtype != TYPE_NPC)
    {
        ShowWarning("Attempting to update NPC to entire zone with invalid entity type (%s).", m_PBaseEntity->getName());
        return;
    }

    auto* PNpc          = static_cast<CNpcEntity*>(m_PBaseEntity);
    bool  updateForTime = (matchTime != sol::lua_nil) ? matchTime.as<bool>() : false;

    PNpc->status    = static_cast<STATUS_TYPE>(statusID);
    PNpc->animation = animation;

    // If this flag is high, update the NPC's name to match the current time
    if (updateForTime == true)
    {
        PNpc->name.resize(10);
        ref<uint32>(&PNpc->name[0], 4) = CVanaTime::getInstance()->getVanaTime();
        PNpc->name[8]                  = 8;
    }

    PNpc->loc.zone->UpdateEntityPacket(PNpc, ENTITY_UPDATE, UPDATE_COMBAT, true);
}