DFHack / dfhack

Memory hacking library for Dwarf Fortress and a set of tools that use it

dwarfvet: Occasional crash upon discharging patient from hospital zone #1711

Closed Nilsolm closed 3 years ago

Nilsolm commented 3 years ago

I noticed that dwarfvet is somewhat crash-prone at the moment, so I've been trying to track down the possible causes. I found one that sometimes leads to a crash when an animal is discharged from the hospital after being treated:

(gdb) bt
#0  0x00007fffd37f6d4a in std::_Bit_reference::operator=(bool) (__x=false, this=<optimized out>)
    at /usr/include/c++/9/bits/stl_bvector.h:237
#1  AnimalHospital::dischargePatient(Patient*, DFHack::color_ostream&)
    (this=this@entry=0x7fffce2f7da0, patient=<optimized out>, out=...) at ../plugins/dwarfvet.cpp:350
#2  0x00007fffd37f7636 in AnimalHospital::processPatients(DFHack::color_ostream&) (this=0x7fffce2f7da0, out=...)
    at ../plugins/dwarfvet.cpp:388
#3  0x00007fffd37f84bb in tickHandler(DFHack::color_ostream&, void*) (out=..., data=data@entry=0x76c)
    at ../plugins/dwarfvet.cpp:752
#4  0x00007ffff7bcd702 in manageTickEvent(DFHack::color_ostream&) (out=...) at ../library/modules/EventManager.cpp:351
#5  0x00007ffff7bcac11 in DFHack::EventManager::manageEvents(DFHack::color_ostream&) (out=...)
    at ../library/modules/EventManager.cpp:336
#6  0x00007ffff79370a8 in DFHack::Core::onUpdate(DFHack::color_ostream&) (this=this@entry=
    0x7ffff7f5aae0 <DFHack::Core::getInstance()::instance>, out=...) at ../library/Core.cpp:2118
#7  0x00007ffff794646d in DFHack::Core::doUpdate(DFHack::color_ostream&, bool)
    (this=0x7ffff7f5aae0 <DFHack::Core::getInstance()::instance>, out=..., first_update=<optimized out>)
    at ../library/Core.cpp:2069
#8  0x00007ffff79497ac in DFHack::Core::Update() (this=0x7ffff7f5aae0 <DFHack::Core::getInstance()::instance>)
    at ../library/Core.cpp:2100
#9  0x00007ffff6e0c202 in enablerst::async_loop() () at /home/david/Games/dev/df_linux/libs/libgraphics.so
#10 0x00007ffff6e0c4e0 in call_loop(void*) () at /home/david/Games/dev/df_linux/libs/libgraphics.so
#11 0x00007ffff741cf3c in  () at /lib/x86_64-linux-gnu/libSDL-1.2.so.0
#12 0x00007ffff745cbaf in  () at /lib/x86_64-linux-gnu/libSDL-1.2.so.0
#13 0x00007ffff65ef609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#14 0x00007ffff67ab293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

DFHack version: 0.47.04-r3 (development build 0.47.04-r3-151-gddbd22fc) on x86_64

lethosor commented 3 years ago

Thanks - is it reproducible?

A change I often make to get better debugging symbols for individual plugins is:

--- a/plugins/CMakeLists.txt
+++ b/plugins/CMakeLists.txt
@@ -113,7 +113,7 @@ if(BUILD_SUPPORTED)
     dfhack_plugin(dig dig.cpp)
     dfhack_plugin(digFlood digFlood.cpp)
     add_subdirectory(diggingInvaders)
-    dfhack_plugin(dwarfvet dwarfvet.cpp)
+    dfhack_plugin(dwarfvet dwarfvet.cpp COMPILE_FLAGS_GCC "-O0 -g")
     dfhack_plugin(dwarfmonitor dwarfmonitor.cpp LINK_LIBRARIES lua)
     add_subdirectory(embark-assistant)
     dfhack_plugin(embark-tools embark-tools.cpp)

Nilsolm commented 3 years ago

I couldn't find a way to reliably reproduce it, unfortunately. It doesn't happen every time dischargePatient is called.

I'll dig a bit deeper and see if I can find out more.

quietust commented 3 years ago

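Taking a brief look at dwarfvet, I noticed that the Patient class's spot_index field is never initialized: the value is passed to the constructor, but it's never actually stored in the member. This is almost certainly what's responsible for the crash, which happened at the line spots_in_use[(*accepted_patient)->getSpotIndex()] = false;. With an indeterminate index, that write can land outside the bounds of spots_in_use, which would also explain why it doesn't crash on every discharge.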

The following patch might work, though I don't have the means to test it right now:

diff --git a/plugins/dwarfvet.cpp b/plugins/dwarfvet.cpp
index 6826f38..604a229 100644
--- a/plugins/dwarfvet.cpp
+++ b/plugins/dwarfvet.cpp
@@ -72,21 +72,22 @@ struct hospital_spot {
 class Patient {
   public:
     // Constructor/Deconstrctor
-    Patient(int32_t id, int spot_index, int32_t x, int32_t y, int32_t z);
+    Patient(int32_t id, size_t spot_index, int32_t x, int32_t y, int32_t z);
     int32_t getID() { return this->id; };
-    int32_t getSpotIndex() { return this->spot_index; };
+    size_t getSpotIndex() { return this->spot_index; };
     int32_t returnX() { return this->spot_in_hospital.x; };
     int32_t returnY() { return this->spot_in_hospital.y; };
     int32_t returnZ() { return this->spot_in_hospital.z; };

   private:
     struct hospital_spot spot_in_hospital;
-    int id;
-    int spot_index;
+    int32_t id;
+    size_t spot_index;
 };

-Patient::Patient(int32_t id, int32_t spot_index, int32_t x, int32_t y, int32_t z){
+Patient::Patient(int32_t id, size_t spot_index, int32_t x, int32_t y, int32_t z){
     this->id = id;
+    this->spot_index = spot_index;
     this->spot_in_hospital.x = x;
     this->spot_in_hospital.y = y;
     this->spot_in_hospital.z = z;

(I also cleaned up a few types that were either mismatched or inappropriate)
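
For anyone who wants to see the failure mode in isolation, here is a minimal sketch of the fixed pattern. It assumes a stripped-down Patient (no hospital_spot coordinates) and a hypothetical releaseSpot helper; neither is the plugin's actual code. It uses a member initializer list so spot_index can't be left unset, and vector::at() instead of operator[] so a bad index throws std::out_of_range rather than silently writing out of bounds.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Simplified stand-in for dwarfvet's Patient (assumption: coordinates omitted).
class Patient {
  public:
    // The member initializer list stores every constructor argument,
    // so spot_index can no longer be left indeterminate.
    Patient(int32_t id, std::size_t spot_index)
        : id(id), spot_index(spot_index) {}

    int32_t getID() const { return id; }
    std::size_t getSpotIndex() const { return spot_index; }

  private:
    int32_t id;
    std::size_t spot_index;
};

// Hypothetical helper, not part of dwarfvet: marks the patient's spot as free.
// at() checks bounds and throws std::out_of_range on a bad index,
// where operator[] would be undefined behavior.
inline void releaseSpot(std::vector<bool>& spots_in_use, const Patient& patient) {
    spots_in_use.at(patient.getSpotIndex()) = false;
}

// Hypothetical demo: occupy spot 2, discharge the patient, report whether
// the spot ended up free.
inline bool demoDischarge() {
    std::vector<bool> spots_in_use(4, false);
    const Patient patient(42, 2);
    spots_in_use.at(patient.getSpotIndex()) = true;
    assert(spots_in_use[2]);
    releaseSpot(spots_in_use, patient);
    return !spots_in_use[2];
}
```

The bounds check is a debugging aid rather than a fix in itself; the actual fix is storing spot_index in the constructor, as in the patch above.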