DFHack / dfhack

Memory hacking library for Dwarf Fortress and a set of tools that use it
Other
1.86k stars 471 forks source link

Quickfort created zones not working and can cause DF to crash. #1758

Closed cjhammel closed 3 years ago

cjhammel commented 3 years ago

I have been doing a DreamFort run through seeing how the features work in a play mode. I have found that zones created are not working. The animal pen on the surface the dwarfs will not move the dogs to the pen on the stairs nor the zone for the grazing animals. In fact these zones were causing the game to crash very quickly within a few game days after I added the animals. When I deleted the zone and added them via manually the crashing went away and animals were properly penned.

Dwarfs would not add the crutches, splints, bandages, thread to hospital zone on the services level. I deleted the zone add it manually then everything started getting added.

myk002 commented 3 years ago

There is one strange thing DFHack does when it creates zones that might be relevant here. Buildings.cpp's linkRooms() makes all zones parents of themselves, leading to an infinite loop. It does not cause crashes for me, but it may be a platform thing. The fix is one line - check if bld == room, but I haven't investigated whether the current behavior is correct for non-zones yet.

myk002 commented 3 years ago

ok, fairly good evidence that this is the problem. ldog posted a savegame that has a reproducible crash when the merchants unload. stack trace shows overflow due to infinite recursion.

removing all zones avoids the crash.

let me investigate how linkRooms() is supposed to work (extent-based buildings like farm plots? overlapping room definitions?) and I'll see if I can come up with a proper fix.

myk002 commented 3 years ago

Ok, I think I see what's going on here, and why this is only a problem with zones.

linkRooms() is intended to link objects that can be part of a room -- like furniture -- to the building that the room is defined from. It only links parents to children if is_room is set on the parent and the child falls within the parent's extent.

linkRooms() is only called from linkBuilding(), which is only called when a building is being constructed by constructAbstract(), constructWithItems(), or constructWithFilters(). In order for is_room to be set when linkRooms() is called it must be manually set by the caller between the call to allocInstance() and the call to one of the construct*() functions.

Quickfort is the only caller to do this, and it only does this for zones (which must have is_room=true). If any caller were to try to construct a building with a room already defined from it, it would run into the same issue.

I think the fix I proposed in my first comment is correct. We just need to avoid adding ourselves to our own child list to allow this function to work as intended.

lethosor commented 3 years ago

I came up with a rudimentary script to repair the save linked above:

for i, bld in ipairs(df.building.get_vector()) do
    for j = #bld.children - 1, 0, -1 do
        if bld.children[j] == bld then
            print('removing self-child', i, j, bld)
            bld.children:erase(j)
        end
    end
    for j = #bld.parents - 1, 0, -1 do
        if bld.parents[j] == bld then
            print('removing self-parent', i, j, bld)
            bld.parents:erase(j)
        end
    end
end

Unsure if this warrants a fix/ or devel/ script - depends on how widespread the issue is. (Heck, we still have a C++ fix-job-postings command to deal with #741... so there's a precedent for it, I guess)