DFHack / dfhack

Memory hacking library for Dwarf Fortress and a set of tools that use it

digv can sometimes result in "Inappropriate dig square" cancellation #1229

Closed lethosor closed 6 years ago

lethosor commented 6 years ago

From #481's fix: some tiles are dug without flashing, and then the neighboring tiles produce this announcement. Note that this does not actually remove the designation, and usually the tile will get mined shortly after (maybe it's double-designated somehow?). See also #1228.

AncalagonX commented 6 years ago

I have noticed this in 0.44.10 as well, at various times in different places of my fortress mining designations. I definitely was using digv and setting mining priorities.

As my fortress was in the final stages of FPS death, I was also very likely using fastdwarf 1 1, which causes dwarves to both teleport and complete tasks more quickly. Maybe that is another element of this issue. However, it definitely also occurs without fastdwarf enabled. On a first-year fort, as soon as I used digv on a large patch of olivine that already had some designations on it, I started getting these announcements.

It appears related to setting a digging priority, then using digv with a different priority to overwrite the original dig priority. They might both exist at the same time somehow. This appears to be what is happening in my fortress—it's my best guess right now.

AncalagonX commented 6 years ago

This definitely occurred 15 minutes ago in my fortress at a location where I had priority 5 and priority 7 tiles waiting to be mined, immediately after I used digv (not digl, but see next paragraph) with "priority 4" on a gigantic oval of olivine that overlapped those pre-existing designations. I don't know if priorities are relevant or not--perhaps not?

And on a previous fortress (either 0.44.09 or 0.44.10), I was using digl to mine Marble. I noticed when I manually [d] [x] un-designated a large section of mining that was spamming these announcements, several of the orange-ish "designated to be mined" tiles disappeared completely--they had apparently already been mined! This occurred several times over the course of that fortress' life. I seem to recall I may have performed multiple digl with differing priority levels on that, and I likely also added in some manual [d] rectangle designations on top of the digl auto-designations to grab the un-designated gems embedded within the marble here and there. It was a week or two ago, so I don't remember the details of that incident perfectly.

Hope this is helpful.

lethosor commented 6 years ago

Looking at the map block, there's only one block_square_event_designation_priorityst in block_events, so priorities aren't showing up twice. However, when the issue occurs, some values in there have changed - e.g. the problematic square has a 0 priority (instead of 1000-7000 as set), and some others have 4000 for some reason, possibly tiles I undesignated earlier. I'm really not sure what's going on here. @BenLubar, is there a chance this has something to do with new designation jobs (e.g. herbalism that you dealt with in df-ai)?

Edit: the priorities didn't change - I was reading them incorrectly (block_events[i].priority[x] is a column, not a row).
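
To illustrate the indexing mix-up described in the edit above, here is a minimal sketch. It assumes the priority data for one map block is a 16x16 grid indexed as `priority[x][y]`; the `PriorityGrid` type and the helper functions are hypothetical stand-ins, not DFHack's actual structures.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the per-block priority grid (assumption:
// 16x16 tiles, first index is x, second index is y).
struct PriorityGrid {
    std::array<std::array<std::int32_t, 16>, 16> priority{};  // zero-initialized
};

// Priority of the tile at block-local coordinates (x, y).
std::int32_t priority_at(const PriorityGrid& grid, int x, int y) {
    assert(x >= 0 && x < 16 && y >= 0 && y < 16);
    return grid.priority[x][y];
}

// priority[x] holds every y for a fixed x, i.e. one COLUMN of the block.
std::array<std::int32_t, 16> column(const PriorityGrid& grid, int x) {
    assert(x >= 0 && x < 16);
    return grid.priority[x];
}

// A ROW (fixed y) has to be gathered across all 16 columns.
std::array<std::int32_t, 16> row(const PriorityGrid& grid, int y) {
    assert(y >= 0 && y < 16);
    std::array<std::int32_t, 16> out{};
    for (int x = 0; x < 16; ++x) {
        out[x] = grid.priority[x][y];
    }
    return out;
}
```

Reading `priority[x]` as if it were a row swaps the two coordinates, which would make a priority set on one tile appear to belong to a different tile.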

BenLubar commented 6 years ago

Here's what the AI does: https://github.com/BenLubar/df-ai/blob/c82041aeafcfd52b5089ae8b79d1d7e0985a439d/plan.cpp#L1894-L1922

I'm not sure if it causes cancellation spam, since I usually run the AI with cancellation messages disabled.

The herbalism is done via gathering zones.

suokko commented 6 years ago

I looked into this bug yesterday but forgot to write up my findings before going to sleep. I haven't figured out how to fix this bug, but I think I have a fairly good idea why it happens.

I used the gdb watch -l command to set up memory write breakpoints to help me isolate what the df code does when designating mining.

map_block->designation.dig is only used for a short period of time. The value is reset back to zero when the mining job is generated. I don't yet know where the value is copied because the machine code in that area is horrible. There are about 20 useless instructions for every useful one, which makes the machine code harder to read than well-optimized code. I also noticed that df writes the dig value as a single memory write, but erasing the designation happens bit by bit.

When df code sets a new designation, it goes through the following steps, at least as far as I know so far:

  1. Set map_block->flags.designated to 1
  2. Reset the existing value of map_block->designation.dig to zero
  3. Set the new value in map_block->designation.dig
  4. Reset map_block->occupancy.dig_marked and .dig_auto to zero (they appear never to be set, but they are still unconditionally cleared)
  5. Set the priority in the event structure (the priority is never reset after the job, but if there is a new designation the old value is overwritten)
  6. Unknown check for an existing job -> if no job is queued yet, then processing seems to stop
  7. Unknown process that copies map_block->designation to an unknown data structure which holds the actual mining designations. This process results in map_block->designation.dig being reset to zero.
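
The observed sequence can be sketched as a toy model. This is only an illustration of the steps listed above, not df's actual code: `ToyTile`, `ToyJob`, `ToyWorld`, `designate` and `advance_tick` are all hypothetical names, the job check in step 6 is left out because its behavior is unknown, and the "copy to a job structure" in step 7 is an assumption about where the value goes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical per-tile view of the fields mentioned in the steps above.
struct ToyTile {
    bool designated = false;    // stands in for map_block->flags.designated
    std::uint8_t dig = 0;       // stands in for map_block->designation.dig
    bool dig_marked = false;    // stands in for map_block->occupancy.dig_marked
    bool dig_auto = false;      // stands in for map_block->occupancy.dig_auto
    std::int32_t priority = 0;  // stands in for the priority event entry
};

// Hypothetical record for the "unknown data structure" of step 7.
struct ToyJob {
    std::size_t tile_index;
    std::uint8_t dig_type;
};

struct ToyWorld {
    std::vector<ToyTile> tiles;
    std::vector<ToyJob> jobs;
};

// Steps 1-5: what was observed when a designation is set from the UI.
void designate(ToyWorld& world, std::size_t index, std::uint8_t dig,
               std::int32_t priority) {
    ToyTile& tile = world.tiles.at(index);
    tile.designated = true;    // step 1
    tile.dig = 0;              // step 2: clear the old value
    tile.dig = dig;            // step 3: write the new value
    tile.dig_marked = false;   // step 4: unconditionally cleared
    tile.dig_auto = false;
    tile.priority = priority;  // step 5: overwrites any old priority
}

// Step 7 (assumed): once time advances, the designation moves into a job
// record and designation.dig is reset to zero. The priority is left alone.
void advance_tick(ToyWorld& world) {
    for (std::size_t i = 0; i < world.tiles.size(); ++i) {
        ToyTile& tile = world.tiles[i];
        if (tile.dig != 0) {
            world.jobs.push_back(ToyJob{i, tile.dig});
            tile.dig = 0;
        }
    }
}
```

In this toy model, a tile with a queued job shows `dig == 0` after one tick, so checking `dig` alone cannot distinguish it from a tile that was never designated.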

I'm not exactly sure about the order of 5 and 6 because I haven't seen them after the same UI action. The limit of only 4 hardware memory breakpoints heavily restricts the ability to watch what happens around this area. I guess I should at last learn how to use the valgrind gdb integration to set memory range breakpoints (valgrind provides unlimited memory breakpoints). But before I try valgrind, I plan to set a breakpoint at the beginning of the process and try to single-step it. That at least helps me figure out which branches can be ignored.

Step 7 seems to trigger only after df time has been advanced. When I did the testing with a single designation at a time, it was enough to advance one time step with the dot key. At that point no miner has yet taken the mining job, but I assume there is some job structure already created to notify miners that they can mine the tile.

Rendering code appears to use both map_block->designation.dig and an unknown shadow data structure to figure out where to paint designations. The rendering/logic code appears to read map_block->designation.dig from multiple different code locations per frame, which makes read/write memory breakpoints hard to use effectively.

lethosor commented 6 years ago

Is it possible that the thing in (7) is world.jobs.job_list or world.jobs.job_postings? The Designations module handles tree designations by searching job_list.
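
As a rough sketch of what searching a job list for a tile could look like: the code below walks a singly linked list and returns the first job whose position matches. The `Coord`, `ToyJob` and `ToyJobLink` types and the `find_job_at` function are hypothetical stand-ins chosen for illustration; the layout (a head link with no item, followed by links that each carry one job) is an assumption, not a copy of the Designations module.

```cpp
#include <cassert>

// Hypothetical tile coordinate.
struct Coord {
    int x;
    int y;
    int z;
};

bool operator==(const Coord& a, const Coord& b) {
    return a.x == b.x && a.y == b.y && a.z == b.z;
}

// Hypothetical job record: only the fields this sketch needs.
struct ToyJob {
    Coord pos;
    int job_type;
};

// Hypothetical linked-list node (assumption: the head link has no item).
struct ToyJobLink {
    ToyJob* item;
    ToyJobLink* next;
};

// Returns the first job located at `pos`, or nullptr if none is queued.
const ToyJob* find_job_at(const ToyJobLink* head, Coord pos) {
    for (const ToyJobLink* link = head; link != nullptr; link = link->next) {
        if (link->item != nullptr && link->item->pos == pos) {
            return link->item;
        }
    }
    return nullptr;
}
```

A lookup like this is linear in the number of queued jobs, so a tool that checks many tiles would probably want to scan the list once and index the results by position rather than search per tile.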

dig_auto and dig_marked are probably the a and m options in the designation menu, respectively. Changing those should change the flags set from 0 to something else.

suokko commented 6 years ago

I suspect it probably is world.jobs. It feels a bit weird that the rendering code has to look in there instead of just using map_block for all rendering decisions.

I will continue tomorrow trying to figure out where the designation is copied.

lethosor commented 6 years ago

Fixed in #1313

AncalagonX commented 6 years ago

@suokko and @lethosor: You guys are wizards. Countless people benefit from the hard work you put into DFHack. Thanks for all you do.