Closed Infernio closed 2 years ago
So, further analysis revealed the following:
I recreated the single edit that the minimized crash-inducing bashed patch makes, using TES4Edit. There was a single byte length difference, stemming from the NIFZ entry. The TES4Edit version had an extra null byte there.
Looking at MelStrings.packSub(), the function documentation says that it will add null terminators to every string in the array, but the actual program logic uses null1.join()
, which means null terminators are only inserted in between elements, but not at the end.
That's exactly the one null byte difference between the bash/tes4edit variants.
Since a second null byte (from some other source) follows, this happens to work in most cases, but it probably leads to an out-of-bounds read at some unpredictable moment later on. I presume the specific circumstances for the crash listed above just happen to create a setting where that results in a crash.
That should be handled by https://github.com/wrye-bash/wrye-bash/blob/0277bf2f5293a48304384c2eaa5316849dd60c4f/Mopy/bash/brec/basic_elements.py#L529-L530
That should call Subrecord.packSub
, which then calls MelString._dump_bytes
, which adds the final null terminator. It's possible that broke during refactoring though (@Utumno?)
I see, but TES4Edit places two null bytes at the end of that list. Maybe oblivion needs that?
That's possible. Not sure if that applies to all MelStrings
or just to this one (e.g. KFFZ
, the animations subrecord).
I had a second similar crash caused by the "Tweak Names" collection (several of them). That went away with this fix, too.
Can you try this build: https://github.com/wrye-bash/wrye-bash/suites/5372587249/artifacts/168603737 and report back if that fixes it?
Yes, this build works for the two test cases I have, the initial CTD as described above, and a second setting in a heavily modded game. It also fixes my "Tweak Names" problem.
Out of curiosity, I tried using vrsion 304.4. That version does not have the bug, so it is a regression.
Still a C-bug
, C-regression
means it broke between the last release (309.1) and dev.
8004d365abd69e24df7e8d2b917642cdfc80390b might be the culprit. Before that, the code used null1.join(...) + null1
wrapped in a packSub0
call that appended another null.
Thanks for detailed report @jwalt and Inf for TTT fix :P - as noted in that commit there is some work in 480-records-refactoring-pt3 on record string elements handling, bit it's currently blocked by #543 - there are a few open questions, so nice this is investigated now :)
For anyone curious why, they're most likely using a Windows API for string list loading/writing, something like this which shows up all over the place in API calls. I know we (you guys) fixed it, and it was right before: just another example of being careful with data structures involving null-terminated strings.
Report by Vrugdush (originally here on Discord):
Report by jonka (originally here on Discord and here on Discord):
Steps for reproducing: