Closed AMDmi3 closed 8 years ago
I can't really reproduce this.
Does it happen if you start the level via the console with "map game/hell"?
I'll check that a bit later.
While here, let me add that I could avoid the crash by removing MA_Free call from idRenderModelStatic::LoadMA. Of course that leads to memory leak, but it may also confirm that there could be a case of use-after-free or double free here. Also note that I use FreeBSD, so it may reveal problems which don't show themselves on Linux.
Hello, after a long debugging session I've traced this down to the Model_MA parser. The long story is:

Like most games, Doom 3 has its own memory management system, built upon the libc malloc(). This system works by allocating 'pages' through malloc(); the pages are organized in a kind of linked list. Each list entry consists of a header with some metadata like the free space left, the previous entry, the next entry and so on. When the game parses .ma files, new memory is allocated through this system.

For reasons not analyzed at this point, the Model_MA parser has a buffer overflow or something like that when parsing models/david/hell_h7.ma. That file is used only in the last level of RoE, maps/hell.map.

On FreeBSD the Model_MA parser overwrites the header of the next entry in the 'pages' list. When the game tries to free that page, the e->page field of the overwritten header contains garbage. It's dereferenced as a pointer, the memory address is invalid and a segfault is triggered. On Linux the problem is only visible with the Clang / LLVM address sanitizer or other tools. Model_MA still writes into unallocated memory there; it's pure luck that the game doesn't crash.
Regards, Yamagi
P.S.: A workaround is to hack the file away by adding something like this to Model_ma.cpp, line 1010:
    // Skip the one model that triggers the heap corruption
    if (strcmp(fileName, "models/david/hell_h7.ma") == 0)
    {
        idLib::common->Printf("Bla!!!!!\n");
        return NULL;
    }
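The header corruption described in this comment can be illustrated with a small, self-contained sketch. It is not dhewm3's heap code: `BlockHeader`, `kMagic` and `NextHeaderIntactAfterWrite` are hypothetical names, and the two "pages" live inside one `std::vector` so that the simulated overflow stays within valid memory. It only shows why a write that runs past one block's payload lands in the header of the next block.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical block header; the real dhewm3 heap uses different structures.
struct BlockHeader {
    std::uint32_t magic;  // sanity marker an allocator could check before freeing
    std::uint32_t size;   // number of payload bytes following this header
};

constexpr std::uint32_t kMagic = 0xD003D003u;
constexpr std::uint32_t kPayload = 16;

// Lays out two blocks back to back (header, payload, header, payload),
// lets a "parser" write bytesWritten bytes into the first payload without
// checking the block size, and reports whether the second header survived.
bool NextHeaderIntactAfterWrite(std::size_t bytesWritten) {
    constexpr std::size_t kBlock = sizeof(BlockHeader) + kPayload;
    std::vector<unsigned char> arena(2 * kBlock, 0);

    const BlockHeader header{kMagic, kPayload};
    std::memcpy(arena.data(), &header, sizeof header);           // first block
    std::memcpy(arena.data() + kBlock, &header, sizeof header);  // second block

    // Keep the simulation itself inside the arena.
    assert(sizeof(BlockHeader) + bytesWritten <= arena.size());

    // The unchecked write: nothing compares bytesWritten against kPayload.
    std::memset(arena.data() + sizeof(BlockHeader), 0xFF, bytesWritten);

    BlockHeader next;
    std::memcpy(&next, arena.data() + kBlock, sizeof next);
    return next.magic == kMagic && next.size == kPayload;
}
```

Writing exactly `kPayload` bytes leaves the neighbouring header untouched, while a single extra byte already damages it. An allocator that later trusts a pointer stored in such a header would dereference garbage, which matches the crash on free described above.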
Thanks a lot for debugging, Yamagi!
Another workaround that seems to work in this case is to make dhewm3 only use malloc()/free(), not its own heap. This can be done by adding #define USE_LIBC_MALLOC 1 at the top of neo/idlib/Heap.cpp. Not sure why that works, probably standard malloc() adds more padding, so nothing critical is overwritten.
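As a rough sketch of how such a compile-time switch can route every allocation to libc, consider the following. Only the `USE_LIBC_MALLOC` macro name comes from this thread; `Sketch_Alloc`, `Sketch_Free`, `CustomHeap_Alloc` and `CustomHeap_Free` are hypothetical stand-ins, and the actual layout of Heap.cpp may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Set to 1 to bypass the custom heap entirely (macro name taken from the thread).
#define USE_LIBC_MALLOC 1

#if !USE_LIBC_MALLOC
// Hypothetical entry points of a custom page-based heap (not defined here).
void* CustomHeap_Alloc(std::size_t size);
void  CustomHeap_Free(void* ptr);
#endif

// Hypothetical wrapper: callers never see which allocator is in use.
void* Sketch_Alloc(std::size_t size) {
#if USE_LIBC_MALLOC
    return std::malloc(size);
#else
    return CustomHeap_Alloc(size);
#endif
}

void Sketch_Free(void* ptr) {
#if USE_LIBC_MALLOC
    std::free(ptr);
#else
    CustomHeap_Free(ptr);
#endif
}
```

With the switch enabled, an out-of-bounds write still happens; it just lands in whatever libc's allocator placed next to the block, which is why this hides the bug rather than fixing it.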
Of course, the bug should be fixed properly (I hope to investigate more thoroughly next week, possibly this weekend with Yamagi's help, if I can't figure it out before). With clang's AddressSanitizer I can at least reproduce it, and I get the crash/ASAN error when the heap corruption happens and not in MediumFree(), when it's too late to find out what caused it. Unfortunately, in -O0 builds all function arguments are reported as <optimized out> in gdb, even more than in optimized builds. I guess I should test a newer clang version, I'm still using 3.6.
So, are you saying it's loading Maya models directly? (I know it can, to convert them into MD5) I don't recall seeing a single Maya model in Doom 3 o.O
Doom 3 is full of surprises :)
I think this should be fixed in the latest git code. Could you test it? :-)
@motorsep yep, RoE has several .ma files, while the main game has none.
Fix confirmed, thank you very much! I was finally able to finish RoE.
This has also fixed another crash when fighting Maledict, which showed up if the Hell loading problem was circumvented by removing MA_Free as mentioned. I assume it was caused by the same heap corruption.
Is 9950a57 enough to fix it? I want to backport it to FreeBSD port of dhewm3 (1.4.0) I maintain.
Also it would be nice to have 1.4.1 sooner :)
yeah, that commit should be all that's needed to fix it (I only now noticed I committed a comment there I didn't want to commit.. whatever.)
I guess I could do a 1.4.1 release candidate in the next days, I don't have plans for any more changes for 1.4.1 anyway. There's #137, but I'm not sure that's easily fixable (and probably the behavior has been the same since doom3 was released).
Oh, and at some point I want to improve mod support, but that can wait for 1.4.2.
Just tested: 9950a57 alone fixes level loading, but the Maledict fight crash is unrelated. It is this assertion:
Assertion failed: (b[1][0] - b[0][0] > a.b[1][0] - a.b[0][0] && b[1][1] - b[0][1] > a.b[1][1] - a.b[0][1] && b[1][2] - b[0][2] > a.b[1][2] - a.b[0][2]), function operator-, file /usr/work/ssd/portstrees/batchports-mem/games/dhewm3/work/dhewm3-1.4.0/neo/idlib/bv/Bounds.h, line 175.
and it was fixed by b03fc92. I'm backporting both commits for now. Thank you again!
The game crashes while loading the last level of RoE (hell).