thobbsinteractive / magic-carpet-2-hd

Recode Binary code of game Magic Carpet2 to C/C++ language(remake MC2 for any platform)
GNU General Public License v3.0
29 stars 5 forks source link

In HD (1920 x 1080) Game often crashes, over running a protected memory location drawing rotated sprites #74

Closed thobbsinteractive closed 1 month ago

thobbsinteractive commented 2 years ago

I have seen this a few times. Setting the Window and Game Res both to HD, I have found that somewhere there is a memory leak/overrun.

thobbsinteractive commented 2 years ago

So the error is occurring in DrawSprite_41BD3(...)

Exception thrown: read access violation.
v106 was 0x1C15D2FD.

Steps:

  1. In the config.ini using the following settings:
    windowResWidth = 1920 ; Window resolution, cannot be lower than gameRes
    windowResHeight = 1080 ;
    gameResWidth = 1920 ; In Game resolution. If using software render keep this low
    gameResHeight = 1080 ;
  2. Start the game, start first level, fly towards the objective
GrimSqueaker commented 2 years ago

I have noticed a glitch with the Full-HD settings before. In my case it did not crash but when flying down from the city to sealevel in level 1 the terrain was filling the whole screen for a moment. I will check if I can get valgrind running again which can detect code that does access array out-of-bound and much more.

thobbsinteractive commented 2 years ago

I have logged the Terrain thing already, it is the reflection geometry going a bit nuts. The crash does happen, sadly it varies on Windows. I just played the whole of level 1 and part of level 2 before it happened.

I think something else is leaking perhaps overwriting x_BYTE_F6EE0_tablesx

thobbsinteractive commented 2 years ago

So I just checked the contents of x_BYTE_F6EE0_tablesx before and after the crash, it is the same. so the issue is not corruption of x_BYTE_F6EE0_tablesx contents. It might be some kind of indexing issue.

turican0 commented 2 years ago

I think the problem is already in the original game. Some objects are drawn off screen, out of framebuffer. The way the original game solved this was that there was nothing around the screen memory that couldn't be overwritten, just unused memory. I solved this by allocating more memory before and after framebuffer.

thobbsinteractive commented 2 years ago

Well done!

On Tue, 21 Jun 2022, 08:34 turican0, @.***> wrote:

I think the problem is already in the original game. Some objects are drawn off screen, out of framebuffer. The way the original game solved this was that there was nothing around the screen memory that couldn't be overwritten, just unused memory. I solved this by allocating more memory before and after framebuffer.

— Reply to this email directly, view it on GitHub https://github.com/thobbsinteractive/magic-carpet-2-hd/issues/74#issuecomment-1161322265, or unsubscribe https://github.com/notifications/unsubscribe-auth/AETQQFANYNLEJ2WDATQWY6TVQFPA7ANCNFSM5ZJVGQLQ . You are receiving this because you authored the thread.Message ID: @.***>

thobbsinteractive commented 2 years ago

Should we try and fix the original issue? Rather than relying on empty memory?

On Tue, 21 Jun 2022, 09:11 Timothy Hobbs, @.***> wrote:

Well done!

On Tue, 21 Jun 2022, 08:34 turican0, @.***> wrote:

I think the problem is already in the original game. Some objects are drawn off screen, out of framebuffer. The way the original game solved this was that there was nothing around the screen memory that couldn't be overwritten, just unused memory. I solved this by allocating more memory before and after framebuffer.

— Reply to this email directly, view it on GitHub https://github.com/thobbsinteractive/magic-carpet-2-hd/issues/74#issuecomment-1161322265, or unsubscribe https://github.com/notifications/unsubscribe-auth/AETQQFANYNLEJ2WDATQWY6TVQFPA7ANCNFSM5ZJVGQLQ . You are receiving this because you authored the thread.Message ID: @.***>

turican0 commented 2 years ago

Maybe it wasn't entirely a bug, but a feature :) If an object only encroaches part of the render box, it renders partly into the framebuffer and partly out. They didn't cut it for speed, but solved it with free memory around the framebuffer. If they use a more modern rendering method that ignores the points outside the framebuffer it might help. But watch out for reflections and shadows, now I don't know if points outside the framebuffer are needed for them.

thobbsinteractive commented 2 years ago

Noted, I'll see if I can allocate a similar amount of space scaled up to the new max resolution tonight.

I will bet rotation of sprites is the issue, it is probably much harder to cull them off in those cases.

I am still amazed how fast you diagnosed the issue.

On Tue, 21 Jun 2022, 09:39 turican0, @.***> wrote:

Maybe it wasn't entirely a bug, but a feature :) If an object only encroaches part of the render box, it renders partly into the framebuffer and partly out. They didn't cut it for speed, but solved it with free memory around the framebuffer. If they use a more modern rendering method that ignores the points outside the framebuffer it might help. But watch out for reflections and shadows, now I don't know if points outside the framebuffer are needed for them.

— Reply to this email directly, view it on GitHub https://github.com/thobbsinteractive/magic-carpet-2-hd/issues/74#issuecomment-1161375851, or unsubscribe https://github.com/notifications/unsubscribe-auth/AETQQFFDAR45O65OXMCO4E3VQFWURANCNFSM5ZJVGQLQ . You are receiving this because you authored the thread.Message ID: @.***>

thobbsinteractive commented 2 years ago

Okay so as a half-ass measure I tried the following code in GameRender.h

    type_unk_F0E20x m_str_F0E20x[1920 + 200]; // Originally 640
    type_unk_F0E20x* m_ptr_str_F0E20x = m_str_F0E20x + 100;

and changed all references in GameRender.cpp to m_str_F0E20x to use m_ptr_str_F0E20x. Unfortunately even with a buffer of 100 items either side of m_str_F0E20x[] the error is still occurring. It seems to happen most at the end of the level during the blur effect to the Portal.

I will look into it a bit deeper tomorrow, but I am open to more suggestions.

thobbsinteractive commented 2 years ago

So I have just realized that m_str_F0E20x is not a frame buffer, pdwScreenBuffer is (it is passed into the GameRender.cpp). I will try looking at that

thobbsinteractive commented 2 years ago

So the simplest solution for this just occurred to me. We add an exception handler. When the problem occurs, you can skip over the code and the game continues fine. Obviously this is not the ideal fix, adding more padding is, but it might work well and the worse case (assuming the try catch does not slow down the code too much) is that the sprite is not drawn for a frame.

thobbsinteractive commented 2 years ago

So since this is an OS generated error, a try catch will not work, likely because such errors are out of scope of the application. As the Germans say: What Learned! So we are back to either refactoring the code or trying the buffer padding fix.

turican0 commented 2 years ago

This probably won't work because when the image is rendered into memory with other data, the data will already be corrupted, which skipping won't save. You'd have to back up a chunk of memory before each frame and restore it again if it failed, but that's a bitch :) A better solution is to watch the coordinate of each rendered line and skip it if it's out of the image.

thobbsinteractive commented 2 years ago

Well, the manual skip in the IDE did actually work. I was able to play the rest of the level to completion. But there just is not a way to catch an OS exception in the code.

thobbsinteractive commented 2 years ago

Okay, so on this branch I have checked-in a solution where I increased the padding around the screen buffer by about 3mb either side. The old code was only had padding before. The good news is I was able to play a whole level. The bad news is it still crashed at the exit portal (probably because a big sprite is at/in the camera). So it looks like the best way forward is to go with the most difficult plan of cleaning the sprite render code and ensuring that it does not go out of bounds.

turican0 commented 2 years ago

But if you think you can finish an OpenGL render, isn't it better to leave it alone and pursue GL rendering?

thobbsinteractive commented 2 years ago

Buddy, I need to start Open GL! It was meant to happen this week but I got caught up doing other things on the project. I have a few days after my holiday to work on it.

thobbsinteractive commented 1 year ago

These are the lines that allocate the screen buffer and the area around it. pre_pdwScreenBuffer_351628 = (uint8_t)malloc(16588800); // (1920x1080 4) 2 pdwScreenBuffer_351628 = &pre_pdwScreenBuffer_351628[8294400]; // 1920x1080 4

Changes these to a bigger size might solve the issue but use up more ram. Preferably the we would stop the sprite code from writing outside the ScreenBuffer

thobbsinteractive commented 11 months ago

Even increasing pre_pdwScreenBuffer_351628 to nine times the size of the pdwScreenBuffer_351628 and placing pdwScreenBuffer_351628 in the middle of the allocation, Large creatures like the Wyvern still cause this crash. There must be another buffer out there that the sprites are using. So, no quick fix.

thobbsinteractive commented 2 months ago

So I have been looking into this again recently. It seems to be accessing memory ranges outside the Sprite memory ranges that it is drawing. str_F2C20ar.dword0x02_data So I put in a check to ensure that with each loop we do not exceed the memory range e.g.

if (v51x < &str_F2C20ar.dword0x02_data[0] || v51x > &str_F2C20ar.dword0x02_data[str_F2C20ar.dword0x08_width * (str_F2C20ar.dword0x0a_actIdx >> 16)]) break; But then the sprites did not draw at all. The app also still crashed just as much. I also tested the aspect ratio to see if that was an issue (unlikely as the map screen renders at a different ratio), but it made no difference.

At this point I need to understand better how sprite rendering is done in general. Larger resolutions mean bigger sprites near the camera, perhaps we need more memory allocated around the sprite. Perhaps scaling is causing unallocated areas to be accessed.

thobbsinteractive commented 2 months ago

So some things I have managed to eliminate as a cause.

It is not the aspect ratio, the error happens no matter what screen ratio is used.

It is not the amount of memory around the screen buffer. I have captured this data and checked it by writing it to a bmp when the crash happens. Very little sprite of it is written outside the buffer. Only the odd line.

So I think what we have is an issue of type size. Given I have code that checks ranges now for every write to the buffer, which runs well until a completely random part of the game now crashes (usually SDL or music play back) I think we may have a uint8, int8 pointer basically going out of range and back to Min/Max value.

To test this problem you can change the game res to 1920x1080 or higher, fly up to a group or trees and circle at a step angle as close as possible.

thobbsinteractive commented 1 month ago

This might be worse than I thought. Looking at a recent crash, I manually checked what should have been assigned to v51x. The code assigning the pointers value was correct, somehow another value was written over address the pointer was pointing too.

This likely means there is a memory leak elsewhere corrupting the data.

This also lines up with the testing of code I wrote to range the memory locations we were writing to/reading from. When all the bad pointers were skipped we then got crashes in SDL, Music and other random places.

So this is going to be hell to track down. With the change of resolution, somewhere something is writing over memory. I do know disabling the sprite drawing code completely does "seem" to fix the issue, but all this could mean is that we are not access the corrupt locations. Not that the issue is created in the sprite drawing code

thobbsinteractive commented 1 month ago

I do believe I fixed the sprite memory leak. It used a buffer of a set size (307k I think (640 * 480), and the game was reusing this allocated memory with hardcoded Offsets for the sprites (12%, 14% and 20% of 307k). When I increased resolution, those offsets were being over written by the bigger scaled sprite data causing leak as memory addresses in the data was corrupted. I simply changed the buffer to 8mb and made the offset accordingly bigger and its fixed. No need for range checks or anything.

thobbsinteractive commented 1 month ago

Testing in 2k has shown more bugs. On the exit of the level1, the blur effect instantly crashes drawing the sky. One time opening the spell menu crash in its drawing Some times getting killed and respawning causes an issue with "Already dereferenced" memory

thobbsinteractive commented 1 month ago

So the crash after death happens in 960x540. So it is not new. But I could not reproduce it in 640x480 so it is related to resolution. It happens in sub_71F20 in TextureMaps.cpp. Address Sanitizer Error: Use of deallocated memory. Turning off Address Sanitization might skip the error, but really we should fix it

thobbsinteractive commented 1 month ago

I think I have fixed the blur problem by using a new buffer (the existing code re-uses another buffer of a set size though out the code)