xemu-project / xemu

Original Xbox Emulator for Windows, macOS, and Linux (Active Development)
https://xemu.app
Other
2.85k stars 283 forks source link

Failed assertion of bound surface size comparison #420

Closed gamrXerus closed 2 years ago

gamrXerus commented 3 years ago

Title

https://xemu.app/titles/4c41001a/#Star-Wars-Battlefront-II https://xemu.app/titles/49470017/#Gun-Valkyrie

Bug Description

Starting a split Screen Instant Action game crashes the emulator with xemu: ../hw/xbox/nv2a/pgraph.c:5598: pgraph_update_surface: Assertion `(pg->color_binding->width == pg->zeta_binding->width) && (pg->color_binding->height == pg->zeta_binding->height)' failed. Aborted (core dumped) Capture Happens on latest version of xemu on both Lubuntu 21.04 and Windows 11 Preview

Expected Behavior

Split Screen matches are supposed to go in game with playable fps.

xemu Version

Version: 0.5.4-39-gedf3735718

System Information

OS: Lubuntu 21.04 and Windows 11 Preview CPU: Intel Core i5 8600K GPU:AMD Radeon RX6800 GPU Driver:Mesa 21.3.0-devel and Radeon Software 21.7.1

Additional Context

No response

MasonT8198 commented 3 years ago

I'll look into seeing why this occurs

Firestriker326 commented 3 years ago

Is this issue also in Windows 10?

dracc commented 3 years ago

Yes, this is not a platform specific issue.

On Thu, 4 Nov 2021 at 09:32, Firestriker326 @.***> wrote:

Is this issue also in Windows 10?

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/mborgerson/xemu/issues/420#issuecomment-960550886, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAJYNE6CYOSVPVDHWIB3C2LUKJHJVANCNFSM5AYQS6DQ . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

OtosNosliw commented 3 years ago

I can try and take a stab at this. Cloning repo now...

UPDATE: I may have bit off more than I can chew (I'm mostly familiar with C#/Python), but I'm willing to soldier on.. Any advice on where I can get started with compiling/building?

MasonT8198 commented 3 years ago

@OtosNosliw https://xemu.app/docs/building-from-source/

OtosNosliw commented 3 years ago

@OtosNosliw https://xemu.app/docs/building-from-source/

Of course! Not sure how I missed that. Thank you.

FWIW, as a(n admittedly sloppy) workaround for those craving to play splitscreen BFii, the assertion can be commented out and the game seems to run split-screen without a hitch.

gamrXerus commented 3 years ago

Thank you @OtosNosliw . Your workaround works in the meantime.

CyberAkumaZero commented 2 years ago

I don't have Battlefront, but I can confirm that this issue is still present for Gun Valkyrie on the latest release as of this post (v0.6.2-21-gd0f241c516 Dec 29, 2021), so that work-around appears to only work for Battlefront. This bug was initially assigned just to Battlefront but later had Gun Valkyrie added as well when I discovered the same bug is also there for Gun Valkyrie on Dec 2 2021.

I can also confirm that for Gun Valkyrie this bug was not present in older versions of Xemu. Gun Valkyrie was confirmed "Playable" back in Jan 11, 2021. I tried downloading that version of Xemu and the bug was not present, the screen goes completely blank with a brown color for about a second but then the game continues as normal, but now the emulator instantly crashes with that same error message posted in the screenshot in the opening post. I then tried several previous versions and found that Gun Valkyrie was working as Playable up until build-202107201737 released on July 20, 2021. The previous build (build-202107190558 released on July 19, 2021) and many before it going back to January 2021 and possibly older work for Gun Valkyrie.

The update notes for July 19 are listed as "nv2a: Support NV062_SET_COLOR_FORMAT_LE_Y32 blitting", that version still works. July 20 simply as the update notes listed as "ui: Allow rendering scale to go up to 10x". I am not sure if this change somehow caused this crash.

I also feel this should change Gun Valkyrie's compatibility listing from "Playable" to "Starts" since the boost is a vital function that is mandatory to play the game and thought to you in the beginning of the tutorial, the game is literally unplayable if you are unable to use boost.

abaire commented 2 years ago

The update notes for July 19 are listed as "nv2a: Support NV062_SET_COLOR_FORMAT_LE_Y32 blitting", that version still works. July 20 simply as the update notes listed as "ui: Allow rendering scale to go up to 10x". I am not sure if this change somehow caused this crash.

To save some time for whoever looks into this: https://github.com/mborgerson/xemu/compare/build-202107190558...build-202107201737 is the commit delta, and the most obviously relevant change is the addition of the assert itself.

For the gun valkyrie case, I see: pg->color_binding: w = 512, h = 384 pg->zeta_binding: w = 640, h = 480

Spottyfriend commented 2 years ago

Still very much an issue. A shame since split-screen is one of the reasons to play Battlefront II (and 1 where this issue also occurs) emulated rather than PC.

WholesomeIsland commented 2 years ago

said workaround mentioned above also works for Halo 2

CyberAkumaZero commented 2 years ago

Is this workaround included in any of the recent builds? I just tried the latest version of Xemu at the time of this comment and the issue with Gun Valkyrie is still there.

I mentioned multiple times in the Discord that Gun Valkyrie's compatibility status needs to be changed from Playable to Unplayable since thanks to this issue which was not present before the game instantly crashes if you use dash, which is like the equivalent of a NES emulator crashing if you make Mario run. Most people seem to just associate this issue with Battlefront 2.

WholesomeIsland commented 2 years ago

no. you need to clone the repo and change a certain line, (which for me was near the line 6000) in hw/xbox/nv2a/pgraph.c to become a comment and then build the line looks something like this: assert((pg->color_binding->width == pg->zeta_binding->width) && (pg->color_binding->height == pg->zeta_binding->height));

for it to work it should look like this: //assert((pg->color_binding->width == pg->zeta_binding->width) // && (pg->color_binding->height == pg->zeta_binding->height));

CyberAkumaZero commented 2 years ago

So I would need to compile a version myself then with those two lines commented out. Haven't had much luck successfully compiling most project before. Can anyone test if this workaround works for Gun Valkyrie too since it seems to work for Halo 2 and Battlefront 2?

Back in the older versions of Xemu before this crash the screen would blank out for a half second turning completely brown but the game still running. Trying it in Cxbx-reloaded causes a transparent spinning distortion around the screen. On a real Xbox there is a very subtle blur/dash effect when it is used. so this seems to be tied to some sort of filter/visual effect the game displays when doing the dash.

WholesomeIsland commented 2 years ago

needing to compile it yourself would be correct and instructions were posted above here they are again: https://xemu.app/docs/building-from-source/ it has commands that can be copy-pasted into a terminal to compile and it worked for me running arch linux

abaire commented 2 years ago

The compatibility report for https://xemu.app/titles/545100ef/#Juiced was just updated to indicate that it also hits this assert

Triticum0 commented 2 years ago

Also affects NCAA Football 07

CyberAkumaZero commented 2 years ago

Can anyone confirm if Juiced or NCAA 07 worked back on build-202107190558 from July 19, 2021? That seems to be when this assertation was last working, as future builds made change that caused this issue. If it was a larger number of games effected than previously thought, this would be a fairly problematic regression that went mostly unnoticed for this long.

WholesomeIsland commented 2 years ago

i can confirm, halo 2 works in splitscreen on the july 19th build the middle window is xemu running arch linux with gnome desktop to recreate issue with halo 2, launch splitscreen. on the july 19th issue, splitscreen works fine other than the visual issues addressed by other bug reports. Screenshot from 2022-04-13 18-44-14

leewickert commented 2 years ago

I still am having the same Assertation error on Halo 2 split screen. instant crash. #759

any advice on how you were able to get this to run? My Xemu is fully updated

WholesomeIsland commented 2 years ago

there are 2 options, building from source, or using xemu 5.4-29 (version from Jul 19 2021). both work, but building from source requires changing 2 lines of code,

a few comments ago i said,

no. you need to clone the repo and change a certain line, (which for me was near the line 6000) in hw/xbox/nv2a/pgraph.c to become a comment and then build the line looks something like this: assert((pg->color_binding->width == pg->zeta_binding->width) && (pg->color_binding->height == pg->zeta_binding->height));

for it to work it should look like this: //assert((pg->color_binding->width == pg->zeta_binding->width) // && (pg->color_binding->height == pg->zeta_binding->height));

these still work

Does anyone event know what this assertion accomplishes? Like what itls the purpose of the assertion being here?

WMXGroup commented 2 years ago

there are 2 options, building from source, or using xemu 5.4-29 (version from Jul 19 2021). both work, but building from source requires changing 2 lines of code,

a few comments ago i said,

no. you need to clone the repo and change a certain line, (which for me was near the line 6000) in hw/xbox/nv2a/pgraph.c to become a comment and then build the line looks something like this: assert((pg->color_binding->width == pg->zeta_binding->width) && (pg->color_binding->height == pg->zeta_binding->height)); for it to work it should look like this: //assert((pg->color_binding->width == pg->zeta_binding->width) // && (pg->color_binding->height == pg->zeta_binding->height));

these still work

Does anyone event know what this assertion accomplishes? Like what itls the purpose of the assertion being here?

Where can you download 5.4-29? Can't seem to find it anywhere

abaire commented 2 years ago

Does anyone event know what this assertion accomplishes? Like what itls the purpose of the assertion being here?

It's there because it's unexpected to have the depth buffer and color buffers be different sizes and the rest of the code was not necessarily built with that case in mind, meaning it may lead to crashes or inexplicable behavior/graphical bugs that would end up wasting hours of debugging just to find something that was known at the time of authorship.

Someone needs to write some tests to determine how the hardware deals with this case and then xemu needs to be updated to match that behavior, after which the assert can be removed.

jonnythedennis commented 2 years ago

Any updates with this? Really looking forward to some battlefront II splitscreen!

abaire commented 2 years ago

Any updates with this? Really looking forward to some battlefront II splitscreen!

When there are updates they will be posted here or the bug will get closed. There was an update 3 days ago saying somebody needs to write tests for this.

Please don't ping everyone who has commented asking for updates, I'm sure everyone is agreed that it'd be really nice to have a fix. You're more than welcome to start writing the necessary tests, otherwise you'll have to be patient until somebody feels inclined to do so.

WholesomeIsland commented 2 years ago

there are 2 options, building from source, or using xemu 5.4-29 (version from Jul 19 2021). both work, but building from source requires changing 2 lines of code, a few comments ago i said,

no. you need to clone the repo and change a certain line, (which for me was near the line 6000) in hw/xbox/nv2a/pgraph.c to become a comment and then build the line looks something like this: assert((pg->color_binding->width == pg->zeta_binding->width) && (pg->color_binding->height == pg->zeta_binding->height)); for it to work it should look like this: //assert((pg->color_binding->width == pg->zeta_binding->width) // && (pg->color_binding->height == pg->zeta_binding->height));

these still work Does anyone event know what this assertion accomplishes? Like what itls the purpose of the assertion being here?

Where can you download 5.4-29? Can't seem to find it anywhere

I cant find 5.4-29 either, building 5.4-29 from source or changing the current release seems to be the only option. can someone also confirm that the new xemu release works with the code workaround?

abaire commented 2 years ago

Thinking about this more, I don't think it's possible for the hardware to get into this state. The pgraph operations that modify surface size apply to both color and zeta at the same time.

From what I can see, Gun Valkyrie sets the surface pitch to 640x480, then sets the clip size to 512x384. It looks like xemu applies the clip to the color buffer but not the zeta buffer (at least leading up to the assertion failure). It should either apply the clip to both, to neither, or be flexible about this check if the zeta size is updated prior to use someplace after the failing assert.

Tracing through a bit, I see that we eventually get to a pgraph_update_surface_part call during handling of a SET_BEGIN_END command where upload is true, color is false, and pg->zeta_binding is 640x480. The pgraph_populate_surface_binding_entry call in that function correctly determines that the surface binding is for a 512x384 surface, but neither the buffer nor the memory region are marked as dirty, so the invalid surface is not evicted.

WholesomeIsland commented 2 years ago

the color_binding is the only thing relating to this problem affected by the pgraph_populate_surface_binding_entry function, if we add maybe like 2 lines of code to also set the zeta_binding width and height, it would probably fix the issue. if the zeta binding and the color binding are supposed to be the same thing, why isnt the zeta binding also set during this function too? I have made a fork to try and fix this issue. it seems like that function is the problem, and if not, i will try and track down the issue.

abaire commented 2 years ago

the color_binding is the only thing relating to this problem affected by the pgraph_populate_surface_binding_entry function

This is incorrect, if you spend a few minutes tracing the bug you'll see that there are several passes setting up both surfaces, it's not immediately clear why the zeta surface isn't evicted though it is likely because it is not considered dirty in the last pgraph_update_surface_part call. The populate surface binding entry operates on an arbitrary SurfaceBinding and has no direct connection to the currently bound color and zeta surfaces, blindly resizing things there won't do anything since it already has no effect on the bound surface. Blindly resizing things in pgraph_update_surface_part is also likely to lead to either incorrect behavior elsewhere or unnecessary performance drops.

I started writing a test case for this and found that there's some nuance beyond simply setting the surface clip region that causes the assertion. I expect that once I'm able to reproduce the issue it will become more obvious as to where xemu is going wrong.

EDIT: Apparently I'm wrong; I was able to reproduce the assert in this WIP test

WholesomeIsland commented 2 years ago

can you pinpoint exactly where in the test it crashes? It might help pinpoint where the issue is.

abaire commented 2 years ago

The final version of the test has it isolated.

xemu results differ substantially from HW when the framebuffer surface is set to 565 (which is what Gun Valkyrie is using), and using 565 (rather than ARGB8) in addition to the surface clip rect is required to generate the assert.

pbkit sets up the framebuffer as ARGB8 and the test (intentionally) does not modify this, causing HW to display 565 content as ARGB leading to color shift and a squashed image. xemu appears to modify the guest color surface => GL mapping such that it's actually displayed as 565 content. I doubt this is a contributor to the problem, and it is at least very unusual to do what I'm doing in the test, so that component of the incorrect behavior may not be worth pursuing.

abaire commented 2 years ago

The issue is that the pgraph_check_surface_compatibility check in pgraph_update_surface_part for the color buffer fails as the buffer format has changed from ARGB to R5G6B5. This triggers eviction of the color surface, but does not touch the zeta surface, which was previously allowed to persist due to the fact that the compatibility check is non-strict wrt. size.

It is straightforward to add a check in the should_create path that marks the zeta surface as buffer_dirty if the sizes misalign but it's not yet clear to me that this is the correct thing to do.

Triticum0 commented 2 years ago

Also affect Jade Empire #694

DeathWrench commented 1 year ago

I'm getting some major visual bugs, SWBF2 in split-screen: image Player 2 has no sky, tracers and effects accumulate over time. Most things do not render until pretty close for Player 2.