hrydgard / ppsspp

A PSP emulator for Android, Windows, Mac and Linux, written in C++. Want to contribute? Join us on Discord at https://discord.gg/5NJB6dD or just send pull requests / issues. For discussion use the forums at forums.ppsspp.org.
https://www.ppsspp.org

GTA : LCS - All character and vehicle models are invisible on M1 Mac #15149

Closed patrickmetz closed 1 year ago

patrickmetz commented 2 years ago

Game or games this happens in

ULUS-10041 GTA:Liberty City Stories

What area of the game / PPSSPP

About one minute into the game, while watching the intro, it becomes clear that all characters and vehicles are invisible most of the time and only partly "flash" into existence for very brief periods.

I also tested OpenEmu which uses 1.10.3. And that version is not affected and shows all characters and vehicles correctly. But in OpenEmu one cannot even access any of PPSSPP's settings, so using that "frontend" is rather pointless.

Also, you guys do not provide a readily compiled 1.10.3 for M1. Brew provides PPSSPP 1.10.3 too, but it crashes during installation of PPSSPP. See #72631 (https://github.com/Homebrew/homebrew-core/issues/72631). And RetroArch for M1 is missing the PPSSPP core entirely.

So the only way to play this game is OpenEmu, which is not configurable and uses 1x-PSP-resolution -.-

What should happen

Characters and vehicles should be clearly visible, when they are on screen.

Logs

No response

Platform

macOS

Mobile phone model or graphics card

M1

PPSSPP version affected

1.12.3

Last working version

1.10.3

Graphics backend (3D API)

OpenGL / GLES and Vulkan

Panderner commented 2 years ago

maybe related to #14514?

patrickmetz commented 2 years ago

Ok, meanwhile I managed to get at least version 1.11.3 running on M1 Mac, which also does not have the mentioned issues, by fixing homebrew's failing install of PPSSPP.

It seems Homebrew broke because I upgraded to Monterey, so I had to run xcode-select --install to upgrade my dev tools, then brew remove ppsspp to remove the remnants of older installs, and then brew install ppsspp, which worked. (Source: https://github.com/Homebrew/discussions/discussions/673#discussioncomment-329689)

Finally, I manually started /opt/homebrew/Cellar/ppsspp/1.11.3_1/PPSSPPSDL.app and pinned the running app to the Dock for quick access by right-clicking its Dock icon.

So I'm currently enjoying the game by means of an "unofficial" homebrew build.

LunaMoo commented 2 years ago

This could simply be broken Vertex Cache which I guess affects OGL on 1.12.3. You can try artifacts on the bottom of this: https://github.com/hrydgard/ppsspp/actions/runs/1462943557 which is latest and includes macOS build. Or simply disable vertex cache.

patrickmetz commented 2 years ago

@LunaMoo thanks for trying to help.

I tried 1.12.3 with and without vertex cache, on both Vulkan and OpenGL.

I also tried the artifact 1.12.3-232-gb6e7fe1aa, again on both Vulkan and OpenGL, but only with vertex cache enabled, because the option cannot be disabled there.

The result for both of these versions is that characters and vehicles are still invisible, except for tiny time spans.

unknownbrackets commented 2 years ago

If you can compile, it'd be very interesting if you could try a few more builds. See here: https://github.com/hrydgard/ppsspp/wiki/How-to-bisect-to-find-what-broke-a-game

This does seem similar to #14514, but that happens in v1.11.3. Both seem likely to be graphics driver bugs. If we can know what change introduced this issue, it might help lead to a fix for both.

I don't have the affected devices, but the more you can narrow the haystack down via bisecting, the better. There are about 1700 commits between v1.11.3 and v1.12.3, but with 4 tests you could narrow that down to 100 commits. With a total of 7-8 tests you could narrow it down to 10 or less. That would probably point to where the problem is coming from.

-[Unknown]
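For reference, the arithmetic behind those bisection estimates can be sketched as below. This is only an illustration, assuming every test splits the remaining range exactly in half and every build is testable; the function names are made up for this example and are not part of PPSSPP or git.

```cpp
#include <cstdint>

// Upper bound on how many candidate commits remain after `tests` bisection
// steps, assuming each test halves the range (rounding up).
std::uint32_t CommitsRemaining(std::uint32_t commits, unsigned tests) {
    for (unsigned i = 0; i < tests && commits > 1; ++i) {
        commits = (commits + 1) / 2;  // ceil(commits / 2)
    }
    return commits;
}

// Number of tests needed to narrow `commits` candidates down to a single one.
unsigned TestsToIsolate(std::uint32_t commits) {
    unsigned tests = 0;
    while (commits > 1) {
        commits = (commits + 1) / 2;
        ++tests;
    }
    return tests;
}
```

With the roughly 1700 commits mentioned above, CommitsRemaining(1700, 4) gives 107 and CommitsRemaining(1700, 8) gives 7, which lines up with the "about 100" and "10 or less" figures. In practice skipped or unbuildable commits can add a few extra steps.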

hrydgard commented 2 years ago

Confirmed, and it's not working on any version I'm able to build on M1. Very weird!

unknownbrackets commented 2 years ago

If you could post a framedump where it's definitely happening, I'd like to check what's rendering differently for the objects that aren't visible. Since I won't reproduce the problem (don't have M1), a screenshot showing that same framedump (just load it as if it were a game) would be necessary.

The simpler the scene, the better.

It seems like there must be something unique about those objects.

Software renderer ought to be able to validate that it's not CPU related at all. I assume things will show fine there, if slow.

-[Unknown]

hrydgard commented 2 years ago

Need to fix our framedumping to handle GTA's rather unique presentation method first, but will try to get around to it soon.

patrickmetz commented 2 years ago

I just took three framedumps of the bus arriving in the intro scene: one each on OpenGL, Vulkan and software rendering. The bus is invisible in all three of them.

framedumps.zip

EDIT: on PPSSPP 1.12.3, M1 Mac

hrydgard commented 2 years ago

This does happen with the software renderer, so whatever the issue, it goes deep...

BlackGor commented 2 years ago

I don't know if it helps, but I installed an older version (1.9.3) on my iPhone and two of the GTA games were fixed: no more invisible characters or graphical issues.

hrydgard commented 2 years ago

Yeah the problem seems to be unique to ARM builds of PPSSPP for Mac. Those older builds would have been x86-64 builds running under Rosetta, which for some reason is not affected. Thanks for reporting though!

DTibor1986 commented 2 years ago

I am also experiencing the invisible characters on an Intel-based Mac, on Big Sur 11.6.1. I tried 1.12.1 through 1.12.3, and all of them do it.

Maybe a noob question, but how can I rollback to 1.10.3?

unknownbrackets commented 2 years ago

It should be possible to rollback using Xcode. As far as Brew, it seems like you're meant to clone an older version of the Brew repo (which I've done before but is annoying and slow.) You'd basically be trying to run an x86 build through Rosetta, which works.

Since updating the unit test and headless tests, can anyone with M1 try running them and report if any fail? That might give us a much more controlled example of failure that would be easier to fix.

Also, as I remember, it seems more or less like a CPU/math issue. Some ideas to validate:

My goal is to get a frame dump of EXACTLY the same scene from the PC and M1 devices, so we have two frame dumps that ought to be identical, but aren't. Then we can compare and at least know where they're different - matrix values, vertices, something else? But it is important that they would be identical if it weren't for this bug, or it won't help.

Looking at this: https://developer.apple.com/documentation/xcode/writing-arm64-code-for-apple-platforms

Arguments are slightly different, but it doesn't seem to impact our sin/cos calls. Probably not related anyway, since this occurs even with all jits off.

-[Unknown]

unknownbrackets commented 2 years ago

Another idea / question: if we compile a Debug build (not RelWithDebInfo), or otherwise compile with -O0, does anything change?

-[Unknown]

hrydgard commented 2 years ago

Well the issue is that framedumps are a mess in this game, making it hard to judge. I'm gonna use the mentioned methods to look into it again soon, though, and possibly fix the framedump problem too (and yup, also check unoptimized)

unknownbrackets commented 2 years ago

Well, the framedumps from the "EDIT: on PPSSPP 1.12.3, M1 Mac" comment do work fine (they have two frames in them, but it's hard to tell exactly why - maybe it's actually rendering two frames per vblank right now...) I do think we should fix it to identify "frames" better, but it's hard because of the various presentation methods...

Ideally I just want a PC version of that same scene. I see a lot of cases in that dump of identical verts in prim calls, but I have seen games do that sometimes to null out verts.

-[Unknown]

timmawsw commented 2 years ago

Hey, since I am also having this issue (and the similar issue in GTA Vice City Stories), I would like to help find the cause. Can I do something to get you a step forward on this issue? Or has the cause already been identified?

FYI: I do have a Windows PC, M1 and can also test the OpenEmu version against it. Just tell me how long a scene should be and which versions/settings you need the emulator to be on.

timmawsw commented 2 years ago

Hey, so far I found out:

  1. m1 dump works on windows and displays the same graphic bug there
  2. windows dump works on m1 and displays the same correct graphics there

See the attached file to test it: recordings.zip. Next up is running the unit tests.

timmawsw commented 2 years ago

Hello, attached is the unit test result.

unittest-m1.log

If you need anything else, just ask me. For example, I can give you a reproducible dump (with one weapon, the upper body of the character always disappears when aiming at a target and always reappears when I stop aiming).

unknownbrackets commented 2 years ago

Windows, green shirt drawn at 1086/1983: Bone 1 - Bone 5 are NAN on M1, but shouldn't be.

They are explicitly set to NAN in the dump at 09FFA63C, using 2B 7FC000, which is a NAN. On Windows, these are not NAN, and that's the cause of the missing rendering. So it's pretty clear something is resulting in NAN that shouldn't, and likely it's some CPU calculation to generate this display list.
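To make the "2B 7FC000 is a NAN" reading concrete, here is a small sketch of how such a payload can be decoded. It assumes the 24-bit payload of the command holds the top 24 bits of an IEEE 754 single-precision float, which is consistent with 7fc00000 showing up as nan in the logs later in this thread. The helper names are hypothetical, not PPSSPP's actual functions.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical helper: expand a 24-bit command payload into a 32-bit float by
// treating it as the upper 24 bits of an IEEE 754 single (low 8 bits zero).
float Float24ToFloat(std::uint32_t payload24) {
    std::uint32_t bits = (payload24 & 0x00FFFFFFu) << 8;
    float value;
    std::memcpy(&value, &bits, sizeof(value));  // copy the bits, no punning
    return value;
}

// True if the payload decodes to a NaN (exponent all ones, mantissa nonzero).
bool PayloadIsNaN(std::uint32_t payload24) {
    return std::isnan(Float24ToFloat(payload24));
}
```

Under that assumption, PayloadIsNaN(0x7FC000) is true (the bits become 0x7FC00000, a quiet NaN), while 0x3F8000 decodes to 1.0 and 0x7F8000 decodes to infinity rather than NaN.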

Since the CPU core doesn't affect it, I think an option might be:

  1. Get to this scene, or very close, and create a save state on Windows.
  2. Make sure the save state generates the issue on M1, but doesn't have the issue on Windows (I assume this will be the case.)
  3. Switch both Windows AND M1 to CPU interpreter (not IR or JIT)
  4. Log the NANs produced in WriteVector per instruction name using this branch: https://github.com/hrydgard/ppsspp/compare/master...unknownbrackets:m1-nan-log
  5. Compare the outputs. This may tell us if there are certain instruction(s) producing NANs only on M1. It's likely there will be multiple (NAN in = NAN out), but at least it gives us a subset to investigate.

-[Unknown]
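The per-instruction logging in step 4 could look roughly like the sketch below. This is not the code from the linked branch; NanTally and its methods are invented for illustration, and it counts one hit per instruction whose output contains any NaN lane.

```cpp
#include <cmath>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical per-frame tally of instructions that wrote a NaN.
class NanTally {
public:
    // Record one hit for `op` if any of the `n` output lanes is NaN.
    void Check(const std::string &op, const float *out, std::size_t n) {
        for (std::size_t i = 0; i < n; ++i) {
            if (std::isnan(out[i])) {
                ++counts_[op];
                return;  // one hit per instruction, not per lane
            }
        }
    }

    // Number of hits recorded for `op` in the current frame.
    int Count(const std::string &op) const {
        auto it = counts_.find(op);
        return it == counts_.end() ? 0 : it->second;
    }

    // Reset the tally at a frame boundary.
    void NewFrame() { counts_.clear(); }

private:
    std::map<std::string, int> counts_;
};
```

The idea would be to call Check() wherever an interpreter instruction writes its result, print the counts at each frame boundary, and then diff the Windows and M1 output for the same save state.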

hrydgard commented 2 years ago

That's a really good idea! Missed that you made a branch first so made my own, https://github.com/hrydgard/ppsspp/compare/master...interpreter-nan-check, might combine them. Tested mine on PC to make sure no spurious nans are logged (had to add some exclusions), will test on Mac and track this down tonight.

hrydgard commented 2 years ago

Alright, had to just have a quick look before doing other stuff today. The first hit is Int_VScl, which doesn't feel like a likely root cause. It could also be one of the scalar VFPU instructions, which would not be caught by WriteVector... I'll mess around more later.

unknownbrackets commented 2 years ago

Outside transfer/lsu, everything should use WriteVector or WriteMatrix (to apply prefixes.) Maybe worth looking at matrix too, though. Perhaps more likely there (like that compiler bug for LBP back in the day...)

Vscl does seem pretty unlikely. Maybe FPU is more likely than I thought...

-[Unknown]

hrydgard commented 2 years ago

Mac NaN stats on that first scene with the mob boss:

new frame, vfpu nans produced:

That's a lot. I do think many are coming from an unused fourth channel, especially vh2f / lv.q.

PC:

new frame, vfpu nans produced:

Yeah, that confirms my theory about vh2f/lv.q, but doesn't quite help pinpointing the source... Six candidates (in addition to matrices and FPU), hm.

In non-surprising news, USE_VFPU_SQRT doesn't fix it.

unknownbrackets commented 2 years ago

I pushed a new version of the branch which does matrix values and tries to avoid logging for nan propagation (didn't really test it, though.)

If VScl is the culprit, maybe it's related to swizzle? In that case this gets tricky, but I'm thinking of what I added in the latest commit. That will log the Vscl # per frame that resulted in NAN. If we can catch it right after the save state, we could do something like if (my_isnan(d[i]) || vsclNum == 12345) { for a cleaner compare.

https://github.com/hrydgard/ppsspp/compare/master...unknownbrackets:m1-nan-log

Maybe we'll find it has something to do with denormals?

-[Unknown]

hrydgard commented 2 years ago

I had to add null checks in ReadVector, like this: if (my_isnan(rd[i]) && nanState) since some methods passed in nullptr.

Also now I'm properly using a PC savestate on mac, which I should have done from the beginning. Even so:

I feel vrsq is suspicious since it's commonly used to produce a multiplier later used by vscl... But then turning on the accurate version should have helped. And still all those from lv.q...

PC output:

new frame, vfpu nans produced:

Mac output:

   * lv.q = 5460
   * vh2f = 5
   * vrsq = 34
   * vscl = 68
 Vscl #23 produced NAN - 0.000000 * inf (00000000 * 7f800000)
 Vscl #23 produced NAN - 0.000000 * inf (00000000 * 7f800000)
 Vscl #24 produced NAN - 0.008423 * nan (3c0a0000 * 7fc00000)
 Vscl #24 produced NAN - 0.006859 * nan (3be0c000 * 7fc00000)
 Vscl #24 produced NAN - -0.094788 * nan (bdc22000 * 7fc00000)
 Vscl #24 produced NAN - -0.995605 * nan (bf7ee000 * 7fc00000)
 Vscl #25 produced NAN - 0.011337 * nan (3c39c000 * 7fc00000)
 Vscl #25 produced NAN - 0.006248 * nan (3bccc000 * 7fc00000)
 Vscl #25 produced NAN - -0.101562 * nan (bdd00000 * 7fc00000)
 Vscl #25 produced NAN - -0.994629 * nan (bf7ea000 * 7fc00000)
 Vscl #26 produced NAN - nan * 1.000000 (7fc00000 * 3f800000)
 Vscl #26 produced NAN - nan * 1.000000 (7fc00000 * 3f800000)
 Vscl #26 produced NAN - nan * 1.000000 (7fc00000 * 3f800000)
 Vscl #26 produced NAN - nan * 1.000000 (7fc00000 * 3f800000)
 Vscl #27 produced NAN - nan * nan (7fc00000 * 7fc00000)
 Vscl #27 produced NAN - nan * nan (7fc00000 * 7fc00000)
 Vscl #27 produced NAN - nan * nan (7fc00000 * 7fc00000)
 Vscl #27 produced NAN - nan * nan (7fc00000 * 7fc00000)
 Vscl #31 produced NAN - 0.000000 * inf (00000000 * 7f800000)
 Vscl #31 produced NAN - 0.000000 * inf (00000000 * 7f800000)
 Vscl #32 produced NAN - 0.008133 * nan (3c054000 * 7fc00000)
 Vscl #32 produced NAN - 0.006638 * nan (3bd98000 * 7fc00000)
....

unknownbrackets commented 2 years ago

Ah sorry (I did say it was untested...)

Vscl #23 produced NAN - 0.000000 * inf (00000000 * 7f800000) makes me think that some vrsq resulted in zero or infinity and maybe that's the whole problem. It could be something else is producing one of those values, and not actually NAN.
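The chain described here (a reciprocal square root of zero giving infinity, then zero times infinity giving NaN) follows directly from IEEE 754 and can be reproduced in isolation. The sketch below uses a plain 1/sqrt(x) as a stand-in for vrsq; it is not PPSSPP's implementation, and the function names are made up for this example.

```cpp
#include <cmath>

// Plain reciprocal square root, standing in for vrsq (assumption: the
// emulated instruction behaves like 1/sqrt(x) for these inputs).
float ReciprocalSqrt(float x) {
    return 1.0f / std::sqrt(x);
}

// Scale a single lane, as a vector-times-scalar operation would per lane.
float ScaleLane(float lane, float scale) {
    return lane * scale;
}
```

ReciprocalSqrt(0.0f) yields infinity (7f800000 in the log), and ScaleLane(0.0f, infinity) yields NaN, matching the "0.000000 * inf" lines. Once a NaN exists, every later multiply that touches it stays NaN, which would explain the "0.008423 * nan" and "nan * nan" lines that follow.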

You tried USE_VFPU_SQRT, though, hm. Could it be clz32_nonzero also? But how could both paths be broken the same way? So I guess that's indeed a pretty unlikely candidate. We could give it the same logging treatment, I guess, to see if it gets different inputs.

None of the Vscl outputs seem incorrect, at least, so it seems like a victim not a perpetrator.

Since we're using the PC save state, it shouldn't be possible that the values are enshrined in RAM, but the lv.q number implies it's coming from outside. I guess that makes FPU the next most likely suspect...

-[Unknown]

iota97 commented 2 years ago

Is it possible to debug this like in Ghost In The Shell here: https://github.com/hrydgard/ppsspp/issues/12519#issuecomment-609075417?

unknownbrackets commented 2 years ago

Using WINE won't help because it doesn't happen with Rosetta. Memory breakpoints can be used, but I guess this is a reason to enhance the web debugger and hope web restrictions on non-SSL don't increase.

But yes, there should be a GE command that's setting the matrix values to NaN and if we could trace that back, that'd find the instruction causing this.

Since pspautotests all pass, it'll be really good to capture whatever this is when we find it as a test...

-[Unknown]

iota97 commented 2 years ago

FWIW for Ghost in the Shell I used the "native debugger" with the breakpoints and checked the same addresses with the web debugger connected to the Android device for the first difference.

It might be possible to have the breakpoints on Windows while checking the values at the same addresses on the M1 with the web one.

I guess it's highly game based if this is viable tho', I was probably just lucky with Ghost in the Shell eheh :)

hrydgard commented 2 years ago

This one is still super mysterious but doesn't affect our official releases at this time, so taking off the milestone.

unknownbrackets commented 2 years ago

Has anyone been able to try this on an M2 device? Wondering if it's affected differently.

Also, happened to see this and wondered if maybe it could be related: https://github.com/martin-cs/symfpu/issues/5

-[Unknown]

hrydgard commented 2 years ago

Hm, possibly. The last comment says their tests fail on Intel though, while GTA has no problems on Intel Macs...

unknownbrackets commented 1 year ago

I pushed a new version of the debugging branch: https://github.com/hrydgard/ppsspp/compare/master...unknownbrackets:ppsspp:m1-nan-log?expand=1

This one logs FPU nan production as well as vrsq 0/inf production.

-[Unknown]

DTibor1986 commented 1 year ago

I don't know if it is related to this issue, but both LCS and VCS are crashing after the intro on 1.14.3 (with both Vulkan and OGL). I am using an M1 Mac.

unknownbrackets commented 1 year ago

It would help if you could try earlier git builds and find when it started crashing.

-[Unknown]

hrydgard commented 1 year ago

Hm, no problems getting in-game in a fresh build in either Vulkan or OpenGL, on my Macbook Air M1.

DTibor1986 commented 1 year ago

I have just tried with 1.14.4 and both games are still crashing. I will check earlier builds. (Other games are OK.) By the way, I am using an M1 Max with Monterey, but this shouldn't matter if it is OK on your M1 Air.

Anyway many thanks for your prompt reply!

DTibor1986 commented 1 year ago

1.14-9-g8ed87d48e, Merge pull request #16589 from sum2012/patch #4962 was the last build that I can get in game with invisible bus + characters in the beginning cutscene.

DTibor1986 commented 1 year ago

Képernyőfotó 2023-01-03 - 21 07 55 Képernyőfotó 2023-01-03 - 21 07 18

Some pictures of the glitch.

unknownbrackets commented 1 year ago

1.14-9-g8ed87d48e, Merge pull request #16589 from sum2012/patch #4962 was the last build that I can get in game with invisible bus + characters in the beginning cutscene.

Does that mean v1.14-13-g5406c4f97a doesn't work? That didn't really have any changes except a somewhat minor one to the UI. Are you downloading these builds from GitHub or building locally?

We know the glitch still happens, although we don't know why. More pictures won't help at this time. Some math equation is calculating "not a number" as the result instead of the expected number, and it's only happening on M1. But the equation is complex and involves a lot of equations within the game's code, so we're not quite sure what specific step the M1 macs are calculating an unexpected result for.

-[Unknown]

DTibor1986 commented 1 year ago

Correct, 1.14-13-g5406c4f97a is the first one that doesn't work. I am downloading the artifacts from GitHub.

As for the glitch, I know that you are aware of the M1-specific problem and that it isn't easy to track down. (I only included the pics to remind myself which was the last "working" build.)

Kethen commented 1 year ago

For some reason, on Intel hardware, with the same version of MoltenVK obtained from https://github.com/hrydgard/ppsspp/releases/tag/v1.14.4, the issue is observed on the native build but not on the Windows build run under cxwine 22.0.1 (https://github.com/Gcenx/WineskinServer/releases)

Screen Shot 2023-01-17 at 9 42 42 PM

Screen Shot 2023-01-17 at 9 43 30 PM

Screen Shot 2023-01-17 at 9 44 53 PM

hrydgard commented 1 year ago

Wow, that is a super interesting result! Rules out that it's ARM64-specific!

What if you use software rendering on that setup?

galad87 commented 1 year ago

Just a shot in the dark: could it be the same issue cemu had, regarding Metal alignment requirements?

https://github.com/cemu-project/Cemu/pull/534 https://github.com/cemu-project/Cemu/pull/445

Kethen commented 1 year ago

Software rendering enabled

Screen Shot 2023-01-18 at 11 32 04 AM Screen Shot 2023-01-18 at 11 33 02 AM Screen Shot 2023-01-18 at 11 34 33 AM Screen Shot 2023-01-18 at 11 36 16 AM Screen Shot 2023-01-18 at 11 52 43 AM

The pairs of floating legs in the final screenshots do resemble https://github.com/hrydgard/ppsspp/issues/14514

With software rendering disabled for a higher resolution rendering

image

unknownbrackets commented 1 year ago

Does #16816 help that, then? This applies the alignment from a quick look at the cemu pull request.

I'm not sure how that'd help the NANs we saw, but maybe this indicates there's some misaligned read involved. Worth trying that branch with hardware transform off, jit off, etc.

-[Unknown]

Kethen commented 1 year ago

Bisecting on Intel hardware, 0ccc63b43edba37effb7d3b0aac6e88ce9e6791a is where the issue started, while it was issue-free up to dbe6658803ff4937c0ea2f3d1ea26c33fed9d31a.

So it looks like the Apple-provided compilers somehow break the VFPU sin/cos implementation introduced in https://github.com/hrydgard/ppsspp/pull/14406, at least on Intel x86 hardware.

iota97 commented 1 year ago

Well, from cppreference: "It is undefined behavior to read from the member of the union that wasn't most recently written. Many compilers implement, as a non-standard language extension, the ability to read inactive members of a union."

I wonder if reinterpret_cast works fine...

Edit: reinterpret_cast is UB too. Yeah, just memcpy seems fine (or C++20 bit_cast).
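For anyone following along, the memcpy approach looks like the sketch below. It is a generic illustration of well-defined float/integer bit reinterpretation, not the actual change made to PPSSPP's sin/cos code; FloatToBits and BitsToFloat are names chosen for this example.

```cpp
#include <cstdint>
#include <cstring>

// Read a float's bit pattern without union or pointer-cast punning.
inline std::uint32_t FloatToBits(float f) {
    static_assert(sizeof(std::uint32_t) == sizeof(float), "size mismatch");
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof(bits));
    return bits;
}

// Build a float from a raw bit pattern, again via memcpy.
inline float BitsToFloat(std::uint32_t bits) {
    float f;
    std::memcpy(&f, &bits, sizeof(f));
    return f;
}
```

Compilers typically optimize these small fixed-size memcpy calls into a plain register move, so there is usually no cost compared to the union trick. With C++20, std::bit_cast<std::uint32_t>(f) from <bit> expresses the same thing directly.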