Closed patrickmetz closed 1 year ago
maybe related to #14514?
OK, meanwhile I managed to get at least version 1.11.3 running on my M1 Mac by fixing Homebrew's failing install of PPSSPP. That version also does not have the mentioned issues.
It seems Homebrew broke because I upgraded to Monterey, so I had to run `xcode-select --install` to upgrade my dev tools and `brew remove ppsspp` to remove remnants of older installs. After that, `brew install ppsspp` worked.
(Source: https://github.com/Homebrew/discussions/discussions/673#discussioncomment-329689)
Finally, I manually started `/opt/homebrew/Cellar/ppsspp/1.11.3_1/PPSSPPSDL.app` and pinned the running app to the Dock for quick access by right-clicking its Dock icon.
So I'm currently enjoying the game by means of an "unofficial" homebrew build.
This could simply be the broken vertex cache, which I guess affects OpenGL on 1.12.3. You can try the artifacts at the bottom of this page: https://github.com/hrydgard/ppsspp/actions/runs/1462943557 which is the latest run and includes a macOS build. Or simply disable the vertex cache.
@LunaMoo thanks for trying to help.
I tried 1.12.3 with and without vertex cache, on both Vulkan and OpenGL.
I also tried the artifact 1.12.3-232-gb6e7fe1aa, again on both Vulkan and OpenGL, but only with vertex cache enabled because the option cannot be disabled there.
The result for both versions is that characters and vehicles are still invisible, except for tiny time spans.
If you can compile, it'd be very interesting if you could try a few more builds. See here: https://github.com/hrydgard/ppsspp/wiki/How-to-bisect-to-find-what-broke-a-game
This does seem similar to #14514, but that happens in v1.11.3. Both seem likely to be graphics driver bugs. If we can know what change introduced this issue, it might help lead to a fix for both.
I don't have the affected devices, but the more you can narrow the haystack down via bisecting, the better. There are about 1700 commits between v1.11.3 and v1.12.3, but with 4 tests you could narrow that down to about 100 commits. With a total of 7-8 tests you could narrow it down to roughly 10 or fewer. That would probably point to where the problem is coming from.
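For a rough sense of the arithmetic behind that estimate, here is a small sketch. The helper name `commits_left` is made up for illustration, and it assumes every test cleanly halves the remaining range, which real bisects (skipped or unbuildable commits) won't always do.

```cpp
#include <cstdint>

// Upper bound on the candidate commits left after `tests` bisect steps,
// assuming each test halves the remaining range (rounding up).
std::uint32_t commits_left(std::uint32_t total, std::uint32_t tests) {
    std::uint32_t left = total;
    for (std::uint32_t i = 0; i < tests && left > 1; ++i) {
        left = (left + 1) / 2;  // ceil(left / 2)
    }
    return left;
}
```

Starting from 1700 commits, this gives 107 after 4 tests, 14 after 7 tests, and 7 after 8 tests.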
-[Unknown]
Confirmed, and it's not working on any version I'm able to build on M1. Very weird!
If you could post a framedump where it's definitely happening, I'd like to check what's rendering differently for the objects that aren't visible. Since I can't reproduce the problem (I don't have an M1), a screenshot showing that same framedump (just load it as if it were a game) would be necessary.
The simpler the scene, the better.
It seems like there must be something unique about those objects.
Software renderer ought to be able to validate that it's not CPU related at all. I assume things will show fine there, if slow.
-[Unknown]
Need to fix our framedumping to handle GTA's rather unique presentation method first, but will try to get around to it soon.
I just took three framedumps of the bus arriving in the intro scene. On OpenGL, Vulkan and software-rendering. The Bus is invisible in all three of them.
EDIT: on PPSSPP 1.12.3, M1 Mac
This does happen with the software renderer, so whatever the issue, it goes deep...
I don't know if it helps, but I installed an older version (1.9.3) on my iPhone and two of the GTA games were fixed: no more invisible characters or graphical issues.
Yeah the problem seems to be unique to ARM builds of PPSSPP for Mac. Those older builds would have been x86-64 builds running under Rosetta, which for some reason is not affected. Thanks for reporting though!
I am also experiencing the invisible characters on an Intel-based Mac, on Big Sur 11.6.1. I tried 1.12.1 through 1.12.3, and all of them do it.
Maybe a noob question, but how can I rollback to 1.10.3?
It should be possible to rollback using Xcode. As far as Brew, it seems like you're meant to clone an older version of the Brew repo (which I've done before but is annoying and slow.) You'd basically be trying to run an x86 build through Rosetta, which works.
Since updating the unit test and headless tests, can anyone with M1 try running them and report if any fail? That might give us a much more controlled example of failure that would be easier to fix.
Also, as I remember, it seems more or less like a CPU/math issue. Some ideas to validate:
My goal is to get a frame dump of EXACTLY the same scene from the PC and M1 devices, so we have two frame dumps that ought to be identical, but aren't. Then we can compare and at least know where they're different - matrix values, vertices, something else? But it is important that they would be identical if it weren't for this bug, or it won't help.
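As a sketch of what that comparison could look like once both dumps are reduced to plain arrays of 32-bit command words: this does not parse PPSSPP's actual frame dump format, and `first_difference` is a hypothetical helper, not something in the codebase.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns the index of the first 32-bit word that differs between two
// command streams, or -1 if they are identical. Comparing raw bits
// (not floats) matters here: NaN != NaN as a float, and a NaN on one
// side against a number on the other should show up as a plain mismatch.
long first_difference(const std::vector<std::uint32_t>& pc,
                      const std::vector<std::uint32_t>& m1) {
    const std::size_t common = pc.size() < m1.size() ? pc.size() : m1.size();
    for (std::size_t i = 0; i < common; ++i) {
        if (pc[i] != m1[i]) return static_cast<long>(i);
    }
    // One stream is a prefix of the other: the difference starts where it ends.
    if (pc.size() != m1.size()) return static_cast<long>(common);
    return -1;
}
```

The first mismatching index would then tell us whether the divergence is in matrix values, vertices, or something else.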
Looking at this: https://developer.apple.com/documentation/xcode/writing-arm64-code-for-apple-platforms
Arguments are slightly different, but it doesn't seem to impact our sin/cos calls. Probably not related anyway, since this occurs even with all jits off.
-[Unknown]
Another idea / question: if we compile a Debug build (not RelWithDebInfo), or otherwise compile with -O0, does anything change?
-[Unknown]
Well the issue is that framedumps are a mess in this game, making it hard to judge. I'm gonna use the mentioned methods to look into it again soon, though, and possibly fix the framedump problem too (and yup, also check unoptimized)
Well, the framedumps from the "EDIT: on PPSSPP 1.12.3, M1 Mac" comment do work fine (they have two frames in them, but it's hard to tell exactly why - maybe it's actually rendering two frames per vblank right now...) I do think we should fix it to identify "frames" better, but it's hard because of the various presentation methods...
Ideally I just want a PC version of that same scene. I see a lot of cases in that dump of identical verts in prim calls, but I have seen games do that sometimes to null out verts.
-[Unknown]
Hey, since I am also having this issue (and the similar issue in GTA Vice City Stories), I would like to help find the cause. Can I do something to get you a step forward on this issue? Or has the cause already been identified?
FYI: I do have a Windows PC, M1 and can also test the OpenEmu version against it. Just tell me how long a scene should be and which versions/settings you need the emulator to be on.
Hey, so far I found out:
See file attached to test it. recordings.zip Next up is running the unit tests
Hello, attached the unittest result.
If you need anything else, just ask me. For example, I can give you a reproducible dump (with one weapon, the upper body of the character always disappears when aiming at a target and always reappears when I stop aiming).
Windows, green shirt drawn at 1086/1983: Bone 1 - Bone 5 are NAN on M1, but shouldn't be.
They are explicitly set to NAN in the dump at 09FFA63C, using `2B 7FC000`, which is a NAN. On Windows, these are not NAN, and that's the cause of the missing rendering. So it's pretty clear something is resulting in NAN that shouldn't, and likely it's some CPU calculation to generate this display list.
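To show why `7FC000` reads as NAN, here is a minimal sketch. It assumes the 24-bit payload of the matrix data command holds the top 24 bits of an IEEE 754 float32; the function names are made up for illustration.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

// Assumption: matrix data commands carry the top 24 bits of a float32 in
// their 24-bit payload, so the value is rebuilt by shifting left by 8 bits.
float float_from_ge24(std::uint32_t payload24) {
    const std::uint32_t bits = (payload24 & 0x00FFFFFFu) << 8;
    float value;
    std::memcpy(&value, &bits, sizeof(value));  // bit copy, no conversion
    return value;
}

bool ge24_is_nan(std::uint32_t payload24) {
    return std::isnan(float_from_ge24(payload24));
}
```

Under that assumption, `7FC000` expands to `7FC00000` (all exponent bits set, top mantissa bit set), which is a quiet NaN, while something like `3F8000` expands to `3F800000`, which is 1.0.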
Since the CPU core doesn't affect it, I think an option might be:
`WriteVector` per instruction name using this branch: https://github.com/hrydgard/ppsspp/compare/master...unknownbrackets:m1-nan-log
-[Unknown]
That's a really good idea! Missed that you made a branch first so made my own, https://github.com/hrydgard/ppsspp/compare/master...interpreter-nan-check, might combine them. Tested mine on PC to make sure no spurious nans are logged (had to add some exclusions), will test on Mac and track this down tonight.
Alright, I just had to have a quick look before doing other stuff today. The first hit is Int_VScl, which doesn't feel like a likely root cause. It could also be one of the scalar VFPU instructions, since they would not be caught by WriteVector... I'll mess around more later.
Outside transfer/lsu, everything should use WriteVector or WriteMatrix (to apply prefixes.) Maybe worth looking at matrix too, though. Perhaps more likely there (like that compiler bug for LBP back in the day...)
Vscl does seem pretty unlikely. Maybe FPU is more likely than I thought...
-[Unknown]
Mac NaN stats on that first scene with the mob boss:
new frame, vfpu nans produced:
That's a lot. I do think many are coming from an unused fourth channel, especially vh2f / lv.q.
PC:
new frame, vfpu nans produced:
Yeah, that confirms my theory about vh2f/lv.q, but doesn't quite help pinpointing the source... Six candidates (in addition to matrices and FPU), hm.
In non-surprising news, USE_VFPU_SQRT doesn't fix it.
I pushed a new version of the branch which does matrix values and tries to avoid logging for nan propagation (didn't really test it, though.)
If VScl is the culprit, maybe it's related to swizzle? In that case this gets tricky, but I'm thinking of what I added in the latest commit: it logs the Vscl number per frame that resulted in NAN. If we can catch it right after the save state, we could do something like `if (my_isnan(d[i]) || vsclNum == 12345) {` for a cleaner compare.
https://github.com/hrydgard/ppsspp/compare/master...unknownbrackets:m1-nan-log
Maybe we'll find it has something to do with denormals?
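For anyone following along, the "avoid logging NaN propagation" idea can be sketched roughly like this. It is a standalone illustration, not the code in the branch, and `produced_nan` is a hypothetical name.

```cpp
#include <cmath>
#include <cstddef>

// True only when the output is NaN and none of the inputs were, i.e. this
// operation produced the NaN itself rather than passing one along.
bool produced_nan(const float* inputs, std::size_t count, float output) {
    if (!std::isnan(output)) return false;
    for (std::size_t i = 0; i < count; ++i) {
        if (std::isnan(inputs[i])) return false;  // NaN came from upstream
    }
    return true;
}
```

With a filter like this, `0 * inf` would be logged as a new NaN, but `nan * 1.0` would be skipped because it only propagates an existing one.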
-[Unknown]
I had to add null checks in ReadVector, like this: `if (my_isnan(rd[i]) && nanState)`, since some methods passed in nullptr.
Also now I'm properly using a PC savestate on mac, which I should have done from the beginning. Even so:
I feel vrsq is suspicious since it's commonly used to produce a multiplier later used by vscl... But then turning on the accurate version should have helped. And still all those from lv.q...
PC output:
new frame, vfpu nans produced:
Mac output:
* lv.q = 5460
* vh2f = 5
* vrsq = 34
* vscl = 68
Vscl #23 produced NAN - 0.000000 * inf (00000000 * 7f800000)
Vscl #23 produced NAN - 0.000000 * inf (00000000 * 7f800000)
Vscl #24 produced NAN - 0.008423 * nan (3c0a0000 * 7fc00000)
Vscl #24 produced NAN - 0.006859 * nan (3be0c000 * 7fc00000)
Vscl #24 produced NAN - -0.094788 * nan (bdc22000 * 7fc00000)
Vscl #24 produced NAN - -0.995605 * nan (bf7ee000 * 7fc00000)
Vscl #25 produced NAN - 0.011337 * nan (3c39c000 * 7fc00000)
Vscl #25 produced NAN - 0.006248 * nan (3bccc000 * 7fc00000)
Vscl #25 produced NAN - -0.101562 * nan (bdd00000 * 7fc00000)
Vscl #25 produced NAN - -0.994629 * nan (bf7ea000 * 7fc00000)
Vscl #26 produced NAN - nan * 1.000000 (7fc00000 * 3f800000)
Vscl #26 produced NAN - nan * 1.000000 (7fc00000 * 3f800000)
Vscl #26 produced NAN - nan * 1.000000 (7fc00000 * 3f800000)
Vscl #26 produced NAN - nan * 1.000000 (7fc00000 * 3f800000)
Vscl #27 produced NAN - nan * nan (7fc00000 * 7fc00000)
Vscl #27 produced NAN - nan * nan (7fc00000 * 7fc00000)
Vscl #27 produced NAN - nan * nan (7fc00000 * 7fc00000)
Vscl #27 produced NAN - nan * nan (7fc00000 * 7fc00000)
Vscl #31 produced NAN - 0.000000 * inf (00000000 * 7f800000)
Vscl #31 produced NAN - 0.000000 * inf (00000000 * 7f800000)
Vscl #32 produced NAN - 0.008133 * nan (3c054000 * 7fc00000)
Vscl #32 produced NAN - 0.006638 * nan (3bd98000 * 7fc00000)
....
Ah sorry (I did say it was untested...)
`Vscl #23 produced NAN - 0.000000 * inf (00000000 * 7f800000)` makes me think that some `vrsq` resulted in zero or infinity and maybe that's the whole problem. It could be that something else is producing one of those values, and not actually NAN.
You tried `USE_VFPU_SQRT`, though, hm. Could it be `clz32_nonzero` also? But how could both paths be broken the same way? So I guess that's indeed a pretty unlikely candidate. We could give it the same logging treatment, I guess, to see if it gets different inputs.
None of the Vscl outputs seem incorrect, at least, so it seems like a victim not a perpetrator.
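To illustrate the "victim" theory: with plain IEEE 754 float math, normalizing a zero-length vector through a reciprocal square root followed by a scale gives exactly the `0.000000 * inf` pattern in the log. The functions below are simple stand-ins with made-up names, not the VFPU implementations.

```cpp
#include <cmath>

// Plain IEEE 754 stand-ins for the two steps; these are NOT the VFPU
// implementations, just the same arithmetic shape.
float rsq(float x) { return 1.0f / std::sqrt(x); }
float scl(float v, float s) { return v * s; }

// Normalizes the x component of (x, y, z): reciprocal square root of the
// squared length, then a scale by that factor.
float normalized_x(float x, float y, float z) {
    const float inv_len = rsq(x * x + y * y + z * z);
    return scl(x, inv_len);
}
```

Here `rsq(0.0f)` is +inf, so `normalized_x(0, 0, 0)` computes `0 * inf` and returns NaN even though the scale step itself behaves correctly. That would fit an upstream value being zero on M1 when it isn't on PC.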
Since we're using the PC save state, it shouldn't be possible that the values are enshrined in RAM, but the `lv.q` number implies it's coming from outside. I guess that makes FPU the next most likely suspect...
-[Unknown]
Is it possible to debug this like in Ghost In The Shell here: https://github.com/hrydgard/ppsspp/issues/12519#issuecomment-609075417?
Using WINE won't help because it doesn't happen with Rosetta. Memory breakpoints can be used, but I guess this is a reason to enhance the web debugger and hope web restrictions on non-SSL don't increase.
But yes, there should be a GE command that's setting the matrix values to NaN and if we could trace that back, that'd find the instruction causing this.
Since pspautotests all pass, it'll be really good to capture whatever this is when we find it as a test...
-[Unknown]
FWIW for Ghost in the Shell I used the "native debugger" with the breakpoints and checked the same addresses with the web debugger connected to the Android device for the first difference.
It might be possible to have the breakpoints on Windows while checking the values at the same addresses on the M1 with the web one.
I guess it's highly game based if this is viable tho', I was probably just lucky with Ghost in the Shell eheh :)
This one is still super mysterious but doesn't affect our official releases at this time, so taking off the milestone.
Has anyone been able to try this on an M2 device? Wondering if it's affected differently.
Also, happened to see this and wondered if maybe it could be related: https://github.com/martin-cs/symfpu/issues/5
-[Unknown]
Hm, possibly. The last comment says their tests fail on Intel though, while GTA has no problems on Intel Macs...
I pushed a new version of the debugging branch: https://github.com/hrydgard/ppsspp/compare/master...unknownbrackets:ppsspp:m1-nan-log?expand=1
This one logs FPU nan production as well as vrsq 0/inf production.
-[Unknown]
I don't know if it is related to this issue, but both LCS and VCS are crashing after the intro on 1.14.3 (with both Vulkan and OpenGL). I am using an M1 Mac.
It would help if you could try earlier git builds and find when it started crashing.
-[Unknown]
Hm, no problems getting in-game in a fresh build in either Vulkan or OpenGL, on my Macbook Air M1.
I have just tried with 1.14.4 and both games are still crashing. I will check earlier builds. (Other games are OK.) By the way, I am using an M1 Max with Monterey, but this shouldn't matter if it is OK on your M1 Air.
Anyway many thanks for your prompt reply!
1.14-9-g8ed87d48e, Merge pull request #16589 from sum2012/patch #4962 was the last build that I can get in game with invisible bus + characters in the beginning cutscene.
some pictures about the glitch
Does that mean v1.14-13-g5406c4f97a doesn't work? That didn't really have any changes except a somewhat minor one to the UI. Are you downloading these builds from GitHub or building locally?
We know the glitch still happens, although we don't know why. More pictures won't help at this time. Some math equation is calculating "not a number" as the result instead of the expected number, and it's only happening on M1. But the equation is complex and involves a lot of equations within the game's code, so we're not quite sure what specific step the M1 macs are calculating an unexpected result for.
-[Unknown]
Correct, 1.14-13-g5406c4f97a is the first one that doesn't work. I am downloading the artifacts from GitHub.
As for the glitch, I know that you are aware of the M1-specific problem and that it isn't easy to track down. (I only included the pictures to remind myself which was the last "working" build.)
For some reason, on Intel hardware, with the same version of MoltenVK obtained from https://github.com/hrydgard/ppsspp/releases/tag/v1.14.4, the issue is observed on the native build but not on the Windows build run under cxwine 22.0.1 (https://github.com/Gcenx/WineskinServer/releases)
Wow, that is a super interesting result! Rules out that it's ARM64-specific!
What if you use software rendering on that setup?
Just a shot in the dark: could it be the same issue cemu had, regarding Metal alignment requirements?
https://github.com/cemu-project/Cemu/pull/534 https://github.com/cemu-project/Cemu/pull/445
Software rendering enabled
The pairs of floating legs in the final screenshots do resemble https://github.com/hrydgard/ppsspp/issues/14514
With software rendering disabled for a higher resolution rendering
Does #16816 help that, then? This applies the alignment from a quick look at the cemu pull request.
I'm not sure how that'd help the NANs we saw, but maybe this indicates there's some misaligned read involved. Worth trying that branch with hardware transform off, jit off, etc.
-[Unknown]
Bisecting on Intel hardware, 0ccc63b43edba37effb7d3b0aac6e88ce9e6791a is where the issue started, while it was issue-free up to dbe6658803ff4937c0ea2f3d1ea26c33fed9d31a
So it looks like the Apple-provided compilers somehow break the VFPU sin/cos implementation introduced in https://github.com/hrydgard/ppsspp/pull/14406, at least on Intel x86 hardware
Well, from cppreference: "It is undefined behavior to read from the member of the union that wasn't most recently written. Many compilers implement, as a non-standard language extension, the ability to read inactive members of a union."
I wonder if reinterpret_cast works fine...
Edit: reinterpret_cast is UB too, yeah just memcpy seems fine (or C++20 bit_cast).
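A minimal sketch of the memcpy approach, for reference. This is not the actual PPSSPP fix, and the helper names are made up for illustration.

```cpp
#include <cstdint>
#include <cstring>

// Well-defined type punning: copy the object representation instead of
// reading an inactive union member or dereferencing a reinterpret_cast.
std::uint32_t float_to_bits(float f) {
    static_assert(sizeof(std::uint32_t) == sizeof(float), "float must be 32 bits");
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof(bits));
    return bits;
}

float bits_to_float(std::uint32_t bits) {
    float f;
    std::memcpy(&f, &bits, sizeof(f));
    return f;
}
```

Compilers generally turn these small fixed-size copies into a plain register move, so there should be no cost compared to the union version. With C++20, `std::bit_cast<std::uint32_t>(f)` expresses the same thing directly.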
Game or games this happens in
ULUS-10041 GTA:Liberty City Stories
What area of the game / PPSSPP
One minute into the game when one watches the intro, it becomes clear that all characters and vehicles are invisible most of the time, and only partly "flash" into existence for very brief periods.
I also tested OpenEmu which uses 1.10.3. And that version is not affected and shows all characters and vehicles correctly. But in OpenEmu one cannot even access any of PPSSPP's settings, so using that "frontend" is rather pointless.
Also, you guys do not provide 1.10.3 readily compiled for M1. Brew provides PPSSPP 1.10.3, too, but crashes during installation of PPSSPP. See [#72631](https://github.com/Homebrew/homebrew-core/issues/72631). And RetroArch for M1 is missing the PPSSPP core entirely.
So the only way to play this game is OpenEmu, which is not configurable and uses 1x-PSP-resolution -.-
What should happen
Characters and vehicles should be clearly visible, when they are on screen.
Logs
No response
Platform
macOS
Mobile phone model or graphics card
M1
PPSSPP version affected
1.12.3
Last working version
1.10.3
Graphics backend (3D API)
OpenGL / GLES and Vulkan