AsahiLinux / linux

Linux kernel source tree

[TRACKER] GPU rendering issues / app crashes #72

Open asahilina opened 1 year ago

asahilina commented 1 year ago

This is a tracker bug for general GPU issues, like:

When making a comment on this bug, please run the asahi-diagnose command and attach the file it saves to your comment. Please tell us what you were doing when the problem happened, what desktop environment and window system you use, and any other details about the issue.

The purpose of this bug is to collect reports of app issues in one place, so we have somewhere to look when figuring out what to work on. Since the driver is still a work-in-progress and lots of things are not expected to work, please don't expect a timely response to reports. We're working on it!

Before reporting something, please check that the issue has not been reported already. Duplicate reports just clutter up the bug and will be marked as duplicate. --marcan

If you see magenta

Magenta is the error color on Apple GPUs. It is what you get when you sample an uninitialized compressed texture. This often happens with driver bugs that break rendering, but there are also many apps that have bugs that transiently display uninitialized buffer contents. On other GPUs or with software rendering, these will often show up as black or transparent, which stands out less but indicates the same bug.

If you see magenta glitches, please try running the app with ASAHI_MESA_DEBUG=nocompress. If you see the same problems but they are now black, try LIBGL_ALWAYS_SOFTWARE=true to force software rendering. If you get the same results (still black regions where previously there was magenta), then it is likely an app bug or an upstream Mesa bug, not a driver issue.
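
For anyone who wants to script these steps, here is a minimal Python sketch. It only sets the two environment variables named above and launches the app; the helper names (`triage_env`, `run_step`, `interpret`) are hypothetical and not part of Mesa or asahi-diagnose. The "otherwise" branch of `interpret` is an assumption, since the text above only spells out the case where both runs show black regions.

```python
import os
import subprocess

# Environment overrides for each triage step, taken from the text above.
TRIAGE_STEPS = {
    "default": {},
    "nocompress": {"ASAHI_MESA_DEBUG": "nocompress"},
    "software": {"LIBGL_ALWAYS_SOFTWARE": "true"},
}


def triage_env(step, base_env=None):
    """Return a copy of the environment with the overrides for one step."""
    env = dict(os.environ if base_env is None else base_env)
    env.update(TRIAGE_STEPS[step])
    return env


def run_step(step, command):
    """Launch the app (a list of arguments) under one triage configuration."""
    return subprocess.run(command, env=triage_env(step)).returncode


def interpret(black_with_nocompress, black_with_software):
    """Map what you observed onto the conclusion described above."""
    if black_with_nocompress and black_with_software:
        return "likely an app bug or an upstream Mesa bug"
    # Assumption: anything else is worth reporting for a closer look.
    return "possibly a driver issue; please report it"
```

For example, `run_step("nocompress", ["some-app"])` starts a placeholder `some-app` with compression disabled; you still have to judge by eye whether the magenta regions turned black.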

Another common issue is apps that have rendering feedback loops, which are undefined behavior in OpenGL. These often result in 4x2 pixel shaped corruption regions. You can work around this with ASAHI_MESA_DEBUG=nocompress, which should fix the issue (at least if it wouldn't normally break on all GPUs). This could also be caused by a driver bug, though, so please do report anything that is fixed with nocompress so we can take a look and determine whether it's an app bug or a driver bug!

Known issues

Resolved issues

Issues that aren't driver bugs

asahilina commented 1 year ago

That doesn't look like a bug... keep in mind that the Blur plugin was disabled by default in this update, because it is known broken (see the known issues list). If you expected a blurred background and it's not blurred any more, that's why. You can turn it on again, just be aware that the glitches are a known upstream bug.

DavidBuchanan314 commented 1 year ago

I see the same effect, and it looks buggy to me - it's hard to say exactly why, but the alpha blending seems weird. Here's a test pattern image:

image

And here it is after putting it in my wallpaper, on a white background, and overlaying a panel:

image

I would expect it to look more like this (I overlaid a transparent grey rectangle in GIMP):

image

It seems like the bright areas stay too bright and saturated, or something along those lines (as a total guess, maybe gamma correction happened wrong?)

marcan commented 1 year ago

I see the same thing on an older KDE on Radeon:

image

So it's not a GPU bug, KDE is either deliberately or accidentally doing some funny blending.

mxw39 commented 1 year ago

Hi I'd like to report a Firefox crash issue when opening a specific web page with embedded OpenStreetMap.

Problematic URI: https://hellyhansen.com/store-finder

Firefox version: Mozilla/5.0 (X11; Linux aarch64; rv:109.0) Gecko/20100101 Firefox/111.0

kernel log:

[95062.224374] asahi 406400000.gpu:  (\________/) 
[95062.224390] asahi 406400000.gpu:   |        |  
[95062.224394] asahi 406400000.gpu: '.| \  , / |.'
[95062.224398] asahi 406400000.gpu: --| / (( \ |--
[95062.224402] asahi 406400000.gpu: .'|  _-_-  |'.
[95062.224405] asahi 406400000.gpu:   |________|  
[95062.224409] asahi 406400000.gpu: ** GPU timeout nya~!!!!! **
[95062.224412] asahi 406400000.gpu:   Event slot: 6
[95062.224418] asahi 406400000.gpu:   Timeout count: 1
[95062.224424] asahi 406400000.gpu:   Fault info: FaultInfo {
                   address: 0x5ff7570980,
                   sideband: 0x43,
                   vm_slot: 0x6,
                   unit_code: 0x1,
                   unit: UL1C(
                       0x0,
                   ),
                   level: 0x0,
                   unk_5: 0x0,
                   read: true,
                   reason: Unmapped,
               }
[95062.224454] asahi 406400000.gpu:   Pending events:
[95062.224458] asahi 406400000.gpu:     [6] flags=7 value=0xcb00
[95062.224468] asahi 406400000.gpu:   Halt count: 2
[95062.224472] asahi 406400000.gpu:   Halted: 1
[95062.224476] asahi 406400000.gpu:   Attempting recovery...
[95062.224787] asahi 406400000.gpu: FWLog: ERROR: PIO poll from agfPollFenderReg timeout after 250us [type:0 reg:0x10080 expected:0x0 got:0x3 max:250us], continue wait

The same web page can be opened in chromium without crashing. Let me know if you need more information about the bug.
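
When comparing timeout reports like the one above, it can help to pull the FaultInfo fields out of the kernel log text. The following is a rough Python sketch that assumes the layout shown in the log above; `parse_fault_info` is a hypothetical helper, not part of the driver or asahi-diagnose, and it does not try to interpret what the fields mean.

```python
import re

# Matches "name: value," pairs where value is hex, a bool, or a bare identifier.
FIELD_RE = re.compile(r"(\w+):\s*(0x[0-9a-fA-F]+|true|false|[A-Za-z_]\w*)\s*,")
# Matches the nested "unit: NAME( 0xN, )" entry.
UNIT_RE = re.compile(r"unit:\s*(\w+)\(\s*(0x[0-9a-fA-F]+)")


def _convert(raw):
    """Turn a raw field value into an int, a bool, or leave it as a string."""
    if raw.startswith("0x"):
        return int(raw, 16)
    if raw in ("true", "false"):
        return raw == "true"
    return raw


def parse_fault_info(log_text):
    """Return the fields of the first FaultInfo block as a dict, or None."""
    start = log_text.find("FaultInfo {")
    if start < 0:
        return None
    end = log_text.find("}", start)
    body = log_text[start:] if end < 0 else log_text[start:end]
    fields = {name: _convert(raw) for name, raw in FIELD_RE.findall(body)}
    unit = UNIT_RE.search(body)
    if unit:
        # Stored as (unit name, unit index), e.g. ("UL1C", 0).
        fields["unit"] = (unit.group(1), int(unit.group(2), 16))
    return fields
```

Feeding it the excerpt above, `hex(parse_fault_info(log)["address"])` gives back the faulting address, which makes it easier to compare faults across reports.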

lebakassemmerl commented 1 year ago

Hello!

I have a problem with Firefox which I already reported in a different thread (https://github.com/AsahiLinux/linux/issues/70#issuecomment-1377756957). I thought it might have gone away with the latest Asahi update last week, but it still persists. Since I reported this problem back in January and may have posted it to the wrong issue, I want to bring it up again; maybe you can find the time to have a look at it.

Thanks a lot for your effort!

filahf commented 1 year ago

Hello,

I don't know if it's a known issue, but it seems like everything built with three.js is broken in one way or another. Firefox manages to render a subset of the scenes, but with the following console error.

THREE.WebGLRenderer: A WebGL context could not be created. Reason:  WebGL creation failed: 
* tryNativeGL (FEATURE_FAILURE_EGL_CREATE)
* Exhausted GL driver options. (FEATURE_FAILURE_WEBGL_EXHAUSTED_DRIVERS)

Chromium won't render anything at all with following console output:

three.module.js:28101 THREE.WebGLRenderer: A WebGL context could not be created. Reason:  Could not create a WebGL context, VENDOR = 0xffff, DEVICE = 0xffff, Sandboxed = no, Optimus = no, AMD switchable = no, Reset notification strategy = 0x0000, ErrorMessage = OffscreenContext Creation failed, GpuChannelHost creation failed.
onContextCreationError @ three.module.js:28101
getContext @ three.module.js:27666
WebGLRenderer @ three.module.js:27706
init @ webgl_animation_skinning_additive_blending.html:159
(anonymous) @ webgl_animation_skinning_additive_blending.html:66
three.module.js:28101 THREE.WebGLRenderer: A WebGL context could not be created. Reason:  Failed to create a WebGL2 context.
onContextCreationError @ three.module.js:28101
getContext @ three.module.js:27666
WebGLRenderer @ three.module.js:27706
init @ webgl_animation_skinning_additive_blending.html:159
(anonymous) @ webgl_animation_skinning_additive_blending.html:66
2three.module.js:28101 THREE.WebGLRenderer: A WebGL context could not be created. Reason:  Could not create a WebGL context, VENDOR = 0xffff, DEVICE = 0xffff, Sandboxed = no, Optimus = no, AMD switchable = no, Reset notification strategy = 0x0000, ErrorMessage = OffscreenContext Creation failed, GpuChannelHost creation failed.

Let me know if I need to provide any other log entries.

jannau commented 1 year ago

DRM_IOCTL_ASAHI_GEM_BIND failures under kwin after extended desktop use (2 - 3 days)

kwin_plasma hits DRM_IOCTL_ASAHI_GEM_BIND failures caused by an invalid address of 0xfff34000. This appears to be exhaustion of the usc_heap: the addresses passed to GEM_BIND calls in the VM_SHADER_START..VM_SHADER_END range are constantly decreasing, with 0x11_0001_0000 in the most recent call.

See the File debug_flags log excerpt in asahi_vm_shader_exhaustion_kwin.log

Possibly related to a dmabuf leak I'll describe in a separate comment.

Device: M1 Max Macbook Pro 14"
Kernel: asahi-6.2-12 with debug patches for File.rs
mesa: asahi/main from 2023-04-05 with mesa/asahi!27 merged

The issue occurred with unmodified asahi-6.2-12 and asahi-20230321 as well.

Fixed by asahi/mesa!45

asahilina commented 1 year ago

I went through all the issues above and I can't really reproduce anything any more on the latest asahi/mesa main, so hopefully all that should be fixed with the next update if it isn't already!

Edit: Except figma, that's still glitchy.

TophEvich commented 1 year ago

I have a problem with firefox tooltips from wayland session:

If you hover the mouse somewhere in the Firefox window that pops up a tooltip, the tooltip briefly goes away (hides) and comes back when you move the mouse out of the Firefox window to the Plasma shell menubar (for example, to switch virtual desktops). The problem does not happen if Firefox is run with LIBGL_ALWAYS_SOFTWARE=1, and it does not happen in an X11 session.

I'm on asahi desktop with firefox 108.1

Screenshot_20221219_111045

asahi-diagnose-20221219-113602.txt

@zanfix @asahilina I think this is just a very old FF bug: https://bugzilla.mozilla.org/show_bug.cgi?id=687344 (I have this quite often when switching between apps, it just stays around until I open the FF window in question again)

asahilina commented 1 year ago

It sounds like a Firefox on XWayland bug since the XINPUT2 thing was mentioned. Firefox defaults to native Wayland these days on Asahi, so it probably isn't relevant any more.

zanfix commented 1 year ago

I can not reproduce this anymore

WhatAmISupposedToPutHere commented 1 year ago

KDE panels seem to take a second or two between clicking the button and the animation starting to play.

asahi-diagnose-20230512-121546.txt

WhatAmISupposedToPutHere commented 1 year ago

Opening htop in konsole and then moving the selection around results in visual corruption (thin cyan lines).

IMG_0381

sulix commented 1 year ago

FYI, the omnispeak project (my re-implementation of the Commander Keen in Goodbye Galaxy games) is currently broken on mesa-asahi-edge, though works with LIBGL_ALWAYS_SOFTWARE=1 and the macOS OpenGL driver. It seems like 1D texture sampling is broken (the SDL_Renderer backend, which does palette conversion in software rather than with 1D textures works fine).

Applying asahi: Lower 1D to 2D on top of the current mesa-asahi-edge branch seems to fix it here, though, so I imagine it'll start working with a future update anyway.


Edit: As expected, this works fine as of today's OpenGL 3.1 update to mesa-asahi-edge.

iaguis commented 1 year ago

I see an issue in Google Sheets with Firefox. Whenever I scroll the text gets blurry and corrupted. For example, after scrolling a bit in this public spreadsheet I see this:

image

asahi-diagnose-20230516-112006.txt

I'm using sway.

i509VCB commented 1 year ago

When using vscodium (without tiling the window to the side of the screen but putting the window on a side) and then moving the cursor up and down along the side of the window, vertical blue and white line artifacts appear between the window border and the content. You'll see the bottom half of the window does not have these artifacts.

This only happens with vscodium (alacritty and firefox, for example, have no such issue).

Screenshot_20230606_172533

asahi-diagnose-20230606-173213.txt

marcan commented 1 year ago

> When using vscodium (without tiling the window to the side of the screen but putting the window on a side) and then moving the cursor up and down along the side of the window, vertical blue and white line artifacts appear between the window border and the content. You'll see the bottom half of the window does not have these artifacts.

Does that happen at 100% display scale? If not, it's a fractional scaling issue, unrelated to the GPU driver, as mentioned in the issue description.

artun42 commented 1 year ago

This site's background is broken with the latest edge driver (3.1), using the site's stock settings (best seen in the chillsynth station): https://nightride.fm

marcan commented 1 year ago

> Opening htop in konsole and then moving the selection around results in visual corruption (thin cyan lines)

That's fractional scaling, as explained in the issue description, not a GPU issue.

artun42 commented 1 year ago

this shader crashes firefox (latest opengl update to the driver, 3.1) https://www.shadertoy.com/view/4dK3zc

mkurz commented 1 year ago

https://www.google.com/maps and/or https://earth.google.com/ just crashed Chromium (I had both open in two tabs). Maps in particular is very sluggish and does not react well when moving the map around, zooming in/out, or activating the 3D satellite mode. Earth actually works quite well, but when I moved and zoomed around quite fast it crashed Chromium.

Here are the error logs: error.log (I think also some relevant things are logged at the end of that log)

According to the logs it's not just the gpu but maybe also a wayland related problem?

asahi-diagnose-20230608-012524.txt

janrinze commented 1 year ago

With latest built kernel and mesa (Glanzmann Debian build sources) WebGL ( https://threejs.org/examples/#webgl_animation_keyframes ) misbehaves in chromium browser on the Mac Studio Ultra. There are strange noisy blocks and the image does not get properly updated. Hopefully it is just my setup. Previous mesa version worked ok.

inodentry commented 1 year ago

https://github.com/bevyengine/bevy/issues/8790

asahi-diagnose-20230608-130400.txt

iaguis commented 1 year ago

Nautilus is very pink since the last update:

image

asahi-diagnose-20230608-192908.txt

Using sway.

janrinze commented 1 year ago

Rendering issues seem to be related to using 4K monitors. Is that what (most) people who report issues with the latest mesa update use?

mattnolan001 commented 1 year ago

I'm seeing graphics issues when using Emacs on linux-asahi-edge ('standard' installation on an M2 Mac, last updated yesterday, issues present before and after the recent edge update). I'm unsure whether they are GPU related but will post examples here in case they are helpful. Let me know if more information is useful or if I should post elsewhere.

Example. Rearranged text with colours (see bottom near the cursor). Screenshot_ScrambledText

Example after minimising and then maximising the window - text is now correctly arranged (compare last few lines of the window). Screenshot_AfterMaximize

dylanchapell commented 1 year ago

I have been experiencing instability on Firefox since the GPU driver update. Here is a video of the visual effects that have taken over Firefox twice in the last week. Colored blocks appearing randomly over the window, replacing elements. Scrambled text, etc. This time, after going on for ~30s Firefox crashed. I have also been experiencing random crashes of Firefox about once a day (without the visual distortion).

I am using a mostly stock setup with Wayland and KDE Plasma on Arch. I have also gone through the process to install Widevine DRM, so that could be interfering with Firefox. I have attached the results of asahi-diagnose, which I ran yesterday right after Firefox crashed. Let me know if there is any more information I can provide. Thank you.

Edit: I am on an M1 Air, btw.

asahi-diagnose-20230608-152014.txt

marcan commented 1 year ago

> I'm seeing graphics issues when using Emacs on linux-asahi-edge ('standard' installation on an M2 Mac, last updated yesterday, issues present before and after the recent edge update). I'm unsure whether they are GPU related but will post examples here in case helpful. Let me know if more information is useful or if I should post elsewhere.

Emacs does not use the GPU, it is a textmode app. You need to tell us about your terminal emulator and compositor (and other environment details like display scale) so we can reproduce it.

alexferro commented 1 year ago

Since the update to 6.3.0-asahi-7-1-edge (and the other packages from the same day), I've had a couple of instances of firefox (and maybe parts of KDE Shell) hang, and when I check logs, I see processes stuck inside of drm_release and asahi_queue/drop_in_place. Oddly enough, today's hang appears to be in a process that is barely alive such that I can only find it from the kernel log messages. Last time it still was listed as a defunct firefox process, although I have one of those too, but without the stack in drm_release.

As a note, I have been occasionally using the new sleep changes and use the battery limiter feature that is new as well in this update, but I know the machine did not go to sleep when this happened, and I think the battery state didn't change, if that's even relevant.

asahi-diagnose-20230610-231955.txt

cat /proc/107964/stack
[<0>] drm_sched_entity_kill.part.0+0x50/0x310
[<0>] drm_sched_entity_fini+0x20/0x12c
[<0>] drm_sched_entity_destroy+0x24/0x34
[<0>] _RINvNtCslf53ahwCktD_4core3ptr13drop_in_placeNtNtCshJ1PJqkpkmf_5asahi5queue13QueueG13V12_3EBK_+0xb0/0x26c [asahi]
[<0>] _RINvNtCslf53ahwCktD_4core3ptr13drop_in_placeNtNtCshJ1PJqkpkmf_5asahi4file4FileEBK_+0x138/0x170 [asahi]
[<0>] _RINvNtNtCs2F0HA5R6vfy_6kernel3drm4file18postclose_callbackNtNtCshJ1PJqkpkmf_5asahi4file4FileEBY_+0x1c/0x38 [asahi]
[<0>] drm_file_free+0x19c/0x238
[<0>] drm_release+0xb8/0x170
[... continues normally]

mattnolan001 commented 1 year ago

> > I'm seeing graphics issues when using Emacs on linux-asahi-edge ('standard' installation on an M2 Mac, last updated yesterday, issues present before and after the recent edge update). I'm unsure whether they are GPU related but will post examples here in case helpful. Let me know if more information is useful or if I should post elsewhere.

> Emacs does not use the GPU, it is a textmode app. You need to tell us about your terminal emulator and compositor (and other environment details like display scale) so we can reproduce it.

This level of OS detail is new to me, so apologies if anything is inaccurate. Let me know if anything else would help.

terminal emulator (output of echo $TERM): xterm-256color
display scale: 150% or 100% (issues present with either)
compositor: wl_compositor
window manager: Kwin, Xwayland mode

In the KDE Info Centre there is information about OpenGL (EGL), OpenGL (GLX) and Vulkan. The OpenGL entries don't show any errors. Under Vulkan there is the following:

ERROR: [Loader Message] Code 0 : vkCreateInstance: Found no drivers! Cannot create Vulkan instance. This problem is often caused by a faulty installation of the Vulkan driver or attempting to use a GPU that does not support Vulkan.
ERROR at /build/vulkan-tools/src/Vulkan-Tools-1.3.245/vulkaninfo/vulkaninfo.h:677:vkCreateInstance failed with ERROR_INCOMPATIBLE_DRIVER

asahilina commented 1 year ago

I think marcan thought you were running Emacs in a terminal, but I think you meant under X11...

So I had a suspicion this was another rendering loop (like the SuperTuxKart issue) but apparently apitrace doesn't like tracing the X server... but I added an assert to Mesa and scrolling in Emacs hits it, so I think that's it. I can't say for sure yet, but I think this is an Xorg Glamor bug...

Edit: Turns out it's a driver bug. We were advertising texture barriers but we shouldn't...

asahilina commented 1 year ago

> this site's background is broken with latest driver edge driver (3.1). using stock settings of the site. (best seen in chillsynth station) https://nightride.fm

I can't reproduce that on Asahi, it looks fine to me... but it actually looks broken on an X86 machine with a Radeon... so if there is a problem I think it's not in our driver ^^;;

asahilina commented 1 year ago

@iaguis

> Nautilus is very pink since the last update:
>
> asahi-diagnose-20230608-192908.txt
>
> Using sway.

I can't reproduce that with sway... ;;

Is there anything else unique about your setup to help reproduce it?

asahilina commented 1 year ago

> With latest built kernel and mesa (Glanzmann Debian build sources) WebGL ( https://threejs.org/examples/#webgl_animation_keyframes ) misbehaves in chromium browser on the Mac Studio Ultra. There are strange noisy blocks and the image does not get properly updated. Hopefully it is just my setup. Previous mesa version worked ok.

@janrinze I also can't reproduce that...

janrinze commented 1 year ago

@asahilina did you try to reproduce on a Mac Studio Ultra with a 4k monitor?

I have done a clean install with Arch Linux (default settings, update packages and install -edge kernel and mesa.)

asahilina commented 1 year ago

@janrinze I don't have a 4K monitor, but I tried on the Ultra making the Firefox window larger than the screen to try to get a similar rendering size and I don't see anything wrong...

janrinze commented 1 year ago

@asahilina thanks for testing. I will try to get my Ultra replaced. It must be some hardware issue.

SuperKenVery commented 1 year ago

I also have the pink problem and I can trigger it quite reliably:

java -jar -Dsun.java2d.opengl=true -Dsun.java2d.opengl.fbobject=true <whatever_java_gui_app>

The fbobject option should default to true so java -jar -Dsun.java2d.opengl=true rars1_6.jar should do the same.

It was fine before the last update. It is fine with -Dsun.java2d.opengl=false. I have seen the issue in non-java apps but it seems it's the easiest to trigger in java.

After triggering it in java other apps and mouse cursors sometimes get pink too.

And I have the GPU timeouts too!

asahi-diagnose-20230614-185320.txt

Tiny note: the time of the asahi-diagnose is in UTC+8.

iaguis commented 1 year ago

> @iaguis
>
> > Nautilus is very pink since the last update: asahi-diagnose-20230608-192908.txt Using sway.
>
> I can't reproduce that with sway... ;;
>
> Is there anything else unique about your setup to help reproduce it?

I created a new user and with default settings it looks fine. However, I can reproduce it if I start nautilus with themes "Adwaita-dark" or "Adwaita":

GTK_THEME=Adwaita-dark nautilus

Also, while debugging this I noticed I've been getting GPU timeouts when the pink stuff appears. Also, sometimes everything freezes although I can still SSH to the machine.

Logs

```
Jun 14 12:35:00 locke-m2 kernel: asahi 206400000.gpu: FWLog: AGFHALWaitForIdle is NOT Implemented for G14
Jun 14 12:35:00 locke-m2 kernel: asahi 206400000.gpu:  (\________/)
Jun 14 12:35:00 locke-m2 kernel: asahi 206400000.gpu:   |        |
Jun 14 12:35:00 locke-m2 kernel: asahi 206400000.gpu: '.| \  , / |.'
Jun 14 12:35:00 locke-m2 kernel: asahi 206400000.gpu: --| / (( \ |--
Jun 14 12:35:00 locke-m2 kernel: asahi 206400000.gpu: .'|  _-_-  |'.
Jun 14 12:35:00 locke-m2 kernel: asahi 206400000.gpu:   |________|
Jun 14 12:35:00 locke-m2 kernel: asahi 206400000.gpu: ** GPU timeout nya~!!!!! **
Jun 14 12:35:00 locke-m2 kernel: asahi 206400000.gpu:   Event slot: 3
Jun 14 12:35:00 locke-m2 kernel: asahi 206400000.gpu:   Timeout count: 0
Jun 14 12:35:00 locke-m2 kernel: asahi 206400000.gpu:   Fault info: FaultInfo { address: 0x60ff4e9dc0, sideband: 0x3c, vm_slot: 0x2, unit_code: 0x21, unit: UL1C( 0x2, ), level: 0x1, unk_5: 0x0, read: true, reason: Unmapped, }
Jun 14 12:35:00 locke-m2 kernel: asahi 206400000.gpu:   Pending events:
Jun 14 12:35:00 locke-m2 kernel: asahi 206400000.gpu:     [3] flags=7 value=0x26700
Jun 14 12:35:00 locke-m2 kernel: asahi 206400000.gpu:   Halt count: 1
Jun 14 12:35:00 locke-m2 kernel: asahi 206400000.gpu:   Halted: 1
Jun 14 12:35:00 locke-m2 kernel: asahi 206400000.gpu:   Attempting recovery...
Jun 14 12:35:00 locke-m2 kernel: asahi 206400000.gpu: FWLog: ERROR: PIO poll from agfPollFenderReg timeout after 250us [type:0 reg:0x10080 expected:0x0 got:0x3 max:250us], continue wait
Jun 14 12:35:00 locke-m2 kernel: asahi 206400000.gpu:  (\________/)
Jun 14 12:35:00 locke-m2 kernel: asahi 206400000.gpu:   |        |
Jun 14 12:35:00 locke-m2 kernel: asahi 206400000.gpu: '.| \  , / |.'
Jun 14 12:35:00 locke-m2 kernel: asahi 206400000.gpu: --| / (( \ |--
Jun 14 12:35:00 locke-m2 kernel: asahi 206400000.gpu: .'|  _-_-  |'.
Jun 14 12:35:00 locke-m2 kernel: asahi 206400000.gpu:   |________|
Jun 14 12:35:00 locke-m2 kernel: asahi 206400000.gpu: ** GPU timeout nya~!!!!! **
Jun 14 12:35:00 locke-m2 kernel: asahi 206400000.gpu:   Event slot: 3
Jun 14 12:35:00 locke-m2 kernel: asahi 206400000.gpu:   Timeout count: 1
Jun 14 12:35:00 locke-m2 kernel: asahi 206400000.gpu:   Fault info: FaultInfo { address: 0x60ff34ddc0, sideband: 0x3c, vm_slot: 0x2, unit_code: 0x1, unit: UL1C( 0x0, ), level: 0x1, unk_5: 0x0, read: true, reason: Unmapped, }
Jun 14 12:35:00 locke-m2 kernel: asahi 206400000.gpu:   Pending events:
Jun 14 12:35:00 locke-m2 kernel: asahi 206400000.gpu:     [3] flags=7 value=0x2a100
Jun 14 12:35:00 locke-m2 kernel: asahi 206400000.gpu:   Halt count: 2
Jun 14 12:35:00 locke-m2 kernel: asahi 206400000.gpu:   Halted: 1
Jun 14 12:35:00 locke-m2 kernel: asahi 206400000.gpu:   Attempting recovery...
Jun 14 12:35:00 locke-m2 kernel: asahi 206400000.gpu: FWLog: ERROR: PIO poll from agfPollFenderReg timeout after 250us [type:0 reg:0x10080 expected:0x0 got:0x3 max:250us], continue wait
Jun 14 12:35:33 locke-m2 kernel: asahi 206400000.gpu:  (\________/)
Jun 14 12:35:33 locke-m2 kernel: asahi 206400000.gpu:   |        |
Jun 14 12:35:33 locke-m2 kernel: asahi 206400000.gpu: '.| \  , / |.'
Jun 14 12:35:33 locke-m2 kernel: asahi 206400000.gpu: --| / (( \ |--
Jun 14 12:35:33 locke-m2 kernel: asahi 206400000.gpu: .'|  _-_-  |'.
Jun 14 12:35:33 locke-m2 kernel: asahi 206400000.gpu:   |________|
Jun 14 12:35:33 locke-m2 kernel: asahi 206400000.gpu: ** GPU timeout nya~!!!!! **
Jun 14 12:35:33 locke-m2 kernel: asahi 206400000.gpu:   Event slot: 11
Jun 14 12:35:33 locke-m2 kernel: asahi 206400000.gpu:   Timeout count: 2
Jun 14 12:35:33 locke-m2 kernel: asahi 206400000.gpu:   Fault info: FaultInfo { address: 0x60ff415dc0, sideband: 0x3c, vm_slot: 0x6, unit_code: 0x11, unit: UL1C( 0x1, ), level: 0x1, unk_5: 0x0, read: true, reason: Unmapped, }
Jun 14 12:35:33 locke-m2 kernel: asahi 206400000.gpu:   Pending events:
Jun 14 12:35:33 locke-m2 kernel: asahi 206400000.gpu:     [11] flags=7 value=0x26700
Jun 14 12:35:33 locke-m2 kernel: asahi 206400000.gpu:   Halt count: 3
Jun 14 12:35:33 locke-m2 kernel: asahi 206400000.gpu:   Halted: 1
Jun 14 12:35:33 locke-m2 kernel: asahi 206400000.gpu:   Attempting recovery...
Jun 14 12:35:33 locke-m2 kernel: asahi 206400000.gpu: FWLog: ERROR: PIO poll from agfPollFenderReg timeout after 250us [type:0 reg:0x10080 expected:0x0 got:0x3 max:250us], continue wait
Jun 14 12:35:33 locke-m2 kernel: asahi 206400000.gpu:  (\________/)
Jun 14 12:35:33 locke-m2 kernel: asahi 206400000.gpu:   |        |
Jun 14 12:35:33 locke-m2 kernel: asahi 206400000.gpu: '.| \  , / |.'
Jun 14 12:35:33 locke-m2 kernel: asahi 206400000.gpu: --| / (( \ |--
Jun 14 12:35:33 locke-m2 kernel: asahi 206400000.gpu: .'|  _-_-  |'.
Jun 14 12:35:33 locke-m2 kernel: asahi 206400000.gpu:   |________|
```

...

```
Jun 14 12:39:27 locke-m2 kernel: asahi 206400000.gpu: ** GPU timeout nya~!!!!! **
Jun 14 12:39:27 locke-m2 kernel: asahi 206400000.gpu:   Event slot: 27
Jun 14 12:39:27 locke-m2 kernel: asahi 206400000.gpu:   Timeout count: 168
Jun 14 12:39:27 locke-m2 kernel: asahi 206400000.gpu:   Fault info: FaultInfo { address: 0x60fd925d80, sideband: 0x3c, vm_slot: 0xe, unit_code: 0x21, unit: UL1C( 0x2, ), level: 0x1, unk_5: 0x0, read: true, reason: Unmapped, }
Jun 14 12:39:27 locke-m2 kernel: asahi 206400000.gpu:   Pending events:
Jun 14 12:39:27 locke-m2 kernel: asahi 206400000.gpu:     [27] flags=7 value=0x5a100
Jun 14 12:39:27 locke-m2 kernel: asahi 206400000.gpu:   Halt count: 169
Jun 14 12:39:27 locke-m2 kernel: asahi 206400000.gpu:   Halted: 1
Jun 14 12:39:27 locke-m2 kernel: asahi 206400000.gpu:   Attempting recovery...
Jun 14 12:39:27 locke-m2 kernel: asahi 206400000.gpu: FWLog: ERROR: PIO poll from agfPollFenderReg timeout after 250us [type:0 reg:0x10080 expected:0x0 got:0x3 max:250us], continue wait
Jun 14 12:39:27 locke-m2 kernel: asahi 206400000.gpu:  (\________/)
Jun 14 12:39:27 locke-m2 kernel: asahi 206400000.gpu:   |        |
Jun 14 12:39:27 locke-m2 kernel: asahi 206400000.gpu: '.| \  , / |.'
Jun 14 12:39:27 locke-m2 kernel: asahi 206400000.gpu: --| / (( \ |--
Jun 14 12:39:27 locke-m2 kernel: asahi 206400000.gpu: .'|  _-_-  |'.
Jun 14 12:39:27 locke-m2 kernel: asahi 206400000.gpu:   |________|
Jun 14 12:39:27 locke-m2 kernel: asahi 206400000.gpu: ** GPU timeout nya~!!!!! **
Jun 14 12:39:27 locke-m2 kernel: asahi 206400000.gpu:   Event slot: 27
Jun 14 12:39:27 locke-m2 kernel: asahi 206400000.gpu:   Timeout count: 169
Jun 14 12:39:27 locke-m2 kernel: asahi 206400000.gpu:   Fault info: FaultInfo { address: 0x60fd52dd80, sideband: 0x3c, vm_slot: 0xe, unit_code: 0x21, unit: UL1C( 0x2, ), level: 0x1, unk_5: 0x0, read: true, reason: Unmapped, }
Jun 14 12:39:27 locke-m2 kernel: asahi 206400000.gpu:   Pending events:
Jun 14 12:39:27 locke-m2 kernel: asahi 206400000.gpu:     [27] flags=7 value=0x5ab00
Jun 14 12:39:27 locke-m2 kernel: asahi 206400000.gpu:   Halt count: 170
Jun 14 12:39:27 locke-m2 kernel: asahi 206400000.gpu:   Halted: 1
Jun 14 12:39:27 locke-m2 kernel: asahi 206400000.gpu:   Attempting recovery...
Jun 14 12:39:27 locke-m2 kernel: asahi 206400000.gpu: FWLog: ERROR: PIO poll from agfPollFenderReg timeout after 250us [type:0 reg:0x10080 expected:0x0 got:0x3 max:250us], continue wait
Jun 14 12:39:28 locke-m2 kernel: asahi 206400000.gpu:  (\________/)
Jun 14 12:39:28 locke-m2 kernel: asahi 206400000.gpu:   |        |
Jun 14 12:39:28 locke-m2 kernel: asahi 206400000.gpu: '.| \  , / |.'
Jun 14 12:39:28 locke-m2 kernel: asahi 206400000.gpu: --| / (( \ |--
Jun 14 12:39:28 locke-m2 kernel: asahi 206400000.gpu: .'|  _-_-  |'.
Jun 14 12:39:28 locke-m2 kernel: asahi 206400000.gpu:   |________|
Jun 14 12:39:28 locke-m2 kernel: asahi 206400000.gpu: ** GPU timeout nya~!!!!! **
Jun 14 12:39:28 locke-m2 kernel: asahi 206400000.gpu:   Event slot: 25
Jun 14 12:39:28 locke-m2 kernel: asahi 206400000.gpu:   Timeout count: 170
Jun 14 12:39:28 locke-m2 kernel: asahi 206400000.gpu:   Fault info: FaultInfo { address: 0x1100000000, sideband: 0x3d, vm_slot: 0xd, unit_code: 0x81, unit: UL1C( 0x8, ), level: 0x1, unk_5: 0x0, read: true, reason: Unmapped, }
```

...

```
Jun 14 12:40:59 locke-m2 kernel: asahi 206400000.gpu: ** GPU timeout nya~!!!!! **
Jun 14 12:40:59 locke-m2 kernel: asahi 206400000.gpu:   Event slot: 41
Jun 14 12:40:59 locke-m2 kernel: asahi 206400000.gpu:   Timeout count: 171
Jun 14 12:40:59 locke-m2 kernel: asahi 206400000.gpu:   Fault info: FaultInfo { address: 0x60ff00ddc0, sideband: 0x3c, vm_slot: 0x15, unit_code: 0x81, unit: UL1C( 0x8, ), level: 0x1, unk_5: 0x0, read: true, reason: Unmapped, }
Jun 14 12:40:59 locke-m2 kernel: asahi 206400000.gpu:   Pending events:
Jun 14 12:40:59 locke-m2 kernel: asahi 206400000.gpu:     [41] flags=7 value=0x18600
Jun 14 12:40:59 locke-m2 kernel: asahi 206400000.gpu:   Halt count: 172
Jun 14 12:40:59 locke-m2 kernel: asahi 206400000.gpu:   Halted: 1
Jun 14 12:40:59 locke-m2 kernel: asahi 206400000.gpu:   Attempting recovery...
Jun 14 12:40:59 locke-m2 kernel: asahi 206400000.gpu: FWLog: ERROR: PIO poll from agfPollFenderReg timeout after 250us [type:0 reg:0x10080 expected:0x0 got:0x3 max:250us], continue wait
Jun 14 12:40:59 locke-m2 kernel: asahi 206400000.gpu:  (\________/)
Jun 14 12:40:59 locke-m2 kernel: asahi 206400000.gpu:   |        |
Jun 14 12:40:59 locke-m2 kernel: asahi 206400000.gpu: '.| \  , / |.'
Jun 14 12:40:59 locke-m2 kernel: asahi 206400000.gpu: --| / (( \ |--
Jun 14 12:40:59 locke-m2 kernel: asahi 206400000.gpu: .'|  _-_-  |'.
Jun 14 12:40:59 locke-m2 kernel: asahi 206400000.gpu:   |________|
Jun 14 12:40:59 locke-m2 kernel: asahi 206400000.gpu: ** GPU timeout nya~!!!!! **
Jun 14 12:40:59 locke-m2 kernel: asahi 206400000.gpu:   Event slot: 41
Jun 14 12:40:59 locke-m2 kernel: asahi 206400000.gpu:   Timeout count: 172
Jun 14 12:40:59 locke-m2 kernel: asahi 206400000.gpu:   Fault info: FaultInfo { address: 0x60ff46ddc0, sideband: 0x3c, vm_slot: 0x15, unit_code: 0x21, unit: UL1C( 0x2, ), level: 0x1, unk_5: 0x0, read: true, reason: Unmapped, }
Jun 14 12:40:59 locke-m2 kernel: asahi 206400000.gpu:   Pending events:
Jun 14 12:40:59 locke-m2 kernel: asahi 206400000.gpu:     [41] flags=7 value=0x1c000
Jun 14 12:40:59 locke-m2 kernel: asahi 206400000.gpu:   Halt count: 173
Jun 14 12:40:59 locke-m2 kernel: asahi 206400000.gpu:   Halted: 1
Jun 14 12:40:59 locke-m2 kernel: asahi 206400000.gpu:   Attempting recovery...
Jun 14 12:40:59 locke-m2 kernel: asahi 206400000.gpu: FWLog: ERROR: PIO poll from agfPollFenderReg timeout after 250us [type:0 reg:0x10080 expected:0x0 got:0x3 max:250us], continue wait
Jun 14 12:41:03 locke-m2 kernel: apple-dcp 231c00000.dcp: RTKit: syslog message: UPPipeDCP_H13P.cpp:7498: set_run_mode_safe: deferring: 4 -> 1
Jun 14 12:41:03 locke-m2 kernel: apple-dcp 231c00000.dcp: RTKit: syslog message: UPPipeDCP_H13P.cpp:7403: virtual IOMFBStatus IOMFB::UPPipeDCP_H13P::ready_for_run_mode_change(IOMFB::AppleRegisterStream *): initiating deferred run mod
Jun 14 12:41:03 locke-m2 kernel: apple-dcp 231c00000.dcp: RTKit: syslog message: UPTSQManager.cpp:150: IOMFBStatus IOMFB::UPTSQManager::power_down_M3(IOMFB::AppleRegisterStream *, UPTSQManager::ModeChangeWaiter *): request mode ch
Jun 14 12:41:03 locke-m2 kernel: apple-dcp 231c00000.dcp: RTKit: syslog message: UPTSQManager.cpp:162: IOMFBStatus IOMFB::UPTSQManager::power_down_M3(IOMFB::AppleRegisterStream *, UPTSQManager::ModeChangeWaiter *): disabling M3
Jun 14 12:41:03 locke-m2 kernel: apple-dcp 231c00000.dcp: RTKit: syslog message: UPPipeDCP_H13P.cpp:7476: set_run_mode_safe: no need to defer: 1 -> 0
Jun 14 12:41:03 locke-m2 kernel: apple-dcp 231c00000.dcp: RTKit: syslog message: eoInterfaceIOAV.cpp:402: IOMFB: VideoInterfaceIOAV::did_power_off: m_power_ctrl->setPower( 0 )
Jun 14 12:41:03 locke-m2 kernel: apple-dcp 231c00000.dcp: RTKit: syslog message: eoInterfaceIOAV.cpp:142: IOMFB: IOAVVideoInterface terminated
Jun 14 12:41:03 locke-m2 kernel: apple-dcp 231c00000.dcp: RTKit: syslog message: eoInterfaceIOAV.cpp:916: void VideoInterfaceIOAV::unplug_gated(IOAVVideoInterface *): display HPD removed ioav=0xffffffff40e6fc20
Jun 14 12:41:03 locke-m2 kernel: apple-dcp 231c00000.dcp: RTKit: syslog message: DCPDPDevice.cpp:3360: [AFK]registering callback 0xffffffff40e362d8 saved as 0xffffffff40e67e80
Jun 14 12:41:03 locke-m2 kernel: apple-dcp 231c00000.dcp: dcp_poweroff() done
Jun 14 12:41:03 locke-m2 kernel: apple-dcp 231c00000.dcp: dcp_poweron() starting
Jun 14 12:41:03 locke-m2 kernel: apple-dcp 231c00000.dcp: RTKit: syslog message: UPPipeDCP_H13P.cpp:7476: set_run_mode_safe: no need to defer: 0 -> 1
Jun 14 12:41:03 locke-m2 kernel: apple-dcp 231c00000.dcp: RTKit: syslog message: ock_Tunables_v1.cpp:377: IOMFB: Writing tunables with target 3
Jun 14 12:41:03 locke-m2 kernel: apple-dcp 231c00000.dcp: RTKit: syslog message: UPPipeDCP_H13P.cpp:7476: set_run_mode_safe: no need to defer: 1 -> 2
Jun 14 12:41:03 locke-m2 kernel: apple-dcp 231c00000.dcp: RTKit: syslog message: eoInterfaceIOAV.cpp:225: bool VideoInterfaceIOAV::open_ioav_gated(): IOAVVideoInterface open failed
Jun 14 12:41:03 locke-m2 kernel: apple-dcp 231c00000.dcp: RTKit: syslog message: eoInterfaceIOAV.cpp:284: IOMFB: VideoInterfaceIOAV::power_on: m_power_ctrl->setPower( 1 )
Jun 14 12:41:03 locke-m2 kernel: apple-dcp 231c00000.dcp: RTKit: syslog message: AppleDCPDPTX.cpp:361: [AFK]powering nub 0x83a9a0
Jun 14 12:41:03 locke-m2 kernel: apple-dcp 231c00000.dcp: set_digital_out_mode(color:1 timing:2)
Jun 14 12:41:03 locke-m2 kernel: apple-dcp 231c00000.dcp: RTKit: syslog message: nifiedPipeline.cpp:6553: set_digital_out_mode returned 8000000b
Jun 14 12:41:03 locke-m2 kernel: apple-dcp 
231c00000.dcp: RTKit: syslog message: ileCDIDPDisplay.cpp:119: [AFK]SAC enable: DP=1 CDI=1 Jun 14 12:41:03 locke-m2 kernel: apple-dcp 231c00000.dcp: RTKit: syslog message: DCPDP13Service.cpp:55: [AFK]version: 13 Jun 14 12:41:03 locke-m2 kernel: apple-dcp 231c00000.dcp: RTKit: syslog message: DCPDPService.cpp:69: [AFK]version: 13 Jun 14 12:41:03 locke-m2 kernel: apple-dcp 231c00000.dcp: RTKit: syslog message: DCPDPDevice.cpp:3360: [AFK]registering callback 0xffffffff40f7b298 saved as 0xffffffff40e1bf90 Jun 14 12:41:03 locke-m2 kernel: apple-dcp 231c00000.dcp: RTKit: syslog message: ctVideoInterface.cpp:50: [AFK]creating SimpleVideoInterface (0xffffffff40e46c80) Jun 14 12:41:03 locke-m2 kernel: apple-dcp 231c00000.dcp: RTKit: syslog message: tVideoInterface.cpp:143: [AFK]DCPAVSimpleVideoInterface (0xffffffff40e46c80) Jun 14 12:41:03 locke-m2 kernel: apple-dcp 231c00000.dcp: RTKit: syslog message: eoInterfaceIOAV.cpp:131: IOMFB: IOAVVideoInterface published Jun 14 12:41:03 locke-m2 kernel: apple-dcp 231c00000.dcp: RTKit: syslog message: eoInterfaceIOAV.cpp:558: IOMFBStatus VideoInterfaceIOAV::plug_gated(IOAVVideoInterface *): display HPD asserted ioav=0xffffffff40e46c80 Jun 14 12:41:03 locke-m2 kernel: apple-dcp 231c00000.dcp: RTKit: syslog message: UPPipeDCP_H13P.cpp:7498: set_run_mode_safe: deferring: 2 -> 4 Jun 14 12:41:03 locke-m2 kernel: apple-dcp 231c00000.dcp: RTKit: syslog message: UPPipeDCP_H13P.cpp:7403: virtual IOMFBStatus IOMFB::UPPipeDCP_H13P::ready_for_run_mode_change(IOMFB::AppleRegisterStream *): initiating deferred run mod Jun 14 12:41:03 locke-m2 kernel: apple-dcp 231c00000.dcp: RTKit: syslog message: M3Hal_v1.cpp:196: IOMFB: load APT M3 IMem : size 0x71d4 Jun 14 12:41:03 locke-m2 kernel: apple-dcp 231c00000.dcp: RTKit: syslog message: M3Hal_v1.cpp:196: IOMFB: load APT M3 DMem : size 0x6e04 Jun 14 12:41:03 locke-m2 kernel: apple-dcp 231c00000.dcp: RTKit: syslog message: CAHandler.cpp:163: IOMFB load_ca_data: Unrecognized data version 0 (expected 1 or 
2) Jun 14 12:41:03 locke-m2 kernel: apple-dcp 231c00000.dcp: RTKit: syslog message: UPPipe.cpp:3456: IOMFB read_pmu_data_sync: pmu ram read error (e00800d8) Jun 14 12:41:03 locke-m2 kernel: apple-dcp 231c00000.dcp: RTKit: syslog message: PPipeDCP_H13P.cpp:10100: IOMFB Int RTBandwidth: program_M3_rt_config: Using Legacy, scratch 23b730014 Jun 14 12:41:03 locke-m2 kernel: apple-dcp 231c00000.dcp: RTKit: syslog message: PPipeDCP_H13P.cpp:10120: IOMFB Int RTBandwidth: using legacy. Doorbell 23bc3c000, bit 2 Jun 14 12:41:03 locke-m2 kernel: apple-dcp 231c00000.dcp: RTKit: syslog message: UPTSQManager.cpp:133: IOMFB: clearing M3 reset Jun 14 12:41:03 locke-m2 kernel: apple-dcp 231c00000.dcp: RTKit: syslog message: UPTSQ_Hal_v1.cpp:250: IOMFB: timebase_offset = 4 Jun 14 12:41:03 locke-m2 kernel: apple-dcp 231c00000.dcp: RTKit: syslog message: UPTSQManager.cpp:105: IOMFB: switch to normal mode succeeded Jun 14 12:41:03 locke-m2 kernel: apple-dcp 231c00000.dcp: RTKit: syslog message: ock_PCC2DLTM_v1.cpp:796: IOMFBStatus IOMFB::UPBlock_PCC2DLTM_v1::set_mcpu_power(IOMFB::AppleRegisterStream *, bool) Loading M3 Hal Jun 14 12:41:03 locke-m2 kernel: apple-dcp 231c00000.dcp: RTKit: syslog message: M3Hal_v1.cpp:196: IOMFB: load LTM M3 IMem : size 0x5894 Jun 14 12:41:03 locke-m2 kernel: apple-dcp 231c00000.dcp: RTKit: syslog message: M3Hal_v1.cpp:196: IOMFB: load LTM M3 DMem : size 0xbd24 Jun 14 12:42:10 locke-m2 kernel: sched: RT throttling activated ``` ``` Jun 14 12:49:10 locke-m2 kernel: INFO: task sway:disk$0:2655 blocked for more than 122 seconds. Jun 14 12:49:10 locke-m2 kernel: Tainted: G S 6.3.0-asahi-7-1-edge-ARCH #2 Jun 14 12:49:10 locke-m2 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
Jun 14 12:49:10 locke-m2 kernel: task:sway:disk$0 state:D stack:0 pid:2655 ppid:2636 flags:0x0000000c Jun 14 12:49:10 locke-m2 kernel: Call trace: Jun 14 12:49:10 locke-m2 kernel: __switch_to+0xc4/0x118 Jun 14 12:49:10 locke-m2 kernel: __schedule+0x230/0x634 Jun 14 12:49:10 locke-m2 kernel: schedule+0x58/0xf0 Jun 14 12:49:10 locke-m2 kernel: schedule_timeout+0xe8/0xf4 Jun 14 12:49:10 locke-m2 kernel: wait_for_completion+0xbc/0x15c Jun 14 12:49:10 locke-m2 kernel: drm_sched_entity_kill.part.0+0x50/0x310 Jun 14 12:49:10 locke-m2 kernel: drm_sched_entity_fini+0x20/0x12c Jun 14 12:49:10 locke-m2 kernel: drm_sched_entity_destroy+0x24/0x34 Jun 14 12:49:10 locke-m2 kernel: _RINvNtCslf53ahwCktD_4core3ptr13drop_in_placeNtNtCshJ1PJqkpkmf_5asahi5queue13QueueG14V12_4EBK_+0xb0/0x26c [asahi] Jun 14 12:49:10 locke-m2 kernel: _RINvNtCslf53ahwCktD_4core3ptr13drop_in_placeNtNtCshJ1PJqkpkmf_5asahi4file4FileEBK_+0x138/0x170 [asahi] Jun 14 12:49:10 locke-m2 kernel: _RINvNtNtCs2F0HA5R6vfy_6kernel3drm4file18postclose_callbackNtNtCshJ1PJqkpkmf_5asahi4file4FileEBY_+0x1c/0x38 [asahi] Jun 14 12:49:10 locke-m2 kernel: drm_file_free+0x19c/0x238 Jun 14 12:49:10 locke-m2 kernel: drm_release+0xb8/0x170 Jun 14 12:49:10 locke-m2 kernel: __fput+0x78/0x25c Jun 14 12:49:10 locke-m2 kernel: ____fput+0x10/0x1c Jun 14 12:49:10 locke-m2 kernel: task_work_run+0x7c/0xd8 Jun 14 12:49:10 locke-m2 kernel: do_exit+0x1a4/0x510 Jun 14 12:49:10 locke-m2 kernel: do_group_exit+0x34/0x90 Jun 14 12:49:10 locke-m2 kernel: get_signal+0x734/0x764 Jun 14 12:49:10 locke-m2 kernel: do_signal+0x7c/0x1d8 Jun 14 12:49:10 locke-m2 kernel: do_notify_resume+0xc0/0x1cc Jun 14 12:49:10 locke-m2 kernel: el0_svc+0x110/0x11c Jun 14 12:49:10 locke-m2 kernel: el0t_64_sync_handler+0xf4/0x120 Jun 14 12:49:10 locke-m2 kernel: el0t_64_sync+0x190/0x194 ... Jun 14 12:51:13 locke-m2 kernel: INFO: task sway:disk$0:2655 blocked for more than 245 seconds. 
Jun 14 12:51:13 locke-m2 kernel: Tainted: G S 6.3.0-asahi-7-1-edge-ARCH #2 Jun 14 12:51:13 locke-m2 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Jun 14 12:51:13 locke-m2 kernel: task:sway:disk$0 state:D stack:0 pid:2655 ppid:2636 flags:0x0000000c Jun 14 12:51:13 locke-m2 kernel: Call trace: Jun 14 12:51:13 locke-m2 kernel: __switch_to+0xc4/0x118 Jun 14 12:51:13 locke-m2 kernel: __schedule+0x230/0x634 Jun 14 12:51:13 locke-m2 kernel: schedule+0x58/0xf0 Jun 14 12:51:13 locke-m2 kernel: schedule_timeout+0xe8/0xf4 Jun 14 12:51:13 locke-m2 kernel: wait_for_completion+0xbc/0x15c Jun 14 12:51:13 locke-m2 kernel: drm_sched_entity_kill.part.0+0x50/0x310 Jun 14 12:51:13 locke-m2 kernel: drm_sched_entity_fini+0x20/0x12c Jun 14 12:51:13 locke-m2 kernel: drm_sched_entity_destroy+0x24/0x34 Jun 14 12:51:13 locke-m2 kernel: _RINvNtCslf53ahwCktD_4core3ptr13drop_in_placeNtNtCshJ1PJqkpkmf_5asahi5queue13QueueG14V12_4EB> Jun 14 12:51:13 locke-m2 kernel: _RINvNtCslf53ahwCktD_4core3ptr13drop_in_placeNtNtCshJ1PJqkpkmf_5asahi4file4FileEBK_+0x138/0x> Jun 14 12:51:13 locke-m2 kernel: _RINvNtNtCs2F0HA5R6vfy_6kernel3drm4file18postclose_callbackNtNtCshJ1PJqkpkmf_5asahi4file4Fil> Jun 14 12:51:13 locke-m2 kernel: drm_file_free+0x19c/0x238 Jun 14 12:51:13 locke-m2 kernel: drm_release+0xb8/0x170 Jun 14 12:51:13 locke-m2 kernel: __fput+0x78/0x25c Jun 14 12:51:13 locke-m2 kernel: ____fput+0x10/0x1c Jun 14 12:51:13 locke-m2 kernel: task_work_run+0x7c/0xd8 Jun 14 12:51:13 locke-m2 kernel: do_exit+0x1a4/0x510 Jun 14 12:51:13 locke-m2 kernel: do_group_exit+0x34/0x90 Jun 14 12:51:13 locke-m2 kernel: get_signal+0x734/0x764 Jun 14 12:51:13 locke-m2 kernel: do_signal+0x7c/0x1d8 Jun 14 12:51:13 locke-m2 kernel: do_notify_resume+0xc0/0x1cc Jun 14 12:51:13 locke-m2 kernel: el0_svc+0x110/0x11c Jun 14 12:51:13 locke-m2 kernel: el0t_64_sync_handler+0xf4/0x120 Jun 14 12:51:13 locke-m2 kernel: el0t_64_sync+0x190/0x194 ```
mattnolan001 commented 1 year ago

I think marcan thought you were running Emacs in a terminal, but I think you meant under X11...

So I had a suspicion this was another rendering loop (like the SuperTuxKart issue) but apparently apitrace doesn't like tracing the X server... but I added an assert to Mesa and scrolling in Emacs hits it, so I think that's it. I can't say for sure yet, but I think this is an Xorg Glamor bug...

Edit: Turns out it's a driver bug. We were advertising texture barriers but we shouldn't...

Great, thanks for checking it out! Will there be a fix? If it's not an Asahi issue, can you give some more pointers on how to fix it?

asahilina commented 1 year ago

@janrinze OMG no, we haven't eliminated the driver yet! Sorry, I wasn't paying attention and I had the wrong environment... I reproduced it now, looks like it's a clustering issue on Pro/Max/Ultra machines. For now you can run with ASAHI_MESA_DEBUG=nocluster to work around it ^^
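For anyone unsure how to apply that workaround, here is a minimal sketch. The only thing taken from this thread is the `ASAHI_MESA_DEBUG=nocluster` setting; the `run_nocluster` wrapper name is made up for illustration, and it assumes the variable only needs to be set in the environment of the app you launch.

```shell
#!/bin/sh
# Hypothetical helper (not part of any Asahi tooling): run one command with
# ASAHI_MESA_DEBUG=nocluster set only for that command, so the rest of the
# session keeps its normal environment.
# Note: this replaces any ASAHI_MESA_DEBUG value that was already set.
run_nocluster() {
    env ASAHI_MESA_DEBUG=nocluster "$@"
}

# Example usage (assumes the app is installed):
#   run_nocluster emacs
```

If the glitch goes away when the app is started this way, that is worth mentioning in your report along with the asahi-diagnose output.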

janrinze commented 1 year ago

@asahilina : Thanks for double checking! Using ASAHI_MESA_DEBUG=nocluster works! So relieved that it's not hardware issues.

asahilina commented 1 year ago

this shader crashes firefox (latest opengl update to the driver, 3.1) https://www.shadertoy.com/view/4dK3zc

@artun42 This is a missing feature in the driver (register spilling). Basically, that shader is too complicated to work right now.

asahilina commented 1 year ago

The Java stuff looks like Java bugs. If you run it with LIBGL_ALWAYS_SOFTWARE=true you still get black regions instead of magenta and sometimes broken window contents. With asahi it shows up magenta, since that is the error color for uninitialized compressed buffers on Apple GPUs. You can get black instead with ASAHI_MESA_DEBUG=nocompress, but it won't fix anything, it'll just make it black instead of magenta (and may make some glitches less bad...)

I tried running both Xwayland and Java with software rendering and still got weird redraw bugs and sometimes a persistently broken window with Java OpenGL, so at this point I'm pretty confident it has nothing to do with our driver. It could be an upstream Mesa regression too...
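The same triage can be repeated for any app that shows magenta. Below is a rough sketch, assuming a POSIX shell; `run_mode` is a hypothetical helper name, and the two environment variables are the ones named in this thread.

```shell
#!/bin/sh
# Hypothetical helper: run a command in one of three rendering modes so the
# results can be compared side by side.
#   default    - normal driver behaviour
#   nocompress - ASAHI_MESA_DEBUG=nocompress (magenta should turn black)
#   software   - LIBGL_ALWAYS_SOFTWARE=true (software rendering)
run_mode() {
    mode=$1
    shift
    case "$mode" in
        default)    "$@" ;;
        nocompress) env ASAHI_MESA_DEBUG=nocompress "$@" ;;
        software)   env LIBGL_ALWAYS_SOFTWARE=true "$@" ;;
        *)          echo "unknown mode: $mode" >&2; return 2 ;;
    esac
}

# Example usage (replace "myapp" with the affected program):
#   run_mode default myapp
#   run_mode nocompress myapp
#   run_mode software myapp
```

Reading the results: if the magenta regions turn black with `nocompress` and stay black with `software`, it is likely an app bug or an upstream Mesa bug. If `nocompress` actually fixes the rendering, please report it, since that could still be a driver bug.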

asahilina commented 1 year ago

I've updated the OP to explain what the story with magenta glitches is and what you can do to help us figure out if it's an app bug or a driver bug ^^

marcan commented 1 year ago

Fixes for the Emacs, three.js, and Darwinia issues are now available in the pacman repo.

psanford commented 1 year ago

I just tried the latest mesa driver. It does indeed fix the emacs issue :+1: . However I am now seeing some rendering issues in chromium where a lot of fonts and images are not appearing:

Screenshot_2023-06-15_08-52-27

jannau commented 1 year ago

I just tried the latest mesa driver. It does indeed fix the emacs issue +1 . However I am now seeing some rendering issues in chromium where a lot of fonts and images are not appearing:

Try deleting chromium's shader cache ~/.config/chromium/Default/GPUCache. See this upstream bug
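A small sketch of that cleanup, assuming the default profile path mentioned above and that Chromium is closed first (the second part is an assumption, not something stated in this thread). The `clear_gpu_cache` function name is hypothetical; pass a different directory as the first argument if your profile lives elsewhere.

```shell
#!/bin/sh
# Hypothetical helper: remove Chromium's GPU shader cache directory.
# Defaults to the path given in this thread; an alternative directory can be
# passed as the first argument.
clear_gpu_cache() {
    cache_dir="${1:-$HOME/.config/chromium/Default/GPUCache}"
    if [ -d "$cache_dir" ]; then
        rm -rf -- "$cache_dir"
        echo "removed $cache_dir"
    else
        echo "no cache directory at $cache_dir"
    fi
}

# Example usage (close Chromium first):
#   clear_gpu_cache
```

The directory should be recreated the next time the browser starts, so nothing else needs to be restored by hand.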

mkurz commented 1 year ago
  • [ ] Window corruption / magenta regions with Java OpenGL rendering enabled (also happens with software rendering, except then it's black)

If that is a bug in Java, is anyone aware of an upstream bug (I could not find one)? If not, maybe @marcan or @asahilina should open a new bug report: https://bugreport.java.com/bugreport/ (Since you are the experts...)

janrinze commented 1 year ago

Hope this doesn't sound pedantic but Emacs, three.js, and Darwinia were not changed. The asahi kernel and asahi-mesa have been updated to solve bugs that became apparent when using Emacs, three.js, and Darwinia.
Kudos to those who have been working hard to get these bugs fixed!

Please be aware that there are more bugs that require attention. Most of them are benign and don't lock up the desktop or make the kernel hang (this did happen with the three.js page on several occasions). There is definitely a need to keep testing (on a broad range of M1/M2 platforms).

We know that asahi-linux and asahi-mesa are in an early development stage and we can't expect them to be flawless. There are many people involved, each doing their best to produce bug-free code. Being human means we can't be expected to be perfect, and as such our best efforts will still produce bugs in code at times. The user base for asahi-linux is growing steadily and thus the number of bugs discovered will be growing steadily. That's just how that works. How to deal with more users and thus more reports will be an interesting issue to tackle.

For now there are lots of challenges left for both teams and hopefully we will see more announcements such as the OpenGL 3.1 one. (Dual monitor support, anyone? ;-) )