intel / media-driver

Intel Graphics Media Driver to support hardware decode, encode and video processing.
https://github.com/intel/media-driver/wiki

[Bug]: [drm] *ERROR* CPU pipe A FIFO underrun port,transcoder Intel ARC #1750

Open nzrf opened 9 months ago

nzrf commented 9 months ago

Which component impacted?

Intel ARC A750 and related System

Is it regression? Good in old configuration?

None

What happened?

The system hangs and becomes unresponsive, and no video is displayed anymore. The last thing that appears in the log every time is:

i915 0000:03:00.0: [drm] *ERROR* CPU pipe A FIFO underrun: port,transcoder,

This happens randomly, but not very often (roughly once every week to 30 days). I am unsure what puts the system into this state.

The system is a fully patched Fedora 38.

This has happened with a number of kernels and kernel options, and things still seem to be becoming unstable. I am currently running with the options "i915.force_probe=56a1 i915.enable_psr=0". They can be turned off at any time, but the hang has happened both with and without them, so I may as well remove them. I thought they had resolved it, but I was wrong.

The system has the following Intel packages installed:

intel-media-driver-23.1.6-1.fc38.x86_64
intel-gmmlib-22.3.12-1.fc38.x86_64
intel-gpu-firmware-20231111-1.fc38.noarch

What's the usage scenario when you are seeing the problem?

Transcode for media delivery

What impacted?

A server hosting Plex Media Server for local transcoding services becomes unresponsive and requires a hard reboot.

Debug Information

ls /dev/dri
by-path card1 renderD128

lspci -nn | grep -Ei 'VGA|DISPLAY'
03:00.0 VGA compatible controller [0300]: Intel Corporation DG2 [Arc A750] [8086:56a1] (rev 08)

vainfo.log dmesg.log

Do you want to contribute a patch to fix the issue?

None

nzrf commented 9 months ago

Just as a note, I swapped the kernel options from

i915.force_probe=56a1 i915.enable_psr=0

to the following, hoping to catch a bit more information:

i915.force_probe=56a1 i915.verbose_state_checks=1 drm.debug=0xe

UPDATE: Nothing really new: same behavior and no additional information. The last log entry before the system hangs is:
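For anyone comparing option sets like these, it can help to confirm which parameters the running kernel actually booted with. The following is only a minimal Python sketch: the helper names `parse_cmdline` and `read_cmdline` are made up for illustration, and it assumes a Linux system that exposes the boot command line at /proc/cmdline. It does not handle quoted values containing spaces.

```python
def parse_cmdline(text):
    """Split a kernel command line into a {name: value} dict.

    Bare flags without '=' map to None. Quoted values that contain
    spaces are NOT handled; this is a deliberately simple sketch.
    """
    params = {}
    for token in text.split():
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


def read_cmdline(path="/proc/cmdline"):
    """Read and parse the command line of the running kernel."""
    with open(path) as f:
        return parse_cmdline(f.read())
```

Calling `read_cmdline().get("i915.enable_psr")` would then show whether that option was really active for the boot in which a hang occurred.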

2023-12-27 08:44:26.343 | i915 0000:03:00.0: [drm] *ERROR* CPU pipe A FIFO underrun: port,transcoder,

If I should use other kernel options or methods to gather anything else that is needed, let me know.
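Since the hang is rare, one way to gather more data is to scan saved kernel logs for the underrun message and note which device and pipe reported it. Below is a rough Python sketch, not a tool from the driver project: the name `find_underruns` is hypothetical, and the regular expression is based only on the log lines quoted in this thread, so other kernel versions may format the message differently.

```python
import re

# Pattern derived from the log lines quoted in this issue (an assumption,
# not an official format): "i915 <pci-address>: [drm] *ERROR* CPU pipe <X> FIFO underrun"
UNDERRUN = re.compile(r"i915 (\S+): \[drm\] \*ERROR\* CPU pipe (\w+) FIFO underrun")


def find_underruns(lines):
    """Return a list of (pci_address, pipe) tuples for each underrun line."""
    hits = []
    for line in lines:
        match = UNDERRUN.search(line)
        if match:
            hits.append((match.group(1), match.group(2)))
    return hits
```

The input can be any iterable of text lines, for example the contents of a saved dmesg or journalctl dump.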

filipmnowak commented 7 months ago

got same error, and also gnome-shell segfault:

...
i915 0000:00:02.0: [drm] *ERROR* Failed to read DPCD register 0x92
i915 0000:00:02.0: [drm] *ERROR* Failed to read DPCD register 0x92
gnome-shell[6747]: segfault at 8 ip 00007f7f3eb54b4c sp 00007ffee7b06df0 error 4 in libmutter-12.so.0.0.0[7f7f3ea4e000+157000] likely on CPU 2 (core 4, socket 0)
i915 0000:00:02.0: [drm] *ERROR* CPU pipe A FIFO underrun: port,transcoder,
...

CPU/GPU:

00:00.0 Host bridge: Intel Corporation Raptor Lake-P/U 2p+8e cores Host Bridge/DRAM Controller (rev 01)
00:02.0 VGA compatible controller: Intel Corporation Raptor Lake-P [Iris Xe Graphics] (rev 04)

dmesg_2024-01-26.txt lspci_2024-01-26.txt

nzrf commented 7 months ago

Just an update here, at least on my own issue: I turned off C-states in the motherboard BIOS and that resolved the issue.

I have also had no issues over the last week on a new ASUS W680 motherboard with a new CPU and the Arc GPU.

The original system was an i8400 on a Supermicro C7Z370-CG-L Gaming board; after disabling C-states there, I have had no more issues.

filipmnowak commented 7 months ago

> Just an update here at least on my own issue. Turned off the cstates on the motherboards and resolved the issue. (...)

thanks for a hint, will try it out!

MartinX3 commented 6 months ago

It has suddenly also started appearing on my ThinkPad T460P with a Skylake iGPU (i7-6820HQ with Intel HD Graphics 530). I'll try the C-state setting, too.

filipmnowak commented 6 months ago

> Just an update here at least on my own issue. Turned off the cstates on the motherboards and resolved the issue.
>
> I also have not had issue the last week in new ASUS motherboard W680 with a new CPU with the ARC GPU and no issues.
>
> Original system was i8400 with Supermicro C7Z370-CG-L Gaming after disable cstates and no more issues.

With these settings the issues are pretty much gone (thanks again for the hints):

i915.enable_guc=3 i915.enable_fbc=1 i915.modeset=1 i915.error_capture intel_idle.max_cstate=1

(intel_idle.max_cstate=1 is probably what fixes it for me)
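To double-check that the C-state limit really took effect after a reboot, the value can be read back from sysfs. This is a small Python sketch under the assumption that the intel_idle driver is loaded and exposes its parameter at /sys/module/intel_idle/parameters/max_cstate; the function name `read_max_cstate` is just illustrative. If a different idle driver is in use, the file may not exist and the function returns None.

```python
from pathlib import Path


def read_max_cstate(path="/sys/module/intel_idle/parameters/max_cstate"):
    """Return the intel_idle max_cstate value as an int, or None if unavailable."""
    param = Path(path)
    if not param.is_file():
        # intel_idle is probably not the active idle driver on this system.
        return None
    return int(param.read_text().strip())
```

With intel_idle.max_cstate=1 on the kernel command line, the expectation is that this returns 1.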

i am not using a laptop, and i really don't care about most of CPU power saving modes, but still - those problems are kind of a bummer.

sr-tamim commented 5 months ago

I have been facing a screen blinking issue for the past week and couldn't trace it to anything. I was preparing to reinstall my system. Today, while the screen was blinking, I checked journalctl and found the FIFO underrun error. I will try the max_cstate fix. But are there any side effects of this setting?

Device: Thinkpad P52
GPU: Nvidia Quadro P1000 (deactivated), Intel UHD Graphics (active)
Processor: Intel i7-8850H
OS: Pop OS 22.01

filipmnowak commented 4 months ago

> I will try max_cstate fix. But is there any side-effects of this settings?

take a look here. your laptop might consume more energy when idle.

anszom commented 3 months ago

I've been experiencing the same freeze with some regularity. I've recently added intel_idle.max_cstate=1 to cmdline, but the bug occurred again. There was one new message though, something about changing power states.

[1140850.781238] i915 0000:19:00.0: [drm] *ERROR* CPU pipe A FIFO underrun: port,transcoder,
[1140892.092450] pcieport 0000:18:04.0: Unable to change power state from D3hot to D0, device inaccessible
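As a quick worked example, the gap between these two messages can be computed from the bracketed timestamps. The Python sketch below assumes they are the usual dmesg values in seconds; the helper name `timestamp` is made up for illustration, and Decimal is used only to avoid floating-point rounding noise.

```python
import re
from decimal import Decimal

# Leading "[ seconds.microseconds]" stamp as printed by dmesg.
STAMP = re.compile(r"^\[\s*(\d+\.\d+)\]")


def timestamp(line):
    """Return the leading dmesg timestamp of a line as a Decimal, or None."""
    match = STAMP.match(line)
    return Decimal(match.group(1)) if match else None


underrun = "[1140850.781238] i915 0000:19:00.0: [drm] *ERROR* CPU pipe A FIFO underrun: port,transcoder,"
power = "[1140892.092450] pcieport 0000:18:04.0: Unable to change power state from D3hot to D0, device inaccessible"

gap = timestamp(power) - timestamp(underrun)
print(gap)  # 41.311212
```

So the pcieport message was logged roughly 41 seconds after the FIFO underrun.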

relevant lspci output:

17:00.0 PCI bridge: Intel Corporation Device 4fa0 (rev 01)
18:01.0 PCI bridge: Intel Corporation Device 4fa4
18:04.0 PCI bridge: Intel Corporation Device 4fa4
19:00.0 VGA compatible controller: Intel Corporation Device 56a1 (rev 08)
1a:00.0 Audio device: Intel Corporation Device 4f90
 +-[0000:16]-+-00.0-[17-1a]----00.0-[18-1a]--+-01.0-[19]----00.0
 |           |                               \-04.0-[1a]----00.0

anszom commented 3 months ago

I got the freeze yet again. I have intel_idle.max_cstate=1 set AND C-states disabled in a BIOS setting, so I'm pretty sure that disabling C-states is not a universal workaround.
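For anyone who wants to verify which idle states the kernel still exposes with both settings in place, the cpuidle sysfs tree can be listed. This is only a sketch in Python: it assumes the standard layout under /sys/devices/system/cpu/cpu0/cpuidle with per-state `name` and `disable` files, and the function name `list_idle_states` is hypothetical.

```python
from pathlib import Path


def list_idle_states(cpuidle_dir="/sys/devices/system/cpu/cpu0/cpuidle"):
    """Return [(state_name, is_disabled), ...] ordered by state index.

    Returns an empty list if the directory does not exist.
    """
    root = Path(cpuidle_dir)
    if not root.is_dir():
        return []
    # Keep only directories named state<N> and sort numerically (state10 after state9).
    dirs = [d for d in root.iterdir()
            if d.name.startswith("state") and d.name[5:].isdigit()]
    states = []
    for d in sorted(dirs, key=lambda d: int(d.name[5:])):
        name = (d / "name").read_text().strip()
        disabled = (d / "disable").read_text().strip() == "1"
        states.append((name, disabled))
    return states
```

If only shallow states show up here and the freeze still occurs, that would support the conclusion that C-states are not the whole story.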