IGCIT / Intel-GPU-Community-Issue-Tracker-IGCIT

IGCIT is a Community-driven issue tracker for Intel GPUs.

Windows RDP crash - A device attached to the system is not functioning #577

Closed nyanmisaka closed 5 months ago

nyanmisaka commented 10 months ago

Application [Required]

Intel(R) Arc(TM) A380 Graphics

Processor / Processor Number [Required]

AMD Ryzen 7 5700G

Graphic Card [Required]

Intel(R) Arc(TM) A380 Graphics + AMD Radeon(TM) Graphics

GPU Driver Version [Required]

31.0.101.4952

Rendering API [Required]

Windows Build Number [Required]

Other Windows build number

No response

Intel System Support Utility report

igcit_ssu.txt

Description and steps to reproduce [Required]

  1. Prepare a multi-GPU setup: an Intel Arc discrete GPU + an AMD integrated GPU
  2. Enable both iGPU and dGPU in the BIOS settings (some motherboards may disable the iGPU automatically)
  3. Leave all HDMI/DP/VGA ports unplugged on both the motherboard and the Arc GPU, to make it a headless setup
  4. Run mstsc (Windows Remote Desktop) on another PC and connect to the headless PC
  5. In the Windows RDP client, the connection fails at the login screen and shows the error "A device attached to the system is not functioning" before the session is disconnected.

When I switch the dGPU back to my older NVIDIA card, RDP works fine. A workaround is to put an HDMI dummy plug on the Arc GPU.

The same issue has been reported on Reddit: "[Trouble shooting] - A380 causing RDP "crashing" issue."

For me, this issue is not a regression. I encountered this as early as the 39xx drivers.

Device / Platform

No response

Crash dumps [Required, if applicable]

No response

Application / Windows logs

No response

Karen-Intel commented 10 months ago

@nyanmisaka Ty for your report here. We'll be checking it out! Stay tuned

K

nyanmisaka commented 9 months ago

@Karen-Intel Any news on this? I can still reproduce it on the latest 4972 driver.

multi-vitamin commented 9 months ago

I'm using an A770. Something similar happens with Chrome Remote Desktop. When I connect remotely, the screen changes resolution and then goes black.

It happens with the monitor turned off. I haven't checked with the monitor on.

Gabriela-Intel commented 7 months ago

Hi @nyanmisaka I just tried to replicate the issue with the config below but I was able to remote successfully into the A380 system. Can you try again using the latest drivers?

AMD Ryzen 5 7600, Intel Arc A380 Graphics, Win 11 Pro 22631, driver 31.0.101.3194, 32 GB RAM

nyanmisaka commented 7 months ago

@Gabriela-Intel Thank you for your time. I've been using the latest driver released in January but the problem still persists.

Could you also test on Windows 10? There are a lot of changes between Windows 11 22631 and Windows 10 19045, especially in how multiple graphics cards are handled.

Note that in this configuration, remote desktop may succeed the first time, but it will continue to fail after reboot.

Gabriela-Intel commented 7 months ago

Hey again! I tried the same config with Windows 10 22H2 and driver 101.5330 this time, but I still can't reproduce the issue, even after rebooting and remoting in multiple times. Any other tips that might help me see it?

nyanmisaka commented 7 months ago

The remaining difference between our setups might be the default graphics card setting in the BIOS: IGP (integrated graphics) vs PEG (PCI Express Graphics). My setting is PEG, and no monitor is connected to either the motherboard or the graphics card of this host.

pcslide commented 7 months ago

@Gabriela-Intel Have you tried putting the PC to sleep and waking it remotely?

Gabriela-Intel commented 7 months ago

@nyanmisaka Hm yes my system is set up similarly as well.

@pcslide I lose connection when the system is put to sleep. Is there something I need to enable to ensure I can force it to sleep/wake using a remote connection?

Thanks for the help!

pcslide commented 7 months ago

@Gabriela-Intel If you have enabled Wake-on-LAN on your remote PC, you can try https://www.nirsoft.net/utils/wake_on_lan.html. If the remote PC is not on the same LAN as you, you need a proxy to wake it. Usually a router or a server on the same LAN as the remote PC can act as the proxy.
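
For anyone who prefers a script to a GUI tool, here is a minimal Python sketch of what a Wake-on-LAN sender does. It assumes the target NIC accepts the standard magic packet (six 0xFF bytes followed by the MAC address repeated 16 times) over UDP broadcast. The function names, the broadcast address, and port 9 are illustrative choices, not something taken from this thread.

```python
import socket


def build_magic_packet(mac: str) -> bytes:
    """Return the 102-byte Wake-on-LAN magic packet for a MAC address."""
    hex_digits = mac.replace(":", "").replace("-", "")
    if len(hex_digits) != 12:
        raise ValueError(f"expected 12 hex digits, got {mac!r}")
    mac_bytes = bytes.fromhex(hex_digits)  # raises ValueError on non-hex input
    # Six 0xFF bytes followed by the MAC address repeated sixteen times.
    return b"\xff" * 6 + mac_bytes * 16


def send_magic_packet(mac: str, broadcast: str = "255.255.255.255", port: int = 9) -> None:
    """Broadcast the magic packet on the local network (assumed defaults)."""
    packet = build_magic_packet(mac)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(packet, (broadcast, port))
```

Calling `send_magic_packet("AA:BB:CC:DD:EE:FF")` with the remote PC's real MAC address only works from a machine on the same LAN, which is why a router or server on that LAN is needed as a proxy otherwise.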

multi-vitamin commented 7 months ago

I think it's something about the HDMI dummy plug. Try testing with nothing connected to the ARC GPU. AMD and NVIDIA have no problem with nothing plugged in.

Gabriela-Intel commented 7 months ago

Thanks @pcslide! Still no luck :( See details below on my set up and steps. Let me know if ANYTHING differs from what you have been doing.

Steps:

  1. Using System 1, remote into System 2 by running mstsc Windows Remote Desktop
  2. Put System 2 to sleep (lose connection)
  3. Use WakeOnLAN to wake System 2 after some time (couple of minutes)
  4. Using System 1, remotely access System 2 again

System 1: i5-12600K, Intel Arc A770, Windows 11 23H2, 32 GB RAM, driver 101.5330

System 2: AMD Ryzen 5 7600 (iGPU enabled), Intel Arc A380, Windows 10 22H2, 32 GB RAM, driver 101.5330, all HDMI/DP/VGA ports unplugged on both motherboard and Arc card (headless setup)

pcslide commented 7 months ago

@Gabriela-Intel After step 3, did System 2 wake up?
If System 2 cannot be woken remotely, can it be woken locally (using the power button or keyboard)?

multi-vitamin commented 7 months ago

Is iGPU disabled? I'm pretty sure the iGPU should be disabled.

Gabriela-Intel commented 6 months ago

@pcslide Yes, System 2 did wake successfully after step 3. I'm also able to wake it up remotely and locally by using the power button.

Gabriela-Intel commented 6 months ago

@multi-vitamin I also tried it with integrated graphics disabled in the BIOS but to no avail..

multi-vitamin commented 6 months ago

Windows is running with the monitor connected, but the monitor is turned off. Does it still work in this situation?

Gabriela-Intel commented 6 months ago

@multi-vitamin Can you elaborate?? I'm not following. I used a headless set up for system 2, so there wasn't a monitor connected.

multi-vitamin commented 6 months ago

This seems to happen when it's not a full headless setup.

With Chrome Remote Desktop, once connected, the resolution changes and the screen immediately goes black. I boot with the monitor connected and then connect through Chrome Remote Desktop with the monitor turned off.

pcslide commented 6 months ago

Now that you mention it, resolution changes seem to play a role in causing the issue.

Gabriela-Intel commented 6 months ago

I tried to replicate the issue again by booting System 2 with a monitor on and then remoting in with the monitor off. This monitor uses a different resolution than System 1. I also tried Chrome Remote Desktop, but the issue doesn't occur.

Since we haven't been able to successfully replicate after various attempts, I think we are unfortunately going to have to close this one out. It's an absolute must for us to see the issue to move forward. :(

nyanmisaka commented 6 months ago

Isn't there something that I can do to capture a log or trace for the Windows Remote Desktop Application/Service? If so, you developers can diagnose it and see what's going on.

Gabriela-Intel commented 6 months ago

Let me investigate internally about the logs/trace files that we can maybe pass along. I'll get back to you on this!

Gabriela-Intel commented 6 months ago

Hey everyone. Please run these commands as Administrator in the command line:

if not exist C:\AppCrashDumps\NUL mkdir C:\AppCrashDumps

reg add "HKEY_LOCAL_MACHINE\Software\Microsoft\Windows\Windows Error Reporting" /v "Disabled" /t REG_DWORD /d "0x1" /f
reg add "HKEY_CURRENT_USER\Software\Microsoft\Windows\Windows Error Reporting" /v "Disabled" /t REG_DWORD /d "0x1" /f

reg add "HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\Windows Error Reporting\LocalDumps" /v "DumpFolder" /t REG_EXPAND_SZ /d "C:\AppCrashDumps" /f

reg add "HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\Windows Error Reporting\LocalDumps" /v "DumpType" /t REG_DWORD /d "0x2" /f

reg add "HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\CrashControl\LiveKernelReports" /v "DeleteLiveMiniDumps" /t REG_DWORD /d "0x0" /f

reg add "HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\CrashControl" /v "FilterPages" /t REG_DWORD /d "0x1" /f
reg add "HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\CrashControl" /v "CrashDumpEnabled" /t REG_DWORD /d "0x1" /f

Then restart the system.

Basically, this will keep any TDR/Watchdog dump in the C:\Windows\LiveKernelReports\WATCHDOG folder and any application crash dump in the C:\AppCrashDumps folder.

Reproduce the issue and please send us any dump file generated in those folders. Thanks!
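
To check quickly whether anything was captured after reproducing, a small script can list the dump files. This is a minimal Python sketch, not an Intel tool: the helper name `find_dump_files` is hypothetical, and the default paths assume the application dump folder configured above and the standard Windows LiveKernelReports location.

```python
from pathlib import Path

# Assumed locations: the dump folder configured above and the default
# Windows live kernel report folder. Adjust if your system differs.
DEFAULT_DUMP_FOLDERS = (
    r"C:\AppCrashDumps",
    r"C:\Windows\LiveKernelReports",
)


def find_dump_files(folders=DEFAULT_DUMP_FOLDERS):
    """Return all *.dmp files under the given folders, newest first."""
    found = []
    for folder in folders:
        root = Path(folder)
        if not root.is_dir():
            continue  # folder missing: nothing was captured there
        found.extend(p for p in root.rglob("*.dmp") if p.is_file())
    return sorted(found, key=lambda p: p.stat().st_mtime, reverse=True)


if __name__ == "__main__":
    for path in find_dump_files():
        print(path)
```

Reading the LiveKernelReports folder may require an elevated prompt. An empty result means no dump was written, which is itself useful to report.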

Gabriela-Intel commented 6 months ago

Hey again. Anyone have any updates or logs to provide for us to look into?

multi-vitamin commented 6 months ago

I've only tested with Chrome Remote Desktop. When the screen is black, the picture appears immediately once I turn the monitor on. However, if I turn the monitor off, reboot, and reconnect, it goes to a black screen.

It seems to be an issue with not being able to specify the initial resolution.

I also have headless systems with AMD and NVIDIA GPUs. When I run Chrome Remote Desktop on them, it defaults to 1024x768 and never goes to a black screen.

My system: AMD Ryzen 3600XT, Intel Arc A770, Windows 10 22H2, 16 GB RAM, driver 101.5333, DP port plugged

Gabriela-Intel commented 6 months ago

@multi-vitamin As mentioned previously, I also checked the behavior using Chrome remote but didn't notice anything abnormal.

Gabriela-Intel commented 6 months ago

Again, we really need those logs to proceed since we can't reproduce :/ I'll check back in a week. If there's no success with logs or replicating, I'll have to continue with closing this out.

nyanmisaka commented 6 months ago

I'll try it this weekend and see if I can get the log.

Gabriela-Intel commented 6 months ago

Awesome. Keep me posted!

nyanmisaka commented 6 months ago

I found that this issue does not crash the application or the kernel; it exits gracefully, so nothing is captured in AppCrashDumps or LiveKernelReport. Instead, errors and warnings can be observed in the Windows Event Viewer.

I saved some .evtx (Windows XML Event Log) logs from both the client and the server side; I hope they are still helpful to your team. RemoteDesktopServices-RdpCoreTS.zip
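
For others who want to collect the same kind of log without clicking through Event Viewer, here is a rough Python sketch that wraps the built-in `wevtutil epl` (export-log) command. The channel name is an assumption based on the attachment name above, so please confirm it in Event Viewer first. The helper names are hypothetical.

```python
import subprocess
import sys

# Assumed channel name for the RdpCoreTS operational log; verify it under
# "Applications and Services Logs" in Event Viewer before relying on it.
RDP_CORE_TS_CHANNEL = "Microsoft-Windows-RemoteDesktopServices-RdpCoreTS/Operational"


def build_export_command(channel: str, output_path: str) -> list[str]:
    """Build the wevtutil command that exports a channel to an .evtx file."""
    # "epl" is wevtutil's export-log verb; /ow:true overwrites an existing file.
    return ["wevtutil", "epl", channel, output_path, "/ow:true"]


def export_log(channel: str = RDP_CORE_TS_CHANNEL, output_path: str = "RdpCoreTS.evtx") -> None:
    """Run the export. Only meaningful on Windows, from an elevated prompt."""
    if sys.platform != "win32":
        raise RuntimeError("wevtutil is only available on Windows")
    subprocess.run(build_export_command(channel, output_path), check=True)
```

Running `export_log()` on the RDP host right after a failed connection should produce an .evtx file that can be attached to the issue.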

Gabriela-Intel commented 6 months ago

Our debug team couldn't find anything related to the graphics driver in the Event Viewer logs provided. There must be some difference in your Windows image or your setup, since we can't get this to replicate.

I searched for Event ID 227 and found this thread: "windows server 2016 - RemoteApp sporadic failure" on Server Fault. Maybe it's worth a try.

Otherwise I'd recommend contacting Microsoft directly about this and if they do find something pointing to an issue with the graphics driver then they can escalate directly to us.

Gabriela-Intel commented 5 months ago

Given the above, we can't get much traction on this. Let's close this out. If you manage to capture those logs we requested at some point please let us know.