Closed maxpain closed 1 year ago
@ehfd could you help me, please?
Upgrade your driver. Use the latest minor release of each major release if you are in the 535 or 550 branch. Versions earlier than 535.113.01 or 550.67 have bugs.
Try setting VIDEO_PORT to DP-0, perhaps.
I tried DP-0, DP-1, DP-2, and DFP. Only "none" works.
"none" is not optimal. What's your environment?
@ehfd Kubernetes cluster, nvidia-container-toolkit, NVIDIA device plugin, Talos Linux, RTX4090 with 535.86.05 nvidia driver.
Similar issue with egl desktop. Perhaps an issue with driver 535.
@ehfd Hmm, I don't have any issues with EGL desktop on 535.
Mostly because we use Xvfb in EGL desktop variant, not Xorg.
I have reproduced the error. For now, the advice is NOT to upgrade to NVIDIA 535.
In NVIDIA 535.86.05, with Option "ModeDebug" "True" inserted in /etc/X11/xorg.conf for debugging, the key message is "GPU extended capability check failed.":
[ 2711.450] (II) NVIDIA(GPU-0): --- Building ModePool for DFP-1 ---
[ 2711.450] (**) NVIDIA(GPU-0): Mode Validation Overrides for DFP-1:
[ 2711.450] (**) NVIDIA(GPU-0): NoMaxSizeCheck
[ 2711.450] (**) NVIDIA(GPU-0): NoVirtualSizeCheck
[ 2711.450] (**) NVIDIA(GPU-0): NoMaxPClkCheck
[ 2711.450] (**) NVIDIA(GPU-0): NoEdidMaxPClkCheck
[ 2711.450] (**) NVIDIA(GPU-0): NoHorizSyncCheck
[ 2711.450] (**) NVIDIA(GPU-0): NoVertRefreshCheck
[ 2711.451] (**) NVIDIA(GPU-0): NoExtendedGpuCapabilitiesCheck
[ 2711.451] (**) NVIDIA(GPU-0): NoTotalSizeCheck
[ 2711.451] (**) NVIDIA(GPU-0): NoDualLinkDVICheck
[ 2711.451] (**) NVIDIA(GPU-0): NoDisplayPortBandwidthCheck
[ 2711.451] (**) NVIDIA(GPU-0): AllowNon3DVisionModes
[ 2711.451] (**) NVIDIA(GPU-0): AllowNonEdidModes
[ 2711.451] (**) NVIDIA(GPU-0): AllowNonHDMI3DModes
[ 2711.451] (**) NVIDIA(GPU-0): NoEdidHDMI2Check
[ 2711.451] (**) NVIDIA(GPU-0): AllowDpInterlaced
(OMITTED)
[ 2711.454] (WW) NVIDIA(GPU-0): Validating Mode "1920x1080_60":
[ 2711.454] (WW) NVIDIA(GPU-0): Mode Source: X Configuration file ModeLine
[ 2711.454] (WW) NVIDIA(GPU-0): 1920 x 1080 @ 60 Hz
[ 2711.454] (WW) NVIDIA(GPU-0): Pixel Clock : 138.50 MHz
[ 2711.454] (WW) NVIDIA(GPU-0): HRes, HSyncStart : 1920, 1968
[ 2711.454] (WW) NVIDIA(GPU-0): HSyncEnd, HTotal : 2000, 2080
[ 2711.454] (WW) NVIDIA(GPU-0): VRes, VSyncStart : 1080, 1083
[ 2711.454] (WW) NVIDIA(GPU-0): VSyncEnd, VTotal : 1088, 1111
[ 2711.454] (WW) NVIDIA(GPU-0): Sync Polarity : +H -V
[ 2711.454] (WW) NVIDIA(GPU-0): DualHead Mode: No
[ 2711.454] (WW) NVIDIA(GPU-0): Viewport
[ 2711.454] (WW) NVIDIA(GPU-0): Horizontal Taps
[ 2711.454] (WW) NVIDIA(GPU-0): Vertical Taps
[ 2711.454] (WW) NVIDIA(GPU-0): GPU extended capability check failed.
[ 2711.454] (WW) NVIDIA(GPU-0): Mode "1920x1080_60" is invalid.
[ 2711.470] (WW) NVIDIA(GPU-0): Validating Mode "1280x800_60":
[ 2711.470] (WW) NVIDIA(GPU-0): Mode Source: X Server
[ 2711.470] (WW) NVIDIA(GPU-0): 1280 x 800 @ 60 Hz
[ 2711.470] (WW) NVIDIA(GPU-0): Pixel Clock : 71.00 MHz
[ 2711.470] (WW) NVIDIA(GPU-0): HRes, HSyncStart : 1280, 1328
[ 2711.470] (WW) NVIDIA(GPU-0): HSyncEnd, HTotal : 1360, 1440
[ 2711.470] (WW) NVIDIA(GPU-0): VRes, VSyncStart : 800, 803
[ 2711.470] (WW) NVIDIA(GPU-0): VSyncEnd, VTotal : 809, 823
[ 2711.470] (WW) NVIDIA(GPU-0): Sync Polarity : +H -V
[ 2711.470] (WW) NVIDIA(GPU-0): DualHead Mode: No
[ 2711.470] (WW) NVIDIA(GPU-0): Viewport
[ 2711.470] (WW) NVIDIA(GPU-0): Horizontal Taps
[ 2711.470] (WW) NVIDIA(GPU-0): Vertical Taps
[ 2711.470] (WW) NVIDIA(GPU-0): GPU extended capability check failed.
[ 2711.470] (WW) NVIDIA(GPU-0): Mode "1280x800_60" is invalid.
[ 2711.471] (WW) NVIDIA(GPU-0): Validating Mode "1920x1200_60":
[ 2711.471] (WW) NVIDIA(GPU-0): Mode Source: X Server
[ 2711.471] (WW) NVIDIA(GPU-0): 1920 x 1200 @ 60 Hz
[ 2711.471] (WW) NVIDIA(GPU-0): Pixel Clock : 154.00 MHz
[ 2711.471] (WW) NVIDIA(GPU-0): HRes, HSyncStart : 1920, 1968
[ 2711.471] (WW) NVIDIA(GPU-0): HSyncEnd, HTotal : 2000, 2080
[ 2711.471] (WW) NVIDIA(GPU-0): VRes, VSyncStart : 1200, 1203
[ 2711.471] (WW) NVIDIA(GPU-0): VSyncEnd, VTotal : 1209, 1235
[ 2711.471] (WW) NVIDIA(GPU-0): Sync Polarity : +H -V
[ 2711.471] (WW) NVIDIA(GPU-0): DualHead Mode: No
[ 2711.471] (WW) NVIDIA(GPU-0): Viewport
[ 2711.471] (WW) NVIDIA(GPU-0): Horizontal Taps
[ 2711.471] (WW) NVIDIA(GPU-0): Vertical Taps
[ 2711.471] (WW) NVIDIA(GPU-0): GPU extended capability check failed.
[ 2711.471] (WW) NVIDIA(GPU-0): Mode "1920x1200_60" is invalid.
[ 2711.472] (WW) NVIDIA(GPU-0): Validating Mode "800x600_60":
[ 2711.472] (WW) NVIDIA(GPU-0): Mode Source: NVIDIA Predefined
[ 2711.472] (WW) NVIDIA(GPU-0): 800 x 600 @ 60 Hz
[ 2711.472] (WW) NVIDIA(GPU-0): Pixel Clock : 40.00 MHz
[ 2711.472] (WW) NVIDIA(GPU-0): HRes, HSyncStart : 800, 840
[ 2711.472] (WW) NVIDIA(GPU-0): HSyncEnd, HTotal : 968, 1056
[ 2711.472] (WW) NVIDIA(GPU-0): VRes, VSyncStart : 600, 601
[ 2711.472] (WW) NVIDIA(GPU-0): VSyncEnd, VTotal : 605, 628
[ 2711.472] (WW) NVIDIA(GPU-0): Sync Polarity : +H +V
[ 2711.472] (WW) NVIDIA(GPU-0): DualHead Mode: No
[ 2711.472] (WW) NVIDIA(GPU-0): Viewport
[ 2711.472] (WW) NVIDIA(GPU-0): Horizontal Taps
[ 2711.472] (WW) NVIDIA(GPU-0): Vertical Taps
[ 2711.472] (WW) NVIDIA(GPU-0): GPU extended capability check failed.
[ 2711.472] (WW) NVIDIA(GPU-0): Mode "800x600_60" is invalid.
[ 2711.472] (WW) NVIDIA(GPU-0):
[ 2711.472] (EE) NVIDIA(GPU-0): Unable to add conservative default mode "nvidia-auto-select".
[ 2711.472] (EE) NVIDIA(GPU-0): Unable to add "nvidia-auto-select" mode to ModePool.
[ 2711.472] (WW) NVIDIA(0): No valid modes for "DFP-1:1920x1080R"; removing.
[ 2711.472] (WW) NVIDIA(0):
[ 2711.472] (WW) NVIDIA(0): Unable to validate any modes; falling back to the default mode
[ 2711.472] (WW) NVIDIA(0): "nvidia-auto-select".
[ 2711.472] (WW) NVIDIA(0):
[ 2711.472] (WW) NVIDIA(0): No valid modes for "DFP-1:nvidia-auto-select"; removing.
[ 2711.472] (EE) NVIDIA(0): Unable to use default mode "nvidia-auto-select".
[ 2711.472] (EE) NVIDIA(0): Failing initialization of X screen
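When the log is long, it can help to reduce the ModeDebug output to one verdict per mode. The following is a minimal sketch, assuming the log lines have the same shape as the excerpt above; the function name `summarize_mode_validation` and the `SAMPLE` text are illustrative and not part of the container or the driver.

```python
import re

# Matches lines such as: ... Validating Mode "1920x1080_60":
VALIDATING = re.compile(r'Validating Mode "([^"]+)":')
# Matches lines such as: ... Mode "1920x1080_60" is invalid.
VERDICT = re.compile(r'Mode "([^"]+)" is (valid|invalid)\.')
KEY_MESSAGE = "GPU extended capability check failed."


def summarize_mode_validation(log_text):
    """Return (mode, verdict, hit_extended_check) for each validated mode."""
    results = []
    current = None  # name of the mode whose block we are inside
    hit = False     # whether KEY_MESSAGE appeared inside that block
    for line in log_text.splitlines():
        start = VALIDATING.search(line)
        if start:
            current, hit = start.group(1), False
            continue
        if current and KEY_MESSAGE in line:
            hit = True
            continue
        end = VERDICT.search(line)
        if end and end.group(1) == current:
            results.append((current, end.group(2), hit))
            current = None
    return results


# Shortened sample copied from the logs in this thread (hypothetical mix
# of a failing 535.86.05 block and a passing block from an older driver).
SAMPLE = """\
[ 2711.454] (WW) NVIDIA(GPU-0): Validating Mode "1920x1080_60":
[ 2711.454] (WW) NVIDIA(GPU-0): GPU extended capability check failed.
[ 2711.454] (WW) NVIDIA(GPU-0): Mode "1920x1080_60" is invalid.
[746936.617] (II) NVIDIA(GPU-0): Validating Mode "2560x1600_60":
[746936.617] (II) NVIDIA(GPU-0): Mode "2560x1600_60" is valid.
"""

for mode, verdict, hit in summarize_mode_validation(SAMPLE):
    print(mode, verdict, "extended-capability failure" if hit else "")
```

To use it on a real log, read the file named in the "Log file:" line of the Xorg output and pass its contents to the function.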
Works up to 530.41.03.
X.Org X Server 1.21.1.4
X Protocol Version 11, Revision 0
Current Operating System: Linux xgl-test 5.4.0-153-generic #170-Ubuntu SMP Fri Jun 16 13:43:31 UTC 2023 x86_64
Kernel command line: BOOT_IMAGE=/boot/vmlinuz-5.4.0-153-generic root=UUID=b74b4d9b-e7b1-4dc6-be2e-bf94365e04ed ro maybe-ubiquity
xorg-server 2:21.1.4-2ubuntu1.7~22.04.1 (For technical support please see http://www.ubuntu.com/support)
Current version of pixman: 0.40.0
Before reporting problems, check http://wiki.x.org
to make sure that you have the latest version.
Markers: (--) probed, (**) from config file, (==) default setting,
(++) from command line, (!!) notice, (II) informational,
(WW) warning, (EE) error, (NI) not implemented, (??) unknown.
(==) Log file: "/home/user/.local/share/xorg/Xorg.0.log", Time: Wed Aug 2 04:39:32 2023
(==) Using config file: "/etc/X11/xorg.conf"
(==) Using system config directory "/usr/share/X11/xorg.conf.d"
[746936.617] (II) NVIDIA(GPU-0): Validating Mode "2560x1600_60":
[746936.617] (II) NVIDIA(GPU-0): Mode Source: X Server
[746936.617] (II) NVIDIA(GPU-0): 2560 x 1600 @ 60 Hz
[746936.617] (II) NVIDIA(GPU-0): Pixel Clock : 268.50 MHz
[746936.617] (II) NVIDIA(GPU-0): HRes, HSyncStart : 2560, 2608
[746936.617] (II) NVIDIA(GPU-0): HSyncEnd, HTotal : 2640, 2720
[746936.617] (II) NVIDIA(GPU-0): VRes, VSyncStart : 1600, 1603
[746936.617] (II) NVIDIA(GPU-0): VSyncEnd, VTotal : 1609, 1646
[746936.617] (II) NVIDIA(GPU-0): Sync Polarity : +H -V
[746936.617] (II) NVIDIA(GPU-0): Viewport 2560x1600+0+0
[746936.617] (II) NVIDIA(GPU-0): Horizontal Taps 1
[746936.617] (II) NVIDIA(GPU-0): Vertical Taps 1
[746936.617] (II) NVIDIA(GPU-0): Mode "2560x1600_60" is valid.
[746936.617] (II) NVIDIA(GPU-0):
[746936.617] (II) NVIDIA(GPU-0): Validating Mode "1280x800d60":
[746936.617] (II) NVIDIA(GPU-0): Mode Source: X Server
[746936.617] (II) NVIDIA(GPU-0): 1280 x 800 @ 60 Hz
[746936.617] (II) NVIDIA(GPU-0): Pixel Clock : 134.25 MHz
[746936.617] (II) NVIDIA(GPU-0): HRes, HSyncStart : 1280, 1304
[746936.617] (II) NVIDIA(GPU-0): HSyncEnd, HTotal : 1320, 1360
[746936.617] (II) NVIDIA(GPU-0): VRes, VSyncStart : 800, 801
[746936.617] (II) NVIDIA(GPU-0): VSyncEnd, VTotal : 804, 823
[746936.617] (II) NVIDIA(GPU-0): Sync Polarity : +H -V
[746936.617] (II) NVIDIA(GPU-0): Extra : DoubleScan
[746936.617] (II) NVIDIA(GPU-0): Viewport 1280x800+0+0
[746936.617] (II) NVIDIA(GPU-0): Horizontal Taps 2
[746936.617] (II) NVIDIA(GPU-0): Vertical Taps 2
[746936.617] (II) NVIDIA(GPU-0): Mode "1280x800d60" is valid.
[746936.617] (II) NVIDIA(GPU-0):
[746936.617] (II) NVIDIA(GPU-0): Validating Mode "2560x1600_60":
[746936.617] (II) NVIDIA(GPU-0): Mode Source: X Server
[746936.617] (II) NVIDIA(GPU-0): 2560 x 1600 @ 60 Hz
[746936.617] (II) NVIDIA(GPU-0): Pixel Clock : 348.50 MHz
[746936.617] (II) NVIDIA(GPU-0): HRes, HSyncStart : 2560, 2760
[746936.617] (II) NVIDIA(GPU-0): HSyncEnd, HTotal : 3032, 3504
[746936.617] (II) NVIDIA(GPU-0): VRes, VSyncStart : 1600, 1603
[746936.617] (II) NVIDIA(GPU-0): VSyncEnd, VTotal : 1609, 1658
[746936.617] (II) NVIDIA(GPU-0): Sync Polarity : -H +V
[746936.617] (II) NVIDIA(GPU-0): Viewport 2560x1600+0+0
[746936.617] (II) NVIDIA(GPU-0): Horizontal Taps 1
[746936.617] (II) NVIDIA(GPU-0): Vertical Taps 1
[746936.617] (II) NVIDIA(GPU-0): Mode "2560x1600_60" is valid.
[746936.617] (II) NVIDIA(GPU-0):
[746936.617] (II) NVIDIA(GPU-0): Validating Mode "1280x800d60":
[746936.617] (II) NVIDIA(GPU-0): Mode Source: X Server
[746936.617] (II) NVIDIA(GPU-0): 1280 x 800 @ 60 Hz
[746936.617] (II) NVIDIA(GPU-0): Pixel Clock : 174.25 MHz
[746936.617] (II) NVIDIA(GPU-0): HRes, HSyncStart : 1280, 1380
[746936.617] (II) NVIDIA(GPU-0): HSyncEnd, HTotal : 1516, 1752
[746936.617] (II) NVIDIA(GPU-0): VRes, VSyncStart : 800, 801
[746936.617] (II) NVIDIA(GPU-0): VSyncEnd, VTotal : 804, 829
[746936.617] (II) NVIDIA(GPU-0): Sync Polarity : -H +V
[746936.617] (II) NVIDIA(GPU-0): Extra : DoubleScan
[746936.617] (II) NVIDIA(GPU-0): Viewport 1280x800+0+0
[746936.617] (II) NVIDIA(GPU-0): Horizontal Taps 2
[746936.617] (II) NVIDIA(GPU-0): Vertical Taps 2
[746936.617] (II) NVIDIA(GPU-0): Mode "1280x800d60" is valid.
[746936.617] (II) NVIDIA(GPU-0):
[746936.618] (II) NVIDIA(GPU-0): --- Done building ModePool for DFP-2 ---
[746936.618] (II) NVIDIA(GPU-0):
[746936.618] (II) NVIDIA(GPU-0): Frequency information for DFP-2:
[746936.618] (II) NVIDIA(GPU-0): HorizSync : 28.000-55.000 kHz
[746936.618] (II) NVIDIA(GPU-0): VertRefresh : 43.000-72.000 Hz
[746936.618] (II) NVIDIA(GPU-0): (HorizSync from Conservative Defaults)
[746936.618] (II) NVIDIA(GPU-0): (VertRefresh from Conservative Defaults)
And in 525.60.13.
X.Org X Server 1.21.1.4
X Protocol Version 11, Revision 0
Current Operating System: Linux xgl-test 5.4.0-148-generic #165-Ubuntu SMP Tue Apr 18 08:53:12 UTC 2023 x86_64
Kernel command line: BOOT_IMAGE=/vmlinuz-5.4.0-148-generic root=/dev/mapper/ubuntu--vg-ubuntu--lv ro
xorg-server 2:21.1.4-2ubuntu1.7~22.04.1 (For technical support please see http://www.ubuntu.com/support)
Current version of pixman: 0.40.0
Before reporting problems, check http://wiki.x.org
to make sure that you have the latest version.
Markers: (--) probed, (**) from config file, (==) default setting,
(++) from command line, (!!) notice, (II) informational,
(WW) warning, (EE) error, (NI) not implemented, (??) unknown.
(==) Log file: "/home/user/.local/share/xorg/Xorg.0.log", Time: Wed Aug 2 04:44:04 2023
(==) Using config file: "/etc/X11/xorg.conf"
(==) Using system config directory "/usr/share/X11/xorg.conf.d"
I've emailed the NVIDIA driver team. Waiting for response.
TO OUR USERS:
Please send an email to linux-bugs@nvidia.com stating that you are a user of https://github.com/selkies-project/docker-nvidia-glx-desktop and that you are also affected by the issue below. This is the only way to accelerate the bug fix in the drivers, and if this issue is not fixed, this repository may not be usable on later drivers.
We have reproduced an issue that all of our users on the 535.86.05 driver have also faced: the "NoExtendedGpuCapabilitiesCheck" option in "ModeValidation" for xorg.conf is not honored on GeForce GPUs.
This is a new issue that did not exist in 530.xx, 525.xx, or any earlier driver, and it is reproducible for every user running a headless setup on GeForce (so far, all of the 10xx, 20xx, and 30xx GPUs).
How to reproduce: set ConnectedMonitor to a port with no monitor connected (e.g. DP-0) to enable XRandR, and use Option "ModeValidation" "NoMaxPClkCheck, NoEdidMaxPClkCheck, NoMaxSizeCheck, NoHorizSyncCheck, NoVertRefreshCheck, NoVirtualSizeCheck, NoExtendedGpuCapabilitiesCheck, NoTotalSizeCheck, NoDualLinkDVICheck, NoDisplayPortBandwidthCheck, AllowNon3DVisionModes, AllowNonHDMI3DModes, AllowNonEdidModes, NoEdidHDMI2Check, AllowDpInterlaced" so that the modes pass validation.
We have also turned on Option "ModeDebug" "True" for debugging.
Result:
[ 2711.454] (WW) NVIDIA(GPU-0): Validating Mode "1920x1080_60":
[ 2711.454] (WW) NVIDIA(GPU-0): Mode Source: X Configuration file ModeLine
[ 2711.454] (WW) NVIDIA(GPU-0): 1920 x 1080 @ 60 Hz
[ 2711.454] (WW) NVIDIA(GPU-0): Pixel Clock : 138.50 MHz
[ 2711.454] (WW) NVIDIA(GPU-0): HRes, HSyncStart : 1920, 1968
[ 2711.454] (WW) NVIDIA(GPU-0): HSyncEnd, HTotal : 2000, 2080
[ 2711.454] (WW) NVIDIA(GPU-0): VRes, VSyncStart : 1080, 1083
[ 2711.454] (WW) NVIDIA(GPU-0): VSyncEnd, VTotal : 1088, 1111
[ 2711.454] (WW) NVIDIA(GPU-0): Sync Polarity : +H -V
[ 2711.454] (WW) NVIDIA(GPU-0): DualHead Mode: No
[ 2711.454] (WW) NVIDIA(GPU-0): Viewport
[ 2711.454] (WW) NVIDIA(GPU-0): Horizontal Taps
[ 2711.454] (WW) NVIDIA(GPU-0): Vertical Taps
[ 2711.454] (WW) NVIDIA(GPU-0): GPU extended capability check failed.
[ 2711.454] (WW) NVIDIA(GPU-0): Mode "1920x1080_60" is invalid.
[ 2711.454] (WW) NVIDIA(GPU-0):
This behavior contradicts the README documentation and therefore has to be fixed.
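For anyone reproducing this, the three options named in the steps above can be generated rather than typed by hand. This is a minimal sketch: the option names and tokens are copied from this report, while the helper name `device_options` and the indentation are assumptions, and which xorg.conf section the lines belong in depends on your existing configuration.

```python
# ModeValidation tokens exactly as listed in the reproduction steps.
MODE_VALIDATION_TOKENS = [
    "NoMaxPClkCheck", "NoEdidMaxPClkCheck", "NoMaxSizeCheck",
    "NoHorizSyncCheck", "NoVertRefreshCheck", "NoVirtualSizeCheck",
    "NoExtendedGpuCapabilitiesCheck", "NoTotalSizeCheck",
    "NoDualLinkDVICheck", "NoDisplayPortBandwidthCheck",
    "AllowNon3DVisionModes", "AllowNonHDMI3DModes", "AllowNonEdidModes",
    "NoEdidHDMI2Check", "AllowDpInterlaced",
]


def device_options(connected_monitor="DP-0", mode_debug=True):
    """Build the xorg.conf Option lines used in the reproduction (hypothetical helper)."""
    tokens = ", ".join(MODE_VALIDATION_TOKENS)
    lines = [
        f'    Option "ConnectedMonitor" "{connected_monitor}"',
        f'    Option "ModeValidation" "{tokens}"',
    ]
    if mode_debug:
        # ModeDebug makes the driver log each mode's validation details.
        lines.append('    Option "ModeDebug" "True"')
    return "\n".join(lines)


print(device_options())
```

Paste the printed lines into the relevant section of /etc/X11/xorg.conf, restart Xorg, and look for "GPU extended capability check failed." in the log.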
------
On a separate note, there is another issue, not a blocker and present long before the NVIDIA 535 drivers: HDMI and DVI ports (including the virtual DVI ports on supported Tesla/Datacenter GPUs, where the resolution is capped at 2560 x 1600 at 60 Hz) are stuck at a 165.0 MHz maximum pixel clock, and "NoMaxPClkCheck" and the related "ModeValidation" options are never honored. As a result, headless GPUs with a "ConnectedMonitor" option on an HDMI or DVI port cannot use modes above 1920x1200 at 60 Hz.
[2363014.704] (--) NVIDIA(0): Valid display device(s) on GPU-0 at PCI:33:0:0
[2363014.704] (--) NVIDIA(0): DFP-0
[2363014.704] (--) NVIDIA(0): DFP-1
[2363014.704] (--) NVIDIA(0): DFP-2
[2363014.704] (--) NVIDIA(0): DFP-3
[2363014.704] (--) NVIDIA(0): DFP-4
[2363014.704] (--) NVIDIA(0): DFP-5
[2363014.705] (**) NVIDIA(0): Using ConnectedMonitor string "DFP-0".
[2363014.707] (II) NVIDIA(0): NVIDIA GPU NVIDIA GeForce RTX 3090 (GA102-A) at PCI:33:0:0
[2363014.707] (II) NVIDIA(0): (GPU-0)
[2363014.707] (--) NVIDIA(0): Memory: 25165824 kBytes
[2363014.707] (--) NVIDIA(0): VideoBIOS: 94.02.42.40.34
[2363014.707] (II) NVIDIA(0): Detected PCI Express Link width: 16X
[2363014.711] (--) NVIDIA(GPU-0): DFP-0: connected
[2363014.711] (--) NVIDIA(GPU-0): DFP-0: Internal TMDS
[2363014.711] (--) NVIDIA(GPU-0): DFP-0 Name Aliases:
[2363014.711] (--) NVIDIA(GPU-0): DFP
[2363014.711] (--) NVIDIA(GPU-0): DFP-0
[2363014.711] (--) NVIDIA(GPU-0): DPY-0
[2363014.711] (--) NVIDIA(GPU-0): HDMI-0
[2363014.712] (--) NVIDIA(GPU-0): HDMI-0
[2363014.712] (--) NVIDIA(GPU-0): Connector-3
[2363014.712] (--) NVIDIA(GPU-0): DFP-0: 165.0 MHz maximum pixel clock
[2363014.712] (--) NVIDIA(GPU-0):
[2363014.714] (WW) NVIDIA(GPU-0): Validating Mode "1920x1440_60":
[2363014.714] (WW) NVIDIA(GPU-0): Mode Source: VESA
[2363014.714] (WW) NVIDIA(GPU-0): 1920 x 1440 @ 60 Hz
[2363014.714] (WW) NVIDIA(GPU-0): Pixel Clock : 234.00 MHz
[2363014.714] (WW) NVIDIA(GPU-0): HRes, HSyncStart : 1920, 2048
[2363014.714] (WW) NVIDIA(GPU-0): HSyncEnd, HTotal : 2256, 2600
[2363014.714] (WW) NVIDIA(GPU-0): VRes, VSyncStart : 1440, 1441
[2363014.714] (WW) NVIDIA(GPU-0): VSyncEnd, VTotal : 1444, 1500
[2363014.714] (WW) NVIDIA(GPU-0): Sync Polarity : -H +V
[2363014.714] (WW) NVIDIA(GPU-0): Mode is rejected: Unable to construct hardware-specific
[2363014.714] (WW) NVIDIA(GPU-0): mode timings.
[2363014.714] (WW) NVIDIA(GPU-0): GPU extended capability check failed.
[2363014.714] (WW) NVIDIA(GPU-0): Mode "1920x1440_60" is invalid.
[2363014.714] (WW) NVIDIA(GPU-0):
[2363014.714] (WW) NVIDIA(GPU-0): Validating Mode "1920x1440_75":
[2363014.714] (WW) NVIDIA(GPU-0): Mode Source: VESA
[2363014.714] (WW) NVIDIA(GPU-0): 1920 x 1440 @ 75 Hz
[2363014.714] (WW) NVIDIA(GPU-0): Pixel Clock : 297.00 MHz
[2363014.714] (WW) NVIDIA(GPU-0): HRes, HSyncStart : 1920, 2064
[2363014.714] (WW) NVIDIA(GPU-0): HSyncEnd, HTotal : 2288, 2640
[2363014.714] (WW) NVIDIA(GPU-0): VRes, VSyncStart : 1440, 1441
[2363014.714] (WW) NVIDIA(GPU-0): VSyncEnd, VTotal : 1444, 1500
[2363014.714] (WW) NVIDIA(GPU-0): Sync Polarity : -H +V
[2363014.714] (WW) NVIDIA(GPU-0): Mode is rejected: Unable to construct hardware-specific
[2363014.714] (WW) NVIDIA(GPU-0): mode timings.
[2363014.714] (WW) NVIDIA(GPU-0): GPU extended capability check failed.
[2363014.714] (WW) NVIDIA(GPU-0): Mode "1920x1440_75" is invalid.
[2363014.714] (WW) NVIDIA(GPU-0):
[2363014.714] (WW) NVIDIA(GPU-0): Validating Mode "2560x1440_60":
[2363014.714] (WW) NVIDIA(GPU-0): Mode Source: X Configuration file ModeLine
[2363014.714] (WW) NVIDIA(GPU-0): 2560 x 1440 @ 60 Hz
[2363014.714] (WW) NVIDIA(GPU-0): Pixel Clock : 241.50 MHz
[2363014.714] (WW) NVIDIA(GPU-0): HRes, HSyncStart : 2560, 2608
[2363014.714] (WW) NVIDIA(GPU-0): HSyncEnd, HTotal : 2640, 2720
[2363014.714] (WW) NVIDIA(GPU-0): VRes, VSyncStart : 1440, 1443
[2363014.714] (WW) NVIDIA(GPU-0): VSyncEnd, VTotal : 1448, 1481
[2363014.714] (WW) NVIDIA(GPU-0): Sync Polarity : +H -V
[2363014.714] (WW) NVIDIA(GPU-0): Mode is rejected: Unable to construct hardware-specific
[2363014.714] (WW) NVIDIA(GPU-0): mode timings.
[2363014.714] (WW) NVIDIA(GPU-0): GPU extended capability check failed.
[2363014.714] (WW) NVIDIA(GPU-0): Mode "2560x1440_60" is invalid.
This second behavior also contradicts the README documentation, and it dates from well before the 535.xx drivers.
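The arithmetic behind the 1920x1200 ceiling can be checked from the numbers in the logs. The sketch below assumes, as the comment above does, that the 165.0 MHz pixel clock limit is what blocks the larger modes; the pixel clocks and totals are copied from the log excerpts, and the helper names are illustrative.

```python
DFP0_MAX_PCLK_MHZ = 165.0  # "DFP-0: 165.0 MHz maximum pixel clock" in the log

# name: (pixel clock in MHz, HTotal, VTotal), all copied from the logs above
MODES = {
    "1920x1200_60": (154.00, 2080, 1235),
    "1920x1440_60": (234.00, 2600, 1500),
    "1920x1440_75": (297.00, 2640, 1500),
    "2560x1440_60": (241.50, 2720, 1481),
}


def refresh_hz(pclk_mhz, htotal, vtotal):
    """Refresh rate implied by a pixel clock and the total frame size."""
    return pclk_mhz * 1e6 / (htotal * vtotal)


def over_limit(modes, limit_mhz):
    """Names of the modes whose pixel clock exceeds the limit."""
    return [name for name, (pclk, _, _) in modes.items() if pclk > limit_mhz]


for name, (pclk, htotal, vtotal) in MODES.items():
    status = "over limit" if pclk > DFP0_MAX_PCLK_MHZ else "within limit"
    print(f"{name}: {pclk:.2f} MHz, {refresh_hz(pclk, htotal, vtotal):.2f} Hz, {status}")
```

Of these four, only 1920x1200_60 stays under 165.0 MHz, which matches the observation that modes above 1920x1200 at 60 Hz are unavailable on such ports.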
@maxpain https://forums.developer.nvidia.com/t/if-you-have-a-problem-please-read-this-first/27131
Could you (as well as everyone else affected) provide an nvidia-bug-report.log.gz after hitting the error when running Xorg, either here or in the NVIDIA forum post above?
The more reports, the better.
NVIDIA has added this issue to their internal tracker.
From @xhejtman in the Discord:
what is the issue with nvidia drivers and no resolution available? We just tested 535 drivers on A10 gpu and it gets all resolutions available. Is that desktop card specific?
Perhaps it could be, or the new driver release fixed things. CC @maxpain
Good news: NVIDIA said they found the source of the issue and they will ship the fix in the next release. Now, we have to pray that all of the issues have indeed been properly fixed.
Maybe this issue was fixed in 535.129.03 and 545.29.02. I tested the drivers on Ubuntu 22.04 with RTX 4060 Ti.
Release highlights since 535.113.01:
Fixed a bug that could cause modes to fail validation when Option "ModeValidation" "NoExtendedGpuCapabilitiesCheck" is specified in xorg.conf.
Fixed a bug that could cause GPU memory utilization to be reported incorrectly for Multi-Instance GPU (MIG) partitions on Grace Hopper systems.
Fixed a bug that intermittently caused the display to freeze when resuming from suspend on some Ada GPUs.
Fixed a bug which could cause some DisplayPort monitors to flicker.
Fixed a bug that could cause monitors to flicker when the performance state changes on Turing GPUs.
Release highlights since 535.113.01:
Added experimental HDMI 10 bits per component support; enable by loading nvidia-modeset with hdmi_deepcolor=1.
Added support for the CTM, DEGAMMA_LUT, and GAMMA_LUT DRM-KMS CRTC properties. These are used by features such as the “Night Light” feature in GNOME and the “Night Color” feature in KDE, when they are used as Wayland compositors.
Added support for GeForce and Workstation GPUs to the open kernel modules. Please see the “Open Linux Kernel Modules” chapter in the README for details.
Added initial experimental support for runtime D3 (RTD3) power management on Desktop GPUs. Please see the ‘PCI-Express Runtime D3 (RTD3) Power Management’ chapter in the README for more details.
Added support for the EGL_ANDROID_native_fence_sync EGL extension and the VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_SYNC_FD_BIT and VK_EXTERNAL_FENCE_HANDLE_TYPE_SYNC_FD_BIT Vulkan external handle types when the nvidia-drm kernel module is loaded with the modeset=1 parameter.
Added experimental support for framebuffer consoles provided by nvidia-drm. On kernels that implement drm_fbdev_generic_setup and drm_aperture_remove_conflicting_pci_framebuffers, nvidia-drm will install a framebuffer console when loaded with both modeset=1 and fbdev=1 kernel module parameters. This will replace the Linux boot console driven by a system framebuffer driver such as efifb or vesafb.
Note that when an nvidia-drm framebuffer console is enabled, unloading nvidia-drm will cause the screen to turn off.
Updated nvidia-installer to allow installing the driver while an existing NVIDIA driver is already loaded.
Added support for virtual reality displays, such as the SteamVR platform, on Wayland compositors that support DRM leasing. Support requires xwayland version 22.1.0 and wayland-protocols version 1.22, or later. Tested on sway, minimum version 1.7 with wlroots version 0.15, and also on Kwin, minimum version 5.24.
Note: Before xwayland 23.2, there is a known issue with HDMI displays where the headset will fail to start a second time after closing SteamVR. This can be worked around by unplugging and replugging in the headset.
Fixed a bug that prevented VRR (Variable Refresh Rate) from working with Wayland.
Added support to the NVIDIA VDPAU driver for running in Xwayland. Please refer to the “Xwayland support in VDPAU” section of the README for further details.
Added libnvidia-gpucomp.so to the driver package. This is a helper library used for GPU shader compilation.
Removed libnvidia-vulkan-producer.so from the driver package. This helper library is no longer needed by the Wayland WSI.
Fixed a bug that intermittently caused the display to freeze when resuming from suspend on some Ada GPUs.
Fixed a bug that could cause monitors to flicker when the performance state changes on Turing GPUs.
Added support for HDR signaling via the HDR_OUTPUT_METADATA and Colorspace per-connector DRM properties when nvidia-drm is loaded with the modeset=1 parameter.
Added support for PRIME render offload to Vulkan Wayland WSI.
Fixed a bug that could cause modes to fail validation when Option "ModeValidation" "NoExtendedGpuCapabilitiesCheck" is specified in xorg.conf.
Fixed a bug which could cause some DisplayPort monitors to flicker.
It seems to be the case @bongole. I will check if all edge cases were addressed.
@bongole What's the environment that made it work? Is it this container?
@ehfd
I tested the command below on a bare-metal Ubuntu 22.04 server with an RTX 4060 Ti.
docker run --gpus all -it --rm --tmpfs /dev/shm:rw -e SIZEW=1920 -e SIZEH=1080 -e REFRESH=60 -e DPI=96 -e CDEPTH=24 -e VIDEO_PORT=DFP -e PASSWD=mypasswd -e WEBRTC_ENCODER=nvh264enc -e BASIC_AUTH_PASSWORD=mypasswd -e ENABLE_HTTPS_WEB=true --network host ghcr.io/selkies-project/nvidia-glx-desktop:latest
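To retest with several VIDEO_PORT values (DFP, DP-0, and so on), the same command can be assembled programmatically. This is only a sketch: the flags, environment variables, and image name are copied from the command above, and the helper name `docker_command` is hypothetical. It prints the commands rather than running them.

```python
import shlex

IMAGE = "ghcr.io/selkies-project/nvidia-glx-desktop:latest"


def docker_command(video_port, password):
    """Return the docker argv from the comment above with VIDEO_PORT swapped in."""
    env = {
        "SIZEW": "1920", "SIZEH": "1080", "REFRESH": "60",
        "DPI": "96", "CDEPTH": "24",
        "VIDEO_PORT": video_port,
        "PASSWD": password,
        "WEBRTC_ENCODER": "nvh264enc",
        "BASIC_AUTH_PASSWORD": password,
        "ENABLE_HTTPS_WEB": "true",
    }
    argv = ["docker", "run", "--gpus", "all", "-it", "--rm",
            "--tmpfs", "/dev/shm:rw"]
    for key, value in env.items():
        argv += ["-e", f"{key}={value}"]
    argv += ["--network", "host", IMAGE]
    return argv


# Print one ready-to-paste command per port to try.
for port in ("DFP", "DP-0"):
    print(shlex.join(docker_command(port, "mypasswd")))
```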
OS Info:
$ uname -a
Linux gpu-server 5.15.0-88-generic #98-Ubuntu SMP Mon Oct 2 15:18:56 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
$ cat /etc/os-release
PRETTY_NAME="Ubuntu 22.04.3 LTS"
NAME="Ubuntu"
VERSION_ID="22.04"
VERSION="22.04.3 LTS (Jammy Jellyfish)"
VERSION_CODENAME=jammy
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=jammy
$ nvidia-smi
Thu Nov 9 11:15:09 2023
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 545.29.02 Driver Version: 545.29.02 CUDA Version: 12.3 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 NVIDIA GeForce RTX 4060 Ti Off | 00000000:01:00.0 Off | N/A |
| 32% 29C P0 29W / 165W | 4MiB / 16380MiB | 3% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
| No running processes found |
+---------------------------------------------------------------------------------------+
I cannot confirm on 535.129.03 because my testing node is currently broken. Information regarding this is appreciated.
A kind user has also confirmed the fix on version 535.129.03 for me. Issue resolved.
Conclusion: if you face this issue, use display driver versions >= 535.129.03 or 545.29.02, or <= 530.xx. Don't use headless drivers because they lack certain libraries.
NVIDIA 550 drivers <= 550.5x have issues with Vulkan. Use 550.67 or higher.
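The version guidance above can be turned into a quick check. The sketch below encodes only the thresholds stated in this thread; treating every 545 release before 545.29.02 as affected, and every branch other than 535 and 545 as unaffected, are assumptions. The function names are illustrative; `installed_driver_version` relies on nvidia-smi being on the PATH.

```python
import subprocess

# First fixed release per affected branch, as concluded in this thread.
FIXED_IN = {535: (535, 129, 3), 545: (545, 29, 2)}
# First 550 release without the Vulkan issues mentioned above.
VULKAN_OK_550 = (550, 67)


def parse_version(text):
    """Turn '535.86.05' into (535, 86, 5) for tuple comparison."""
    return tuple(int(part) for part in text.strip().split("."))


def mode_validation_bug(version):
    """True if the version predates the fix in its branch (535 or 545 only)."""
    v = parse_version(version)
    fixed = FIXED_IN.get(v[0])
    return fixed is not None and v < fixed


def vulkan_issue_550(version):
    """True for 550-branch releases earlier than 550.67."""
    v = parse_version(version)
    return v[0] == 550 and v < VULKAN_OK_550


def installed_driver_version():
    """Ask nvidia-smi for the driver version of the first GPU."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    ).stdout
    return out.splitlines()[0].strip()


# Versions mentioned in this thread, plus 550.54.14 as an example of <= 550.5x.
for candidate in ("530.41.03", "535.86.05", "535.129.03",
                  "545.29.02", "550.54.14", "550.67"):
    print(candidate, mode_validation_bug(candidate), vulkan_issue_550(candidate))
```

On a machine with the driver installed, `mode_validation_bug(installed_driver_version())` gives the answer for the running system.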
Hello. I'm trying to run this container in my home Kubernetes cluster on Talos Linux with RTX4090 GPU. Nvidia driver: 535.86.05