Open Sohojoe opened 5 years ago
OK - the problem is with the T4 GPU - I've been able to get it running with the default GPU.
It would be good to figure this out, as the T4 is a third of the price.
@ervteng Do you know about using different GPUs in this scenario?
I've been able to use both T4 and P4 GPUs for training Unity environments (including Obstacle Tower). @Sohojoe do you have the /etc/X11/xorg.conf for the problematic machine?
here you go:
# nvidia-xconfig: X configuration file generated by nvidia-xconfig
# nvidia-xconfig: version 410.72
Section "ServerLayout"
Identifier "Layout0"
Screen 0 "Screen0"
InputDevice "Keyboard0" "CoreKeyboard"
InputDevice "Mouse0" "CorePointer"
EndSection
Section "Files"
EndSection
Section "InputDevice"
# generated from default
Identifier "Mouse0"
Driver "mouse"
Option "Protocol" "auto"
Option "Device" "/dev/psaux"
Option "Emulate3Buttons" "no"
Option "ZAxisMapping" "4 5"
EndSection
Section "InputDevice"
# generated from default
Identifier "Keyboard0"
Driver "kbd"
EndSection
Section "Monitor"
Identifier "Monitor0"
VendorName "Unknown"
ModelName "Unknown"
HorizSync 28.0 - 33.0
VertRefresh 43.0 - 72.0
Option "DPMS"
EndSection
Section "Device"
Identifier "Device0"
Driver "nvidia"
VendorName "NVIDIA Corporation"
BoardName "Tesla T4"
BusID "PCI:0:4:0"
EndSection
Section "Screen"
Identifier "Screen0"
Device "Device0"
Monitor "Monitor0"
DefaultDepth 24
Option "UseDisplayDevice" "None"
SubSection "Display"
Virtual 1280 1024
Depth 24
EndSubSection
EndSection
These are the options it gives me:
I've been getting the same error too. I am using a T4 and have completed all the previous steps. Here is my xorg.conf file:
# nvidia-xconfig: X configuration file generated by nvidia-xconfig
# nvidia-xconfig: version 410.72
Section "ServerLayout"
Identifier "Layout0"
Screen 0 "Screen0"
InputDevice "Keyboard0" "CoreKeyboard"
InputDevice "Mouse0" "CorePointer"
EndSection
Section "Files"
EndSection
Section "InputDevice"
# generated from default
Identifier "Mouse0"
Driver "mouse"
Option "Protocol" "auto"
Option "Device" "/dev/psaux"
Option "Emulate3Buttons" "no"
Option "ZAxisMapping" "4 5"
EndSection
Section "InputDevice"
# generated from default
Identifier "Keyboard0"
Driver "kbd"
EndSection
Section "Monitor"
Identifier "Monitor0"
VendorName "Unknown"
ModelName "Unknown"
HorizSync 28.0 - 33.0
VertRefresh 43.0 - 72.0
Option "DPMS"
EndSection
Section "Device"
Identifier "Device0"
Driver "nvidia"
VendorName "NVIDIA Corporation"
BoardName "Tesla T4"
BusID "0:4:0"
Option "AllowEmptyInitialConfiguration"
EndSection
Section "Screen"
Identifier "Screen0"
Device "Device0"
Monitor "Monitor0"
DefaultDepth 24
Option "UseDisplayDevice" "None"
SubSection "Display"
Virtual 1280 1024
Depth 24
EndSubSection
EndSection
Any suggestions on this? I have also run into this issue.
I found a solution that works for me: delete or comment out (with "#") the ServerLayout and Screen sections in the /etc/X11/xorg.conf file.
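A minimal sketch of that edit in Python, for anyone who wants to script it. The function name `comment_out_sections` is made up for illustration, and the matching assumes the section keywords are spelled exactly as nvidia-xconfig writes them (`Section "..."` / `EndSection`); it is not an official tool.

```python
import re

# Matches the opening line of a top-level section, e.g.: Section "Screen"
# (a 'SubSection' line does not match, because the pattern is anchored
# at the start of the stripped line).
_SECTION_START = re.compile(r'Section\s+"([^"]+)"')


def comment_out_sections(conf_text, names=("ServerLayout", "Screen")):
    """Return conf_text with every line of the named sections prefixed by '#'."""
    out = []
    inside = False
    for line in conf_text.splitlines():
        stripped = line.strip()
        match = _SECTION_START.match(stripped)
        if match and match.group(1) in names:
            inside = True
        if inside:
            out.append("#" + line)
            # Only the top-level terminator closes the section;
            # 'EndSubSection' is commented out but does not end it.
            if stripped == "EndSection":
                inside = False
        else:
            out.append(line)
    return "\n".join(out) + "\n"
```

To apply it, read /etc/X11/xorg.conf into a string, keep a backup copy, write the returned text back (this needs root), and restart the X server.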
Same issue and same solution for the Tesla V100.
For me, just removing Option "UseDisplayDevice" "none" from the "Screen" section also does the trick.
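This narrower change can be sketched the same way. The helper name `disable_use_display_device` is hypothetical, and the sketch comments the option out wherever it appears rather than only inside the "Screen" section, which is equivalent for the configs posted in this thread since the option occurs only there.

```python
import re

# Matches an uncommented line such as: Option "UseDisplayDevice" "None"
_USE_DISPLAY_DEVICE = re.compile(r'^\s*Option\s+"UseDisplayDevice"', re.IGNORECASE)


def disable_use_display_device(conf_text):
    """Return conf_text with any UseDisplayDevice option line commented out."""
    lines = []
    for line in conf_text.splitlines():
        if _USE_DISPLAY_DEVICE.match(line):
            line = "#" + line
        lines.append(line)
    return "\n".join(lines) + "\n"
```

As with any edit to /etc/X11/xorg.conf, it is worth keeping a backup and restarting the X server afterwards.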
@zhenghongzhi @juge2 guys you've helped us so much! thank you!
Update: GCP tutorial suggests using T4 GPU to save costs, but fails when using T4 GPU (error below)
Hi, I am following the tutorial "Training an Obstacle Tower agent using Dopamine and the Google Cloud Platform".
I am getting the following error - I believe the problem is
(EE) NVIDIA(GPU-0): UseDisplayDevice "None" is not supported with GRID
- but I'm not sure of the root cause. I was trying to use the T4 GPU to save $$ - I will try again with the default GPU.
after typing
I get this error
/var/log/Xorg.0.log
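For anyone checking whether they are hitting the same problem, here is a small sketch that pulls the error lines out of an Xorg log. Xorg marks errors with `(EE)`; the helper name `xorg_error_lines` and the assumed line layout (an optional `[ seconds ]` timestamp before the marker) are illustrative assumptions, not part of any Xorg tooling.

```python
import re

# An error line starts with an optional "[   12.345]" timestamp, then "(EE)".
# Anchoring at the start skips the log's marker legend, which mentions "(EE)"
# in the middle of a line.
_EE_LINE = re.compile(r'^\s*(?:\[\s*\d+\.\d+\]\s*)?\(EE\)')


def xorg_error_lines(log_text):
    """Return the lines of an Xorg log that carry the (EE) error marker."""
    return [line for line in log_text.splitlines() if _EE_LINE.match(line)]
```

Reading /var/log/Xorg.0.log into a string and passing it to this function should surface a line like the `UseDisplayDevice "None" is not supported with GRID` error quoted above, if that is the cause.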