Closed Ricks-Lab closed 3 years ago
Here are details for Ubuntu:
lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 18.04.4 LTS
Release: 18.04
Codename: bionic
cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=18.04
DISTRIB_CODENAME=bionic
DISTRIB_DESCRIPTION="Ubuntu 18.04.4 LTS"
cat /etc/os-release
NAME="Ubuntu"
VERSION="18.04.4 LTS (Bionic Beaver)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 18.04.4 LTS"
VERSION_ID="18.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=bionic
UBUNTU_CODENAME=bionic
cat /proc/version
Linux version 5.3.0-46-generic (buildd@lcy01-amd64-013) (gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04)) #38~18.04.1-Ubuntu SMP Tue Mar 31 04:17:56 UTC 2020
hostnamectl
Static hostname: nexon
Pretty hostname: Nexon
Icon name: computer-server
Chassis: server
Machine ID: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Boot ID: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Operating System: Ubuntu 18.04.4 LTS
Kernel: Linux 5.3.0-46-generic
Architecture: x86-64
Hello, here are the details for Arch Linux:
$ lsb_release -a
LSB Version: 1.4
Distributor ID: Arch
Description: Arch Linux
Release: rolling
Codename: n/a
$ cat /etc/lsb-release
LSB_VERSION=1.4
DISTRIB_ID=Arch
DISTRIB_RELEASE=rolling
DISTRIB_DESCRIPTION="Arch Linux"
$ cat /etc/os-release
NAME="Arch Linux"
PRETTY_NAME="Arch Linux"
ID=arch
BUILD_ID=rolling
ANSI_COLOR="0;36"
HOME_URL="https://www.archlinux.org/"
DOCUMENTATION_URL="https://wiki.archlinux.org/"
SUPPORT_URL="https://bbs.archlinux.org/"
BUG_REPORT_URL="https://bugs.archlinux.org/"
LOGO=archlinux
$ cat /proc/version
Linux version 5.6.5-arch3-1 (linux@archlinux) (gcc version 9.3.0 (Arch Linux 9.3.0-1)) #1 SMP PREEMPT Sun, 19 Apr 2020 13:14:25 +0000
$ hostnamectl
Static hostname: arnold
Icon name: computer-laptop
Chassis: laptop
Machine ID: *********
Boot ID: *********
Operating System: Arch Linux
Kernel: Linux 5.6.5-arch3-1
Architecture: x86-64
Also, the Arch Linux package manager is called pacman. The Arch wiki is known to be complete and precise, so you will find all the information you need there to query the package database. I think you can group the Arch Linux distro-specific commands with those of Arch-based distros.
Output for Gentoo. Gentoo uses portage as its package manager.
lsb_release -a
LSB Version: n/a
Distributor ID: Gentoo
Description: Gentoo Base System release 2.6
Release: 2.6
Codename: n/a
cat /etc/lsb-release
DISTRIB_ID="Gentoo"
cat /etc/os-release
NAME=Gentoo
ID=gentoo
PRETTY_NAME="Gentoo/Linux"
ANSI_COLOR="1;32"
HOME_URL="https://www.gentoo.org/"
SUPPORT_URL="https://www.gentoo.org/support/"
BUG_REPORT_URL="https://bugs.gentoo.org/"
cat /proc/version
Linux version 5.6.2-gentoo (root@gentoo) (gcc version 9.2.0 (Gentoo 9.2.0-r2 p3)) #1 SMP Mon Apr 6 10:33:34 EDT 2020
@smoe Are you using a Debian distribution? Can you provide lsb_release -a output? I am updating amdgpu-chk to indicate verified distros.
$ lsb_release -a
No LSB modules are available.
Distributor ID: Debian
Description: Debian GNU/Linux bullseye/sid
Release: unstable
Codename: sid
$ lsb_release -a
No LSB modules are available.
Distributor ID: Debian
Description: Debian GNU/Linux 10 (buster)
Release: 10
Codename: buster
$ lsb_release -a
No LSB modules are available.
Distributor ID: Debian
Description: Debian GNU/Linux bullseye/sid
Release: testing
Codename: bullseye
I would now like to implement an amdgpu driver confirmation function for Arch and Gentoo. To write a parser, I need the output of the appropriate command for checking whether the package is installed. Could you post that output here:
pacman -Qs amdgpu rocm
For Gentoo, there are many ways to find out whether a package is installed. One is equery, which is part of the gentoolkit package:
Example of package not installed
$ equery list dev-libs/amdgpu-pro-opencl
!!! No installed packages matching 'dev-libs/amdgpu-pro-opencl'
* Searching for amdgpu-pro-opencl in dev-libs ...
Example of package installed
$ equery list dev-libs/amdgpu-pro-opencl
* Searching for amdgpu-pro-opencl ...
[IP-] [ ] dev-libs/amdgpu-pro-opencl-19.30.838629:0
The I in the brackets indicates the package is currently installed and P indicates the package is available in the Portage tree.
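For the parser side, here is a minimal Python sketch of how the equery output above could be read. It is not the amdgpu-utils implementation; the function name `equery_installed` and the regular expression are assumptions made for illustration, based only on the two sample outputs shown in this comment.

```python
import re

# Hypothetical pattern for an `equery list` result line such as:
#   [IP-] [  ] dev-libs/amdgpu-pro-opencl-19.30.838629:0
# First bracket: status flags, second bracket: masking info, then the atom.
_EQUERY_LINE = re.compile(r"^\[(?P<flags>[^\]]*)\]\s+\[[^\]]*\]\s+(?P<atom>\S+)")


def equery_installed(output):
    """Return the package atoms that equery flags as installed ('I')."""
    installed = []
    for line in output.splitlines():
        match = _EQUERY_LINE.match(line.strip())
        if match and "I" in match.group("flags"):
            installed.append(match.group("atom"))
    return installed
```

An empty list then means "not installed", which covers the "!!! No installed packages matching" case without having to match that message text.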
qlist, which is part of the portage-utils package, can also be used. When the package is installed, qlist shows the query package name. When the package is not installed, nothing is displayed.
$ qlist -I dev-libs/amdgpu-pro-opencl
dev-libs/amdgpu-pro-opencl
$ qlist -I dev-libs/amdgpu-pro-opencl
$
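Since qlist prints the package name when it is installed and nothing otherwise, the check reduces to looking for that name in the output. A small Python sketch under that assumption follows; the helper names `output_lists_package` and `qlist_installed` are hypothetical, and returning `None` when qlist is missing is just one possible convention.

```python
import shutil
import subprocess


def output_lists_package(output, package):
    """True if any line of `qlist -I` output is exactly the queried package."""
    return any(line.strip() == package for line in output.splitlines())


def qlist_installed(package):
    """Run `qlist -I <package>`; return None when qlist itself is unavailable."""
    if shutil.which("qlist") is None:
        return None  # portage-utils not installed, caller should try another tool
    result = subprocess.run(
        ["qlist", "-I", package],
        stdout=subprocess.PIPE,
        universal_newlines=True,
    )
    return output_lists_package(result.stdout, package)
```

Keeping the output check separate from the subprocess call makes it possible to test the logic on a machine that is not running Gentoo.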
Lastly, if neither of those packages is installed, emerge can be used to check whether the package is installed.
Installed
$ emerge -p dev-libs/amdgpu-pro-opencl
These are the packages that would be merged, in order:
Calculating dependencies... done!
[ebuild Rf ~] dev-libs/amdgpu-pro-opencl-19.30.838629
Not installed
$ emerge -p dev-libs/amdgpu-pro-opencl
These are the packages that would be merged, in order:
Calculating dependencies... done!
[ebuild N ] dev-util/patchelf-0.10
[ebuild N F ] dev-libs/amdgpu-pro-opencl-19.30.838629 ABI_X86="32 (64)"
The letters contained in the brackets could be any of the following:
N new (not yet installed)
S new SLOT installation (side-by-side versions)
U updating (to another version)
D downgrading (best version seems lower)
r reinstall (forced for some reason, possibly due to slot or sub-slot)
R replacing (remerging same version)
F fetch restricted (must be manually downloaded)
f fetch restricted (already downloaded)
I interactive (requires user input)
B blocked by another package (unresolved conflict)
b blocked by another package (automatically resolved conflict)
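To show how those bracket letters could be used by a parser, here is a rough Python sketch that pulls the flags for one package out of `emerge -p` output. The names `emerge_flags` and `looks_installed` are hypothetical, and treating "no N flag" as "installed" is an assumption drawn from the two examples above, not a rule taken from the Portage documentation.

```python
import re

# Hypothetical pattern for lines such as:
#   [ebuild   Rf   ~] dev-libs/amdgpu-pro-opencl-19.30.838629
_EBUILD_LINE = re.compile(r"^\[ebuild\s+(?P<flags>[^\]]*)\]\s+(?P<atom>\S+)")


def emerge_flags(output, package):
    """Return the set of status letters for `package`, or None if not listed."""
    for line in output.splitlines():
        match = _EBUILD_LINE.match(line.strip())
        # Simple prefix match on "category/name-"; a sketch, not a full atom parser.
        if match and match.group("atom").startswith(package + "-"):
            return {c for c in match.group("flags") if c.isalpha()}
    return None


def looks_installed(output, package):
    """Assumption: a package without the 'N' (new) flag is already installed."""
    flags = emerge_flags(output, package)
    if flags is None:
        return None
    return "N" not in flags
```

Non-letter markers such as `~` are dropped, so only the letters from the list above end up in the returned set.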
I hope these examples give you an idea of how to query for the package. I did not have to install rocm or similar packages. I haven't tried using all of the utilities built, as I am having an issue with vext in the virtual environment when running amdgpu-monitor. Vext is installed but I ran into the issue described here: https://github.com/stuaxo/vext/issues/61. I haven't looked into this further though, as the Gentoo system with the AMD GPU is headless and runs BOINC in the background.
$ ./amdgpu-monitor
Vext disabled: There was an issue getting the system site packages.
Vext disabled: There was an issue getting the system site packages.
gi import error: No module named 'gi'
gi is required for amdgpu-monitor
In a venv, first install vext: pip install --no-cache-dir vext
Then install vext.gi: pip install --no-cache-dir vext.gi
* **Arch** @berturion - I think the command will be `pacman -Qs amdgpu rocm`
Specifying amdgpu and rocm on the same command line results in empty output.
Also, on my machine pacman -Qs rocm outputs nothing.
$ pacman -Qs amdgpu
local/xf86-video-amdgpu 19.1.0-1 (xorg-drivers)
X.org amdgpu video driver
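Based on that single sample, a parser for `pacman -Qs` output could look like the Python sketch below. The function name `parse_pacman_qs` is hypothetical, and the assumption is that every installed match starts with a `local/<name> <version>` line followed by a description line. Since combining both terms returned nothing here, the sketch is meant to be fed the output of one search term at a time.

```python
def parse_pacman_qs(output):
    """Map package name -> version for each 'local/...' line in `pacman -Qs` output."""
    packages = {}
    for line in output.splitlines():
        if not line.startswith("local/"):
            continue  # description lines and blank lines are skipped
        fields = line.split()
        if len(fields) < 2:
            continue  # unexpected layout, ignore rather than guess
        name = fields[0].split("/", 1)[1]
        packages[name] = fields[1]
    return packages
```

An empty dictionary covers the case where pacman prints nothing, as with `pacman -Qs rocm` above.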
Running in a venv is not required. You can run the following to meet the package requirements without venv:
sudo -H pip3 install --no-cache-dir -r requirements.txt
@berturion I have made the change. Let me know when you have a chance to try it out.
@CH3CN I have made the change. Let me know when you have a chance to try it out.
If it doesn't work, please run amdgpu-ls --debug for more details.
Ok, I will as soon as possible.
It's a pleasure to help. Though, I think you could save time by installing Arch Linux in a VM and trying the commands directly without depending on my feedback. Arch is VERY easy to install with https://www.anarchylinux.org/ in VirtualBox, or written to a USB stick with Balena Etcher or the dd command. It is just a suggestion. I am happy to help.
Hi, author of vext here - I had a quick go at installing amdgpu and can reproduce the issue with vext. I've got a little bit of free time coming up next week, so hopefully I should be able to have a look at this then.
Cheers S
@Ricks-Lab, sorry for the delay in responding. I just had time yesterday to check out the latest changes. I did not run the utils in a venv, so I didn't have to worry about the vext issue. The only issue I have is that the amdgpu-monitor display is not wide enough to show the full name of the "Model". Also, what's the graceful way of terminating amdgpu-monitor? I have been using ctrl-c.
I also tried using ROCm instead of the amdgpu-pro-opencl package from Gentoo but my processor and hardware are too old to meet the requirements (PCIe v3 and atomics).
ch3cn@gentoo ~/amdgpu-utils-master $ ./amdgpu-chk
Using python 3.7.7
Python version OK.
Using Linux Kernel 5.7.4-gentoo
OS kernel OK.
Using Linux distribution: Gentoo Base System release 2.6
Distro has been Validated.
Command dpkg not found. Can not determine amdgpu version.
gpu-utils can still be used.
python3 venv is installed
python3-venv OK.
amdgpu-utils-env is NOT available
amdgpu-utils-env should be configured per User Guide.
Environment not configured. WARNING
Not in amdgpu-utils-env (Only needed if you want to duplicate dev env)
amdgpu-utils-env can be activated per User Guide.
ch3cn@gentoo ~/amdgpu-utils-master $ ./amdgpu-ls
Detected GPUs: INTEL: 1, AMD: 1
AMD: amdgpu version: dev-libs/amdgpu-pro-opencl-19.50.967956
AMD: Wattman features enabled: 0xfffd7fff
2 total GPUs, 1 rw, 0 r-only, 0 w-only
Card Number: 0
Vendor: INTEL
Readable: False
Writable: False
Compute: False
Device ID: {'device': '0x0112', 'subsystem_device': '0x0112', 'subsystem_vendor': '0x1849', 'vendor': '0x8086'}
Decoded Device ID: 2nd Generation Core Processor Family Integrated Graphics Controller
Card Model: Intel Corporation 2nd Generation Core Processor Family Integrated Graphics Controller (rev 09)
PCIe ID: 00:02.0
Driver: i915
GPU Type: Unsupported
HWmon: None
Card Path: /sys/class/drm/card0/device
System Card Path: /sys/devices/pci0000:00/0000:00:02.0
Card Number: 1
Vendor: AMD
Readable: True
Writable: True
Compute: True
GPU UID: None
Device ID: {'device': '0x67ef', 'subsystem_device': '0x22de', 'subsystem_vendor': '0x1458', 'vendor': '0x1002'}
Decoded Device ID: Baffin [Radeon RX 460/560D / Pro 450/455/460/555/555X/560/560X]
Card Model: Advanced Micro Devices, Inc. [AMD/ATI] Baffin [Radeon RX 460/560D / Pro 450/455/460/555/555X/560/560X] (rev cf)
Display Card Model: Baffin [Radeon RX 460/560D / Pro 450/455/460/555/555X/560/560X]
PCIe ID: 01:00.0
Link Speed: 5.0 GT/s PCIe
Link Width: 8
##################################################
Driver: amdgpu
vBIOS Version: xxx-xxx-xxx
Compute Platform: OpenCL 1.2 AMD-APP (3004.6)
GPU Type: PStates
HWmon: /sys/class/drm/card1/device/hwmon/hwmon2
Card Path: /sys/class/drm/card1/device
System Card Path: /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0
##################################################
Current Power (W): 37.179
Power Cap (W): 48.000
Power Cap Range (W): [0, 72]
Fan Enable: 0
Fan PWM Mode: [2, 'Dynamic']
Fan Target Speed (rpm): 973
Current Fan Speed (rpm): 973
Current Fan PWM (%): 31
Fan Speed Range (rpm): [0, 4600]
Fan PWM Range (%): [0, 100]
##################################################
Current GPU Loading (%): 77
Current Memory Loading (%): 64
Current GTT Memory Usage (%): 45.289
Current GTT Memory Used (GB): 1.359
Total GTT Memory (GB): 3.000
Current VRAM Usage (%): 97.992
Current VRAM Used (GB): 1.960
Total VRAM (GB): 2.000
Current Temps (C): {'edge': 61.0}
Critical Temps (C): {'edge': 94.0}
Current Voltages (V): {'vddgfx': 1031}
Vddc Range: ['800mV', '1150mV']
Current Clk Frequencies (MHz): {'mclk': 1750.0, 'sclk': 1212.0}
Current SCLK P-State: [7, '1212Mhz']
SCLK Range: ['214MHz', '1800MHz']
Current MCLK P-State: [1, '1750Mhz']
MCLK Range: ['300MHz', '2000MHz']
Power Profile Mode: 1-3D_FULL_SCREEN
Power DPM Force Performance Level: auto
┌─────────────┬────────────────┐
│Card # │card1 │
├─────────────┼────────────────┤
│Model │Baffin [Radeon R│
│GPU Load % │100 │
│Mem Load % │0 │
│VRAM Usage % │98.889 │
│GTT Usage % │44.691 │
│Power (W) │29.248 │
│Power Cap (W)│48.0 │
│Energy (kWh) │0.09 │
│T (C) │55.0 │
│VddGFX (mV) │1031 │
│Fan Spd (%) │31 │
│Sclk (MHz) │1212 │
│Sclk Pstate │7 │
│Mclk (MHz) │1750 │
│Mclk Pstate │1 │
│Perf Mode │1-3D_FULL_SCREEN│
└─────────────┴────────────────┘
Yes, ctrl-c is the expected way to terminate when not running with the --gui option. Can you try that, just to make sure it works in your distro? Also it would be good to check out amdgpu-plot. I have also posted a PyPI package and have started another issue thread to discuss issues with it, if you want to give it a try.
The model names are purposely truncated, as they can be too long for a useful display of multiple GPUs.
In case you don't see my post in the other thread about your python command, the correct invocation is:
pip3 install ricks-amdgpu-utils
Great. The computer is headless but I used X forwarding over SSH to run amdgpu-monitor --gui and amdgpu-plot. It worked fine. The time indicated at the top of the plot window seems to be displaying in UTC. Is there a way to have it displayed in local time zone?
I will try the PyPI package tomorrow.
Thanks for checking it out! Looks like everything works. To use local time zone instead of UTC, just use the --ltz option.
A little more progress on the vext front.
Under virtualenv and current setuptools everything is installed correctly - except the pth that enables it.
You can fix this by running $ vext -e
I would like to build in distribution dependent behavior and need help determining distribution specific commands. This includes the method of determining which distribution is used and which command is used to determine if a package is installed.
lsb_release, /etc/*-release, /proc/version, or hostnamectl
dpkg for Debian
Here is the list of distributions that I am aware are being used:
Looking for feedback on distro behavior. Thanks!
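As a starting point for discussion, here is a minimal Python sketch of one way the distribution could be identified from /etc/os-release and mapped to a package query tool. The function names and the `PACKAGE_TOOLS` mapping are hypothetical and only reflect the tools mentioned in this thread (dpkg, pacman, equery); this is not how amdgpu-chk is implemented.

```python
def parse_os_release(text):
    """Parse KEY=VALUE lines from os-release content into a dict."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        info[key] = value.strip().strip('"').strip("'")
    return info


def distro_ids(info):
    """Return ID followed by any ID_LIKE entries, e.g. ['ubuntu', 'debian']."""
    ids = [info.get("ID", "")] + info.get("ID_LIKE", "").split()
    return [i for i in ids if i]


# Hypothetical mapping, taken from the tools named in this thread.
PACKAGE_TOOLS = {"debian": "dpkg", "arch": "pacman", "gentoo": "equery"}


def pick_package_tool(ids):
    """Return the first known package tool for the given distro ids, else None."""
    for distro in ids:
        if distro in PACKAGE_TOOLS:
            return PACKAGE_TOOLS[distro]
    return None
```

Usage would be along the lines of `pick_package_tool(distro_ids(parse_os_release(open("/etc/os-release").read())))`. Checking ID_LIKE as well as ID is what would let derivative distros fall through to the tool of their parent distro.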