Ricks-Lab / gpu-utils

A set of utilities for monitoring and customizing GPU performance
GNU General Public License v3.0
amdgpu boinc einsteinathome gpu-computing gpu-monitoring gpu-settings gpu-utils linux milkyway overclock python3 setiathome

Ricks-Lab GPU Utilities


rickslab-gpu-utils

A set of utilities for monitoring GPU performance and modifying control settings.

To get the most out of these utilities, run a kernel that supports the GPUs you have installed. If using AMD GPUs, installing the latest amdgpu driver or ROCm package may provide additional capabilities. If you have Nvidia GPUs installed, nvidia-smi must also be installed for the utilities to read those cards. Writing to GPUs is currently possible only for compatible AMD GPUs, on systems with an appropriate kernel version and with the AMD ppfeaturemask set to enable this capability as described here.
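The prerequisites above can also be checked from a script. The following is a minimal sketch and not part of rickslab-gpu-utils (gpu-chk remains the supported check). It looks for nvidia-smi on the PATH and reads the amdgpu ppfeaturemask module parameter. The sysfs path and the overdrive bit value 0x4000 are assumptions based on the upstream amdgpu driver, and all function names are hypothetical.

```python
import shutil
from pathlib import Path

# Assumption: the amdgpu module exposes its parameter at this sysfs path.
PPFEATUREMASK_PATH = Path("/sys/module/amdgpu/parameters/ppfeaturemask")
# Assumption: bit 14 (0x4000) is the overdrive bit in the amdgpu driver.
OVERDRIVE_BIT = 0x4000


def parse_mask(text: str) -> int:
    """Parse a mask written as hex ("0xffffffff") or decimal ("4294967295")."""
    return int(text.strip(), 0)


def overdrive_enabled(mask: int) -> bool:
    """Return True if the assumed overdrive bit is set in the mask."""
    return bool(mask & OVERDRIVE_BIT)


def check_environment() -> dict:
    """Collect a few facts about the local GPU tooling (hypothetical helper)."""
    report = {
        "nvidia_smi": shutil.which("nvidia-smi") is not None,
        "ppfeaturemask": None,
        "overdrive": None,
    }
    try:
        mask = parse_mask(PPFEATUREMASK_PATH.read_text())
    except (OSError, ValueError):
        # amdgpu not loaded, file unreadable, or unexpected format.
        return report
    report["ppfeaturemask"] = hex(mask)
    report["overdrive"] = overdrive_enabled(mask)
    return report


if __name__ == "__main__":
    print(check_environment())
```

A value of None for ppfeaturemask only means the parameter could not be read on this system; it says nothing about whether the GPU itself is compatible.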

Installation

There are 4 methods of installation available, summarized here:

If you get a key expired message during apt update, try updating the project PUBLIC.KEY with the following command:

wget -q -O - https://debian.rickslab.com/PUBLIC.KEY | sudo gpg --dearmour -o /usr/share/keyrings/rickslab-agent.gpg

User Guide

For a detailed introduction, a community sourced User Guide is available. All tools are demonstrated and use cases are presented. Additions to the guide are welcome. Please submit a pull request with your suggested additions!

Commands

A summary of command line tools available in rickslab-gpu-utils follows. Additional details are available in man pages and the User Guide.

gpu-chk

This utility verifies whether the user's environment is compatible with rickslab-gpu-utils.

gpu-ls

This utility displays the most relevant parameters for installed and compatible GPUs. The default behavior is to list relevant parameters by GPU. OpenCL platform information is added when the --clinfo option is used. A brief listing of key parameters is available with the --short command line option. A simplified table of current GPU state is displayed with the --table option. The --no_fan option can be used to ignore fan settings. The --pstate option can be used to output the p-state table for each GPU instead of the list of basic parameters. The --ppm option is used to output the table of available power/performance modes instead of basic parameters. The --features option is used to output the table of amdgpu pp features and their status instead of basic parameters. The --force_all option results in an attempt to read all possible sensors, regardless of how the GPU is classified. The --raw option reads all possible driver files and displays their contents, along with an indicator of whether a gpu-utils keyword and description is associated with each file. The --verbose option will display progress and informational messages generated by the utilities. By default, output data is formatted and color coded; the --no_markup option can be specified to get plain text.
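Because --no_markup yields plain text, gpu-ls output can be captured from a script. The following is a minimal sketch, assuming gpu-ls is installed and on the PATH; it uses only the --no_markup and --short options named above, and the function names are hypothetical. The format of the returned text is not specified here, so the sketch returns it unparsed.

```python
import shutil
import subprocess


def gpu_ls_command(short: bool = True) -> list:
    """Build the gpu-ls argument list for plain-text output."""
    cmd = ["gpu-ls", "--no_markup"]
    if short:
        cmd.append("--short")
    return cmd


def run_gpu_ls(short: bool = True):
    """Return gpu-ls output as text, or None if gpu-ls is not on the PATH."""
    if shutil.which("gpu-ls") is None:
        return None
    result = subprocess.run(
        gpu_ls_command(short), capture_output=True, text=True, check=False
    )
    return result.stdout
```

Calling run_gpu_ls() on a system without the utilities installed returns None rather than raising, which keeps the caller's error handling simple.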

gpu-mon

A utility to give the current state of all compatible GPUs. The default behavior is to continuously update a text-based table in the current window until Ctrl-C is pressed. With the --gui option, a table of relevant parameters will be updated in a Gtk window. You can specify the delay between updates with the --sleep N option, where N is an integer greater than zero that specifies the number of seconds to sleep between updates. The --no_fan option can be used to disable the reading and display of fan information. The --log option is used to write all monitor data to a psv log file. When writing to a log file, the utility will indicate this in red at the top of the window with a message that includes the log file name. The --plot option will display a plot of critical GPU parameters which updates at the specified --sleep N interval. If you need both the plot and monitor displays, then using the --plot option is preferred over running both tools, since a single read of the GPUs is used to update both displays. The --ltz option results in the use of local time instead of UTC. The --verbose option will display progress and informational messages generated by the utilities.
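Since the log written by --log is pipe-separated (psv), it can be read with Python's csv module by setting the delimiter. The following is a minimal sketch, assuming the first line of the log is a header row. The column names in the sample (Time, Card, Temp) are hypothetical and will differ from the actual gpu-mon columns.

```python
import csv
import io


def read_psv(stream):
    """Read pipe-separated text with a header row into a list of dicts."""
    reader = csv.DictReader(stream, delimiter="|")
    return [dict(row) for row in reader]


# Hypothetical sample data; a real gpu-mon log has different columns.
SAMPLE = "Time|Card|Temp\n12:00:00|0|55.0\n12:00:02|0|57.0\n"

rows = read_psv(io.StringIO(SAMPLE))
max_temp = max(float(row["Temp"]) for row in rows)
print(f"{len(rows)} samples, max temp {max_temp}")
```

For a saved file, open it with newline="" as the csv module recommends and pass the file object to read_psv.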

gpu-plot

A utility to continuously plot the trend of critical GPU parameters for all compatible GPUs. The --sleep N option can be used to specify the update interval. The gpu-plot utility has 2 modes of operation. The default mode is to read the GPU driver details directly, which is useful as a standalone utility. The --stdin option causes gpu-plot to read GPU data from stdin. This is how gpu-mon produces the plot, and it can also be used to pipe your own data into the process. The --simlog option can be used with the --stdin option when a monitor log file is piped as stdin. This is useful for troubleshooting and can be used to display saved log results. The --ltz option results in the use of local time instead of UTC. If you plan to run both gpu-plot and gpu-mon, then the --plot option of the gpu-mon utility should be used instead of both utilities in order to reduce data reads by a factor of 2. The --verbose option will display progress and informational messages generated by the utilities.
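Replaying a saved monitor log, as described above, amounts to piping the file into gpu-plot with --stdin and --simlog. The following is a minimal sketch, assuming gpu-plot is on the PATH and that the log was produced by gpu-mon --log. The function name, the -1 return convention, and the log path in the usage note are hypothetical.

```python
import shutil
import subprocess

# Options taken from the description above: read from stdin, simulate from a log.
REPLAY_CMD = ["gpu-plot", "--stdin", "--simlog"]


def replay_log(log_path: str) -> int:
    """Pipe a saved monitor log into gpu-plot.

    Returns the gpu-plot exit code, or -1 if gpu-plot is not installed.
    """
    if shutil.which("gpu-plot") is None:
        return -1
    with open(log_path, "rb") as log_file:
        completed = subprocess.run(REPLAY_CMD, stdin=log_file, check=False)
    return completed.returncode
```

A call such as replay_log("monitor.log") is the script equivalent of redirecting the file to gpu-plot's stdin in a shell.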

gpu-pac

Program and control compatible GPUs with this utility. By default, the commands to be written to a GPU are written to a bash file for the user to inspect and run. If you have confidence, the --execute_pac option can be used to execute and then delete the saved bash file. Since the GPU device files are writable only by root, sudo is used to execute commands in the bash file. As a result, you will be prompted for credentials in the terminal where you executed gpu-pac. The --no_fan option can be used to eliminate fan details from the utility. The --force_write option can be used to force all configuration parameters to be written to the GPU. The default behavior is to only write changes. The --verbose option will display progress and informational messages generated by the utilities.
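The write-then-inspect workflow described above is a general pattern for privileged changes. The sketch below illustrates the idea only. It is not gpu-pac's implementation, and the function names, the script file name, and the echo command are hypothetical placeholders. It does not use sudo and writes nothing to a GPU.

```python
import os
import stat
import subprocess
import tempfile


def write_script(commands, directory):
    """Write commands to an executable bash file and return its path."""
    path = os.path.join(directory, "pending_changes.sh")
    with open(path, "w") as script:
        script.write("#!/bin/bash\nset -e\n")
        for command in commands:
            script.write(command + "\n")
    # Mark the file executable for its owner so the user can run it later.
    os.chmod(path, os.stat(path).st_mode | stat.S_IXUSR)
    return path


def apply_script(path, execute=False):
    """Run and delete the script only when execute is True.

    With execute=False (the default) the file is kept for inspection
    and None is returned.
    """
    if not execute:
        return None
    try:
        return subprocess.run(["bash", path], check=False).returncode
    finally:
        os.remove(path)


# Default path: the script is written but left untouched for review.
with tempfile.TemporaryDirectory() as workdir:
    script_path = write_script(["echo 'placeholder command'"], workdir)
    result = apply_script(script_path)
    kept = os.path.exists(script_path)
    with open(script_path) as script:
        contents = script.read()
```

Keeping execution opt-in means a mistaken setting is caught while reading the file rather than after it has been applied.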

New in Current Release - v3.9.0

Development Plans

Known Issues

References

History

New in Previous Release - v3.8.4

New in Previous Release - v3.8.3

New in Previous Release - v3.8.2

New in Previous Release - v3.8.0

New in Previous Release - v3.7.8

New in Previous Release - v3.7.7

New in Previous Release - v3.7.6

New in Previous Release - v3.7.5

New in Previous Release - v3.7.4

New in Previous Release - v3.7.3

New in Previous Release - v3.7.2

New in Previous Release - v3.7.1

New in Previous Release - v3.7.0

New in Previous Release - v3.6.2

New in Previous Release - v3.6.1

New in Previous Release - v3.6.0

New in Previous Release - v3.5.10

New in Previous Release - v3.5.9

New in Previous Release - v3.5.8

New in Previous Release - v3.5.7

New in Previous Release - v3.5.6

New in Previous Release - v3.5.5

New in Previous Release - v3.5.0

New in Previous Release - v3.3.14

New in Previous Release - v3.2.0

New in Previous Release - v3.0.0

New in Previous Release - v2.7.0

New in Previous Release - v2.6.0

New in Previous Release - v2.5.2

New in Previous Release - v2.5.1

New in Previous Release - v2.5.0

New in Previous Release - v2.4.0

New in Previous Release - v2.3.1

New in Previous Release - v2.3.0

New in Previous Release - v2.2.0

New in Previous Release - v2.1.0

New in Previous Release - v2.0.0

New in Previous Release - v1.1.0

New in Previous Release - v1.0.0