Gdev is a rich set of open-source software for NVIDIA GPGPU technology, containing device drivers, CUDA runtimes, CUDA/PTX compilers, and some utility tools. Currently it only supports NVIDIA GPUs and Linux but is, by design, portable to other GPUs and platforms as well. The supported API implementations include:
The CUDA Driver API and CUDA Runtime API implementations are built on top of the Gdev API. For the CUDA Runtime API we use GPU Ocelot as a front-end implementation. You can also add your favorite high-level API, other than the CUDA Driver/Runtime APIs, on top of the Gdev API.
Gdev provides runtime support in both the device driver and the user-space library. Device-driver runtime support is a unique feature of Gdev, whereas most existing GPGPU programming frameworks take user-space approaches. With device-driver runtime support, Gdev allows the OS to manage GPUs as first-class citizens and execute CUDA programs itself. Gdev's user-space runtime support is also unique in the sense that it is available for multiple open-source and proprietary device drivers. The supported device drivers include:
To summarize, Gdev offers the following advantages:
The recommended way to build and install Gdev is with CMake.
Otherwise, you can choose one of the following drivers (obsolete):
Once the driver is successfully installed, you can install a high-level API:
NVIDIA graphics cards run at very low clocks by default. To get full performance, you need to reclock your card to the maximum level. To do so, become root and echo 3 to the following file:
echo 3 > /sys/class/drm/card0/device/performance_level
You can downclock your card by echoing 0 to the same file, i.e.,
echo 0 > /sys/class/drm/card0/device/performance_level
There are intermediate levels, 1 and 2, too. Note that reclocking is not completely supported by the open-source solution yet. Some performance levels are still missing, and hence you may not get as high performance as the blob (NVIDIA's proprietary driver). If you really need the same level of performance as the blob, you can run a long-running CUDA program with the blob and do kexec -f your kernel before the program is finished. The clock then remains at the maximum level.
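The reclocking steps above can be wrapped in a small helper. The following is a minimal sketch, not part of Gdev: the function name set_perf_level and its optional second argument are hypothetical, and it assumes the sysfs file and the levels 0 to 3 described above. It validates the level and checks that the file is writable before echoing to it.

```shell
#!/bin/sh
# set_perf_level LEVEL [FILE]
# Hypothetical helper (not shipped with Gdev). FILE defaults to the sysfs
# path given in the text; passing another path is only useful for testing.
set_perf_level() {
    level="$1"
    file="${2:-/sys/class/drm/card0/device/performance_level}"

    # The text mentions levels 0 (lowest) through 3 (highest) only.
    case "$level" in
        0|1|2|3) ;;
        *) echo "set_perf_level: level must be 0, 1, 2 or 3" >&2
           return 1 ;;
    esac

    # Writing normally requires root and a driver that exposes this file.
    if [ ! -w "$file" ]; then
        echo "set_perf_level: cannot write $file (are you root?)" >&2
        return 1
    fi

    echo "$level" > "$file"
}
```

For example, after sourcing the snippet as root, set_perf_level 3 requests the maximum level and set_perf_level 0 downclocks the card again. Whether a given level actually takes effect still depends on the driver's reclocking support.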
Today many CUDA programs are written using the CUDA Runtime API. If you want to test the CUDA Driver API, try the following benchmarks and apps.
git@github.com:shinpei0208/gdev-app.git
git@github.com:shinpei0208/gdev-bench.git
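The two addresses above are in GitHub's SSH form, which requires an SSH key registered with GitHub. As a hedged convenience, the sketch below rewrites such an address into GitHub's standard HTTPS form; the helper name ssh_to_https is hypothetical and not part of Gdev.

```shell
#!/bin/sh
# ssh_to_https URL
# Hypothetical helper: rewrites git@github.com:owner/repo.git into
# https://github.com/owner/repo.git. Other inputs pass through unchanged.
ssh_to_https() {
    printf '%s\n' "$1" | sed -e 's#^git@github\.com:#https://github.com/#'
}
```

With the helper sourced, git clone "$(ssh_to_https git@github.com:shinpei0208/gdev-bench.git)" would fetch the benchmarks without an SSH key, assuming the repository is publicly readable.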
Copyright (C) Shinpei Kato
Nagoya University
Parallel and Distributed Systems Lab (PDSL)
http://pdsl.jp
University of California, Santa Cruz
Systems Research Lab (SRL)
http://systems.soe.ucsc.edu
All Rights Reserved.