Closed by polarathene 2 weeks ago
Thanks to the help from the Rust community on Discord, I was given guidance on how to use the FFI call lib().cuDeviceGetName().
cudarc could provide a safe method that simplifies getting the device name and compute capability. I don't know how useful the VMM capability is, but I figure the other two might be common enough to want easy access to?
use cudarc::driver::CudaDevice;
use cudarc::driver::sys::{
    lib,
    cudaError_enum,
    CUdevice_attribute_enum as Attribute,
};
// The unsafe call needs `CStr` to convert the buffer into a native `String`:
use std::ffi::CStr;
// Simplify error handling:
use anyhow::Result;
fn main() -> Result<()> {
    let dev = CudaDevice::new(0)?;
    let device_index = dev.ordinal();
    let device_name = get_device_name(&dev)?;
    let (major, minor) = get_compute_capability(&dev)?;
    let supports_vmm = has_vmm_support(&dev)?;
    // Device 0: NVIDIA GeForce RTX 4060 Laptop GPU, compute capability: 8.9, VMM: true
    println!("Device {device_index}: {device_name}, compute capability: {major}.{minor}, VMM: {supports_vmm}");
    Ok(())
}
fn get_device_name(dev: &CudaDevice) -> Result<String> {
    // A buffer with sufficient size to store the string
    let mut buffer = vec![0u8; 64];
    // These unsafe methods require the `Lib` struct, get a static ref via `sys::lib()`:
    let result = unsafe { lib().cuDeviceGetName(
        buffer.as_mut_ptr().cast(), // <-- `name` expects a mutable `c_char` pointer to `buffer`
        buffer.len() as i32,        // <-- `len` expects the size of `buffer` in bytes
        *dev.cu_device()            // <-- `dev` requires a deref of the returned `&CUdevice`
    )};
    // `CUresult` enum returned, verify the operation was successful
    // and then return a `String` (requires converting `buffer` to `CStr` => `&str` => `String`)
    match result {
        cudaError_enum::CUDA_SUCCESS => {
            let device_name: String = CStr::from_bytes_until_nul(buffer.as_slice())?
                .to_str()?
                .to_owned();
            Ok(device_name)
        }
        _ => anyhow::bail!("Failed to query name of device: {}", dev.ordinal()),
    }
}
fn get_compute_capability(dev: &CudaDevice) -> Result<(u8, u8)> {
    let attr_major = Attribute::CU_DEVICE_ATTRIBUTE_COMPUTE_CAPABILITY_MAJOR;
    let attr_minor = Attribute::CU_DEVICE_ATTRIBUTE_COMPUTE_CAPABILITY_MINOR;
    match (dev.attribute(attr_major), dev.attribute(attr_minor)) {
        (Ok(major), Ok(minor)) => Ok((major as u8, minor as u8)),
        _ => anyhow::bail!("Failed to query compute capability of device: {}", dev.ordinal()),
    }
}
fn has_vmm_support(dev: &CudaDevice) -> Result<bool> {
    let attr_vmm = Attribute::CU_DEVICE_ATTRIBUTE_VIRTUAL_ADDRESS_MANAGEMENT_SUPPORTED;
    // The attribute is an `i32`; treat any non-zero value as `true`:
    Ok(dev.attribute(attr_vmm)? != 0)
}
[package]
name = "example-device-info"
version = "0.1.0"
edition = "2021"
[dependencies]
anyhow = "1.0.86"
cudarc = { version = "0.11.4", features = ["cuda-12040"] }
$ cargo run
Device 0: NVIDIA GeForce RTX 4060 Laptop GPU, compute capability: 8.9, VMM: true
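The out-buffer pattern used in get_device_name above can be tried without a GPU. The sketch below is std-only and swaps the driver call for a mock: fake_get_name is a hypothetical stand-in with a similar shape (a name pointer plus a length), not the real cuDeviceGetName, and "Example GPU" is made-up data. It only illustrates how a safe wrapper might own the buffer, check the status code, and convert the NUL-terminated bytes into a String.

```rust
use std::ffi::{c_char, c_int, CStr};

// Hypothetical stand-in for a driver call shaped like `cuDeviceGetName`.
// It writes a NUL-terminated name into `name` and returns 0 on success.
// This is a mock for illustration, NOT the real CUDA function.
unsafe fn fake_get_name(name: *mut c_char, len: c_int) -> c_int {
    let src = b"Example GPU\0";
    // Reject negative lengths before comparing sizes.
    let Ok(len) = usize::try_from(len) else { return 1 };
    if name.is_null() || len < src.len() {
        return 1; // non-zero signals failure in this mock
    }
    // SAFETY: the caller guarantees `name` points to `len` writable bytes,
    // and we checked that `len` is large enough for `src`.
    unsafe { std::ptr::copy_nonoverlapping(src.as_ptr().cast::<c_char>(), name, src.len()) };
    0
}

// Safe wrapper: owns the buffer, checks the status, converts to `String`.
fn device_name() -> Result<String, String> {
    let mut buffer = [0u8; 64];
    // SAFETY: `buffer` is valid for writes of `buffer.len()` bytes.
    let status = unsafe { fake_get_name(buffer.as_mut_ptr().cast(), buffer.len() as c_int) };
    if status != 0 {
        return Err(format!("name query failed with status {status}"));
    }
    // Stop at the first NUL byte, then validate the bytes as UTF-8.
    let name = CStr::from_bytes_until_nul(&buffer)
        .map_err(|e| e.to_string())?
        .to_str()
        .map_err(|e| e.to_string())?;
    Ok(name.to_owned())
}

fn main() {
    println!("{:?}", device_name());
}
```

Keeping the unsafe block inside one small function means callers only ever see a Result<String, _>, which is roughly what a safe method in the crate would have to offer.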
For reference ArrayFire has a similar set of API calls, although a bit opinionated. They also appear to have chosen a buffer length of 64 bytes for the device name 👍
I suppose ArrayFire may be more similar to Candle, so let me know if I should instead raise a request there for their CUDA backend to implement something similar to the above.
With llama.cpp it outputs information to better identify my device (NVIDIA GeForce RTX 4060) and its compute capability (8.9):
Is this something the crate could provide a safe API for? Presently it looks like I'd have to delve into the unsafe API, but this is foreign to me.
The compute capability can be accessed easily enough through CudaDevice::attribute() (upstream cuDeviceGetAttribute) via the CUdevice_attribute_enum.
I assume the relevant call for the device name is this one (cuDeviceGetName)?:
It seems I should use cu_device()?
Then somehow figure out how to call sys::Lib::cuDeviceGetName() and the parameters it wants?:
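On the compute capability side, the two i32 values read through CudaDevice::attribute() could be wrapped in a small comparable type so that callers can print it or check a minimum requirement. This is only a sketch: ComputeCapability and from_attributes are hypothetical names, not part of cudarc, and the values in main are hard-coded rather than queried from a device.

```rust
use std::fmt;

// Hypothetical helper type for illustration; not part of cudarc.
// Deriving `Ord` compares `major` first, then `minor`.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct ComputeCapability {
    major: u8,
    minor: u8,
}

impl ComputeCapability {
    // Build from raw `i32` attribute values, rejecting anything
    // that does not fit in a `u8` instead of truncating with `as`.
    fn from_attributes(major: i32, minor: i32) -> Option<Self> {
        Some(Self {
            major: u8::try_from(major).ok()?,
            minor: u8::try_from(minor).ok()?,
        })
    }
}

impl fmt::Display for ComputeCapability {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}.{}", self.major, self.minor)
    }
}

fn main() {
    // Hard-coded stand-ins for the two attribute queries.
    let cc = ComputeCapability::from_attributes(8, 9).expect("values fit in u8");
    let required = ComputeCapability { major: 8, minor: 0 };
    println!("compute capability {cc}, meets {required}: {}", cc >= required);
}
```

Using u8::try_from rather than an `as` cast avoids silently wrapping an unexpected negative or oversized attribute value.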