ucsf-wynton / wyntonquery

R Package: wyntonquery - Query the UCSF Wynton Environment
https://ucsf-wynton.github.io/wyntonquery/

Gather GPU info #2

Open HenrikBengtsson opened 6 years ago

HenrikBengtsson commented 6 years ago

Background

Information on GPUs can be obtained from qconf, e.g.

$ qconf -se msg-iogpu3
hostname              msg-iogpu3
load_scaling          NONE
complex_values        mem_free=128000M
load_values           arch=lx-amd64,num_proc=32,mem_total=128739.226562M, \
[...]
                      np_load_medium=0.156875,np_load_long=0.159688, \
                      gpu.ncuda=2,gpu.ndev=2,gpu.cuda.0.mem_free=758054912, \
                      gpu.cuda.0.procs=1,gpu.cuda.0.clock=2025, \
                      gpu.cuda.0.util=57,gpu.cuda.1.mem_free=758054912, \
                      gpu.cuda.1.procs=1,gpu.cuda.1.clock=2025, \
                      gpu.cuda.1.util=54,gpu.names=GeForce GTX 1080;GeForce \
                      GTX 1080;
processors            32
[...]
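To make the structure of that output concrete, here is a minimal parsing sketch. It is written in Python purely for illustration and is not part of wyntonquery; the function names `parse_load_values()` and `gpu_summary()` are hypothetical. It assumes that wrapped lines end in a backslash, that no value contains a comma, and that `gpu.ncuda` gives the number of `gpu.cuda.<i>.*` entries, as the sample above suggests.

```python
import re

def parse_load_values(qconf_output):
    """Return the load_values entries of `qconf -se <host>` output as a dict."""
    # Undo line wrapping: a trailing backslash continues the value on the next line.
    joined = re.sub(r"[ \t]*\\\r?\n[ \t]*", " ", qconf_output)
    for line in joined.splitlines():
        key, _, value = line.partition(" ")
        if key == "load_values":
            # Assumption: items are comma-separated and values contain no commas.
            pairs = (item.strip().partition("=") for item in value.split(","))
            return {k: v for k, _, v in pairs if k}
    return {}

def gpu_summary(load_values):
    """Group the gpu.cuda.<i>.* entries per device and attach the device name."""
    # Assumption: gpu.ncuda is the number of gpu.cuda.<i> devices reported.
    ncuda = int(load_values.get("gpu.ncuda", 0))
    names = [n.strip() for n in load_values.get("gpu.names", "").split(";") if n.strip()]
    gpus = []
    for i in range(ncuda):
        prefix = f"gpu.cuda.{i}."
        info = {k[len(prefix):]: v for k, v in load_values.items() if k.startswith(prefix)}
        info["name"] = names[i] if i < len(names) else None
        gpus.append(info)
    return gpus

# Abbreviated copy of the output above (elided lines left out).
SAMPLE = r"""hostname              msg-iogpu3
load_scaling          NONE
complex_values        mem_free=128000M
load_values           arch=lx-amd64,num_proc=32,mem_total=128739.226562M, \
                      gpu.ncuda=2,gpu.ndev=2,gpu.cuda.0.mem_free=758054912, \
                      gpu.cuda.0.procs=1,gpu.cuda.0.clock=2025, \
                      gpu.cuda.0.util=57,gpu.cuda.1.mem_free=758054912, \
                      gpu.cuda.1.procs=1,gpu.cuda.1.clock=2025, \
                      gpu.cuda.1.util=54,gpu.names=GeForce GTX 1080;GeForce \
                      GTX 1080;
processors            32
"""

if __name__ == "__main__":
    for gpu in gpu_summary(parse_load_values(SAMPLE)):
        print(gpu)
```

On a node where qconf is available, the text could be obtained with something like `subprocess.run(["qconf", "-se", host], capture_output=True, text=True, check=True).stdout` and passed to `parse_load_values()`. All values are kept as strings; converting `mem_free`, `clock`, and `util` to numbers is left to the caller.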

Issue

The qconf command works only on the login nodes (this is noted on https://ucsf-hpc.github.io/wynton/scheduler/gpu.html). As a result, we cannot call qconf from R and the wyntonquery package on other nodes, which makes it much more tedious to automate the gathering of GPU info.

EDIT 2019-04-12: Just checked, qconf -se msg-iogpu3 now works on development nodes.

HenrikBengtsson commented 5 years ago

Note to self: Just re-checked, qconf -se msg-iogpu3 now works on development nodes. Dropped warning about that from https://ucsf-hpc.github.io/wynton/scheduler/gpu.html.

The next task is to gather GPU info, incorporate it into https://ucsf-hpc.github.io/wynton/assets/data/host_table.tsv, and present it on https://ucsf-hpc.github.io/wynton/about/specs.html.

HenrikBengtsson commented 5 years ago

An alternative, which only works for NVIDIA cards, is to parse:

$ cat /proc/driver/nvidia/gpus/*/information 
Model:       GeForce GTX 980 Ti
IRQ:         60
GPU UUID:    GPU-e3126112-7d81-e65c-0392-019611412abb
Video BIOS:      84.00.41.00.90
Bus Type:    PCIe
DMA Size:    40 bits
DMA Mask:    0xffffffffff
Bus Location:    0000:84:00.0
Device Minor:    0
Blacklisted:     No
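A rough sketch of how these files could be parsed, again in Python for illustration only. The helper names `parse_nvidia_information()` and `list_nvidia_gpus()` are hypothetical, and the code assumes every line has the `Key: value` layout shown above, splitting on the first colon so that values such as the bus location stay intact.

```python
import glob

def parse_nvidia_information(text):
    """Parse one 'information' file into a dict of field name -> value."""
    info = {}
    for line in text.splitlines():
        # Split on the first colon only; values like 0000:84:00.0 contain colons.
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info

def list_nvidia_gpus(pattern="/proc/driver/nvidia/gpus/*/information"):
    """Return one dict per GPU found under the NVIDIA driver's /proc tree."""
    gpus = []
    for path in sorted(glob.glob(pattern)):
        with open(path) as fh:
            gpus.append(parse_nvidia_information(fh.read()))
    return gpus

if __name__ == "__main__":
    # Prints nothing on hosts without the NVIDIA driver loaded.
    for gpu in list_nvidia_gpus():
        print(gpu.get("Model"), gpu.get("GPU UUID"), gpu.get("Bus Location"))
```

Unlike the qconf approach, this has to run on the GPU host itself, so gathering the info across the cluster would still need some way of executing it on each node.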