Ricks-Lab / benchMT

SETI multi-threaded MB/AP Benchmark Tool
GNU General Public License v3.0

Something changed in lspci and the grep is failing is my guess #5

Closed JStateson closed 4 years ago

JStateson commented 4 years ago

Ubuntu 18.04, lspci version unknown (no -version argument). Changed the grep from

lspci | grep -E \"^.*(VGA|Display).*\[AMD\/ATI\].*$\" | grep -Eo \"^([0-9a-fA-F]+:[0-9a-fA-F]+.[0-9a-fA-F])\"

to


lspci | grep -E \"^.*(VGA|Display).*$\" | grep -Eo \"^([0-9a-fA-F]+:[0-9a-fA-F]+.[0-9a-fA-F])\"

Removing the ATI and AMD part fixed the problem of the first grep feeding "null" into the second.

Ricks-Lab commented 4 years ago

@JStateson Sorry about the late response to this. I think the notification came during travels, so I missed it. I think this problem is associated with the AMD-only energy measurements. I will need some time to dig into it and implement a fix.

JStateson commented 4 years ago

When I was in undergrad (century ago?) It was fun to see who could write the shortest program to translate Morse code. I thought my 3 line program was good, but the instructor showed us his 1 line program in APL that did the trick. Unlike your grep above, his APL was understandable to me.

Ricks-Lab commented 4 years ago

When I was in undergrad (century ago?) It was fun to see who could write the shortest program to translate Morse code. I thought my 3 line program was good, but the instructor showed us his 1 line program in APL that did the trick. Unlike your grep above, his APL was understandable to me.

The grep statement does 3 things: it looks for all GPUs, then selects only AMD GPUs from those results, and then gets the PCIe ID from the final results. Maybe it would be better to do this in 3 steps. Originally, I only intended to run this when the --energy option is used, which is only applicable for AMD at this time, but then pulled it earlier in the flow to determine devmap, which maps BOINC device numbers to Linux card numbers. I will work on this with the plan of eventually including energy measurements for NVidia. Can you help to provide the output of the grep for NVidia GPUs? lspci | grep -E \"^.*(VGA|Display).*$\"
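For illustration, a minimal Python sketch of doing it in three explicit steps (a hypothetical helper, not the code on master):

import re
import subprocess

def amd_gpu_pcie_ids():
    # Sketch only: list the PCIe IDs of AMD GPUs in three explicit steps.
    lspci = subprocess.check_output('lspci', shell=True).decode().splitlines()
    # Step 1: keep display-class devices (VGA or Display controllers).
    gpus = [ln for ln in lspci if 'VGA' in ln or 'Display' in ln]
    # Step 2: keep only AMD/ATI devices (drop this filter for a vendor-agnostic list).
    amd = [ln for ln in gpus if '[AMD/ATI]' in ln]
    # Step 3: take the short PCIe ID (e.g. '01:00.0') from the start of each line.
    pcie = re.compile(r'^([0-9a-fA-F]+:[0-9a-fA-F]+\.[0-9a-fA-F])')
    return [m.group(1) for m in map(pcie.match, amd) if m]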

The code on master has already been modified to work correctly, but I still want to make the longer-term improvements.

JStateson commented 4 years ago

The following did nothing:

jstateson@h110btc:~$ lspci | grep -E \"^.*(VGA|Display).*$\"
jstateson@h110btc:~$

This is what it had to work with on H110BTC (Ubuntu 18.04) with 9 NV GPUs and 1 Intel. Note that two of the NVIDIA cards are designated as 3D controllers and not VGA.

jstateson@h110btc:~$ lspci
00:00.0 Host bridge: Intel Corporation Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor Host Bridge/DRAM Registers (rev 07)
00:01.0 PCI bridge: Intel Corporation Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor PCIe Controller (x16) (rev 07)
00:02.0 VGA compatible controller: Intel Corporation HD Graphics 530 (rev 06)
00:14.0 USB controller: Intel Corporation 100 Series/C230 Series Chipset Family USB 3.0 xHCI Controller (rev 31)
00:14.2 Signal processing controller: Intel Corporation 100 Series/C230 Series Chipset Family Thermal Subsystem (rev 31)
00:16.0 Communication controller: Intel Corporation 100 Series/C230 Series Chipset Family MEI Controller #1 (rev 31)
00:17.0 SATA controller: Intel Corporation Q170/Q150/B150/H170/H110/Z170/CM236 Chipset SATA Controller [AHCI Mode] (rev 31)
00:1c.0 PCI bridge: Intel Corporation 100 Series/C230 Series Chipset Family PCI Express Root Port #5 (rev f1)
00:1c.6 PCI bridge: Intel Corporation 100 Series/C230 Series Chipset Family PCI Express Root Port #7 (rev f1)
00:1c.7 PCI bridge: Intel Corporation 100 Series/C230 Series Chipset Family PCI Express Root Port #8 (rev f1)
00:1d.0 PCI bridge: Intel Corporation 100 Series/C230 Series Chipset Family PCI Express Root Port #9 (rev f1)
00:1d.1 PCI bridge: Intel Corporation 100 Series/C230 Series Chipset Family PCI Express Root Port #10 (rev f1)
00:1f.0 ISA bridge: Intel Corporation H110 Chipset LPC/eSPI Controller (rev 31)
00:1f.2 Memory controller: Intel Corporation 100 Series/C230 Series Chipset Family Power Management Controller (rev 31)
00:1f.3 Audio device: Intel Corporation 100 Series/C230 Series Chipset Family HD Audio Controller (rev 31)
00:1f.4 SMBus: Intel Corporation 100 Series/C230 Series Chipset Family SMBus (rev 31)
00:1f.6 Ethernet controller: Intel Corporation Ethernet Connection (2) I219-V (rev 31)
01:00.0 VGA compatible controller: NVIDIA Corporation GP106 [GeForce GTX 1060 6GB] (rev a1)
01:00.1 Audio device: NVIDIA Corporation GP106 High Definition Audio Controller (rev a1)
02:00.0 VGA compatible controller: NVIDIA Corporation GP106 [GeForce GTX 1060 3GB] (rev a1)
02:00.1 Audio device: NVIDIA Corporation GP106 High Definition Audio Controller (rev a1)
03:00.0 VGA compatible controller: NVIDIA Corporation GP106 [GeForce GTX 1060 3GB] (rev a1)
03:00.1 Audio device: NVIDIA Corporation GP106 High Definition Audio Controller (rev a1)
04:00.0 3D controller: NVIDIA Corporation GP106 [P106-100] (rev a1)
05:00.0 VGA compatible controller: NVIDIA Corporation GP104 [GeForce GTX 1070] (rev a1)
05:00.1 Audio device: NVIDIA Corporation GP104 High Definition Audio Controller (rev a1)
06:00.0 PCI bridge: ASMedia Technology Inc. Device 1187
07:01.0 PCI bridge: ASMedia Technology Inc. Device 1187
07:02.0 PCI bridge: ASMedia Technology Inc. Device 1187
07:03.0 PCI bridge: ASMedia Technology Inc. Device 1187
07:04.0 PCI bridge: ASMedia Technology Inc. Device 1187
07:05.0 PCI bridge: ASMedia Technology Inc. Device 1187
07:06.0 PCI bridge: ASMedia Technology Inc. Device 1187
07:07.0 PCI bridge: ASMedia Technology Inc. Device 1187
08:00.0 3D controller: NVIDIA Corporation GP106 [P106-090] (rev a1)
0a:00.0 VGA compatible controller: NVIDIA Corporation GP106 [GeForce GTX 1060 3GB] (rev a1)
0a:00.1 Audio device: NVIDIA Corporation GP106 High Definition Audio Controller (rev a1)
0b:00.0 VGA compatible controller: NVIDIA Corporation GP106 [GeForce GTX 1060 3GB] (rev a1)
0b:00.1 Audio device: NVIDIA Corporation GP106 High Definition Audio Controller (rev a1)
0e:00.0 VGA compatible controller: NVIDIA Corporation GP106 [GeForce GTX 1060 3GB] (rev a1)
0e:00.1 Audio device: NVIDIA Corporation GP106 High Definition Audio Controller (rev a1)

On an 18.04 system with three AMD boards, lspci showed the following:

jstateson@jysdualxeon:~$ lspci
00:00.0 Host bridge: Intel Corporation 5500 I/O Hub to ESI Port (rev 22)
00:01.0 PCI bridge: Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port 1 (rev 22)
00:03.0 PCI bridge: Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port 3 (rev 22)
00:07.0 PCI bridge: Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port 7 (rev 22)
00:09.0 PCI bridge: Intel Corporation 7500/5520/5500/X58 I/O Hub PCI Express Root Port 9 (rev 22)
00:13.0 PIC: Intel Corporation 7500/5520/5500/X58 I/O Hub I/OxAPIC Interrupt Controller (rev 22)
00:14.0 PIC: Intel Corporation 7500/5520/5500/X58 I/O Hub System Management Registers (rev 22)
00:14.1 PIC: Intel Corporation 7500/5520/5500/X58 I/O Hub GPIO and Scratch Pad Registers (rev 22)
00:14.2 PIC: Intel Corporation 7500/5520/5500/X58 I/O Hub Control Status and RAS Registers (rev 22)
00:14.3 PIC: Intel Corporation 7500/5520/5500/X58 I/O Hub Throttle Registers (rev 22)
00:16.0 System peripheral: Intel Corporation 5520/5500/X58 Chipset QuickData Technology Device (rev 22)
00:16.1 System peripheral: Intel Corporation 5520/5500/X58 Chipset QuickData Technology Device (rev 22)
00:16.2 System peripheral: Intel Corporation 5520/5500/X58 Chipset QuickData Technology Device (rev 22)
00:16.3 System peripheral: Intel Corporation 5520/5500/X58 Chipset QuickData Technology Device (rev 22)
00:16.4 System peripheral: Intel Corporation 5520/5500/X58 Chipset QuickData Technology Device (rev 22)
00:16.5 System peripheral: Intel Corporation 5520/5500/X58 Chipset QuickData Technology Device (rev 22)
00:16.6 System peripheral: Intel Corporation 5520/5500/X58 Chipset QuickData Technology Device (rev 22)
00:16.7 System peripheral: Intel Corporation 5520/5500/X58 Chipset QuickData Technology Device (rev 22)
00:1a.0 USB controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #4
00:1a.1 USB controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #5
00:1a.2 USB controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #6
00:1a.7 USB controller: Intel Corporation 82801JI (ICH10 Family) USB2 EHCI Controller #2
00:1c.0 PCI bridge: Intel Corporation 82801JI (ICH10 Family) PCI Express Root Port 1
00:1c.4 PCI bridge: Intel Corporation 82801JI (ICH10 Family) PCI Express Root Port 5
00:1c.5 PCI bridge: Intel Corporation 82801JI (ICH10 Family) PCI Express Root Port 6
00:1d.0 USB controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #1
00:1d.1 USB controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #2
00:1d.2 USB controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #3
00:1d.7 USB controller: Intel Corporation 82801JI (ICH10 Family) USB2 EHCI Controller #1
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev 90)
00:1f.0 ISA bridge: Intel Corporation 82801JIR (ICH10R) LPC Interface Controller
00:1f.2 SATA controller: Intel Corporation 82801JI (ICH10 Family) SATA AHCI Controller
00:1f.3 SMBus: Intel Corporation 82801JI (ICH10 Family) SMBus Controller
01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere [Radeon RX 470/480/570/570X/580/580X] (rev ef)
01:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere [Radeon RX 580]
03:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere [Radeon RX 470/480/570/570X/580/580X] (rev ef)
03:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere [Radeon RX 580]
04:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere [Radeon RX 470/480/570/570X/580/580X] (rev ef)
04:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere [Radeon RX 580]
06:00.0 Ethernet controller: Intel Corporation 82574L Gigabit Network Connection
07:00.0 Ethernet controller: Intel Corporation 82574L Gigabit Network Connection
08:01.0 VGA compatible controller: Matrox Electronics Systems Ltd. MGA G200eW WPCM450 (rev 0a)
fe:00.0 Host bridge: Intel Corporation Xeon 5600 Series QuickPath Architecture Generic Non-core Registers (rev 02)
fe:00.1 Host bridge: Intel Corporation Xeon 5600 Series QuickPath Architecture System Address Decoder (rev 02)
fe:02.0 Host bridge: Intel Corporation Xeon 5600 Series QPI Link 0 (rev 02)
fe:02.1 Host bridge: Intel Corporation Xeon 5600 Series QPI Physical 0 (rev 02)
fe:02.2 Host bridge: Intel Corporation Xeon 5600 Series Mirror Port Link 0 (rev 02)
fe:02.3 Host bridge: Intel Corporation Xeon 5600 Series Mirror Port Link 1 (rev 02)
fe:02.4 Host bridge: Intel Corporation Xeon 5600 Series QPI Link 1 (rev 02)
fe:02.5 Host bridge: Intel Corporation Xeon 5600 Series QPI Physical 1 (rev 02)
fe:03.0 Host bridge: Intel Corporation Xeon 5600 Series Integrated Memory Controller Registers (rev 02)
fe:03.1 Host bridge: Intel Corporation Xeon 5600 Series Integrated Memory Controller Target Address Decoder (rev 02)
fe:03.2 Host bridge: Intel Corporation Xeon 5600 Series Integrated Memory Controller RAS Registers (rev 02)
fe:03.4 Host bridge: Intel Corporation Xeon 5600 Series Integrated Memory Controller Test Registers (rev 02)
fe:04.0 Host bridge: Intel Corporation Xeon 5600 Series Integrated Memory Controller Channel 0 Control (rev 02)
fe:04.1 Host bridge: Intel Corporation Xeon 5600 Series Integrated Memory Controller Channel 0 Address (rev 02)
fe:04.2 Host bridge: Intel Corporation Xeon 5600 Series Integrated Memory Controller Channel 0 Rank (rev 02)
fe:04.3 Host bridge: Intel Corporation Xeon 5600 Series Integrated Memory Controller Channel 0 Thermal Control (rev 02)
fe:05.0 Host bridge: Intel Corporation Xeon 5600 Series Integrated Memory Controller Channel 1 Control (rev 02)
fe:05.1 Host bridge: Intel Corporation Xeon 5600 Series Integrated Memory Controller Channel 1 Address (rev 02)
fe:05.2 Host bridge: Intel Corporation Xeon 5600 Series Integrated Memory Controller Channel 1 Rank (rev 02)
fe:05.3 Host bridge: Intel Corporation Xeon 5600 Series Integrated Memory Controller Channel 1 Thermal Control (rev 02)
fe:06.0 Host bridge: Intel Corporation Xeon 5600 Series Integrated Memory Controller Channel 2 Control (rev 02)
fe:06.1 Host bridge: Intel Corporation Xeon 5600 Series Integrated Memory Controller Channel 2 Address (rev 02)
fe:06.2 Host bridge: Intel Corporation Xeon 5600 Series Integrated Memory Controller Channel 2 Rank (rev 02)
fe:06.3 Host bridge: Intel Corporation Xeon 5600 Series Integrated Memory Controller Channel 2 Thermal Control (rev 02)
ff:00.0 Host bridge: Intel Corporation Xeon 5600 Series QuickPath Architecture Generic Non-core Registers (rev 02)
ff:00.1 Host bridge: Intel Corporation Xeon 5600 Series QuickPath Architecture System Address Decoder (rev 02)
ff:02.0 Host bridge: Intel Corporation Xeon 5600 Series QPI Link 0 (rev 02)
ff:02.1 Host bridge: Intel Corporation Xeon 5600 Series QPI Physical 0 (rev 02)
ff:02.2 Host bridge: Intel Corporation Xeon 5600 Series Mirror Port Link 0 (rev 02)
ff:02.3 Host bridge: Intel Corporation Xeon 5600 Series Mirror Port Link 1 (rev 02)
ff:02.4 Host bridge: Intel Corporation Xeon 5600 Series QPI Link 1 (rev 02)
ff:02.5 Host bridge: Intel Corporation Xeon 5600 Series QPI Physical 1 (rev 02)
ff:03.0 Host bridge: Intel Corporation Xeon 5600 Series Integrated Memory Controller Registers (rev 02)
ff:03.1 Host bridge: Intel Corporation Xeon 5600 Series Integrated Memory Controller Target Address Decoder (rev 02)
ff:03.2 Host bridge: Intel Corporation Xeon 5600 Series Integrated Memory Controller RAS Registers (rev 02)
ff:03.4 Host bridge: Intel Corporation Xeon 5600 Series Integrated Memory Controller Test Registers (rev 02)
ff:04.0 Host bridge: Intel Corporation Xeon 5600 Series Integrated Memory Controller Channel 0 Control (rev 02)
ff:04.1 Host bridge: Intel Corporation Xeon 5600 Series Integrated Memory Controller Channel 0 Address (rev 02)
ff:04.2 Host bridge: Intel Corporation Xeon 5600 Series Integrated Memory Controller Channel 0 Rank (rev 02)
ff:04.3 Host bridge: Intel Corporation Xeon 5600 Series Integrated Memory Controller Channel 0 Thermal Control (rev 02)
ff:05.0 Host bridge: Intel Corporation Xeon 5600 Series Integrated Memory Controller Channel 1 Control (rev 02)
ff:05.1 Host bridge: Intel Corporation Xeon 5600 Series Integrated Memory Controller Channel 1 Address (rev 02)
ff:05.2 Host bridge: Intel Corporation Xeon 5600 Series Integrated Memory Controller Channel 1 Rank (rev 02)
ff:05.3 Host bridge: Intel Corporation Xeon 5600 Series Integrated Memory Controller Channel 1 Thermal Control (rev 02)
ff:06.0 Host bridge: Intel Corporation Xeon 5600 Series Integrated Memory Controller Channel 2 Control (rev 02)
ff:06.1 Host bridge: Intel Corporation Xeon 5600 Series Integrated Memory Controller Channel 2 Address (rev 02)
ff:06.2 Host bridge: Intel Corporation Xeon 5600 Series Integrated Memory Controller Channel 2 Rank (rev 02)
ff:06.3 Host bridge: Intel Corporation Xeon 5600 Series Integrated Memory Controller Channel 2 Thermal Control (rev 02)
Ricks-Lab commented 4 years ago

@JStateson Thanks for providing the details! This will help in developing support for HW that I don't have. I see you have some experience with BOINC, so I have a question. It would be very useful if I could associate a BOINC device number with a Linux card number, but I have not been able to find details on how BOINC assigns the number. So far, from observation, it appears to be the reverse of the Linux card numbers, but my few systems are not enough to know this for sure. Let me know if you have insight into this; otherwise, maybe I can produce output that can be validated on your systems.

JStateson commented 4 years ago

I have spent some time looking at this myself and have not figured it out.

The problem I was trying to solve was which board was causing a computation failure when there are multiple identical GPUs. The solution I had been using was to manually stop the GPU fan from moving and see which board showed a temperature increase. This is obviously less than ideal, but it does work in both Windows and Linux using BoincTasks' capability of displaying temperatures. I would like to do this programmatically.

What I have learned: BOINC does not ask the device manager (Windows) or the kernel (Linux) to enumerate the video boards. Instead, BOINC runs a GPU detect app that uses CUDA (NVIDIA only) and OpenCL (NVIDIA, ATI, Intel) to interrogate the boards and write out what is found. The file is named coproc_info.xml and it is read back in by BOINC to "see what is there". BOINC sorts what is reported in descending order based on FLOPS, such that d0 is the best board and d1, d2, etc. are in decreasing FLOPS. This method is generally correct and, more often than not, d0 is the best board.

NVIDIA's CUDA reports bus IDs of 1..6 for 6 boards, but their OpenCL package shows an "opencl_driver_index" of 0..5.

ATI shows opencl_driver_index also starting at 0

The module that reads in the info and does the sorting does not have an entry in the C++ structure such as "bus id" or "driver index", nor even the name of the board such as gtx1660ti. All that is lost once gpu_detect returns.

Compounding the problem is the numbering of the boards by the NVIDIA driver. nvidia-smi lists boards using 0..5 (for 6 boards). Note that GPU #1 is the GTX 1660 below:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.44       Driver Version: 440.44       CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 1070    Off  | 00000000:01:00.0 Off |                  N/A |
| 99%   42C    P2    97W / 151W |   1483MiB /  8116MiB |     87%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX 166...  Off  | 00000000:02:00.0 Off |                  N/A |
|100%   58C    P2    96W / 120W |   1332MiB /  5944MiB |     91%      Default |
+-------------------------------+----------------------+----------------------+

but coproc_info.xml shows a bus id of "2":

<coproc_cuda>
   <count>1</count>
   <name>GeForce GTX 1660 Ti</name>
 ...
<pci_info>
   <bus_id>2</bus_id>
   <device_id>0</device_id>
   <domain_id>0</domain_id>
</pci_info>

but also a device_num of "0":

   <nvidia_opencl>
      <name>GeForce GTX 1660 Ti</name>
      <vendor>NVIDIA Corporation</vendor>

      <device_num>0</device_num>
      <peak_flops>5529600000000.000000</peak_flops>
      <opencl_available_ram>4164943872.000000</opencl_available_ram>
      <opencl_device_index>0</opencl_device_index>
      <warn_bad_cuda>0</warn_bad_cuda>
   </nvidia_opencl>

The value of 0 for the OpenCL ID does not correspond to the nvidia-smi table, but the bus id of 2 does seem to match that table. Unfortunately, this fails when multiplexing a slot. If a 4-in-1 riser is used in a slot, the numbers are no longer 1..6; instead there is a jump to (for example) 12 and a renumbering of the boards that are "after" the slot the 4-in-1 riser was in.

The net effect of all this is that the GPUs (d0, d1, d2, etc.) associated with work units cannot easily be matched back to the board, or the slot the board is in.
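For illustration, a minimal Python sketch of dumping the fields that might help correlate devices from coproc_info.xml (the file path and tag names are taken from the snippets in this thread, so treat it as a guess, not BOINC code):

import xml.etree.ElementTree as ET

root = ET.parse('coproc_info.xml').getroot()   # path is an assumption

# CUDA section: one <pci_info> per NVIDIA GPU with the bus id CUDA reported.
for pci in root.iter('pci_info'):
    print('cuda bus_id:', pci.findtext('bus_id'), 'device_id:', pci.findtext('device_id'))

# OpenCL sections: device_num / opencl_device_index as assigned by the OpenCL runtime.
for tag in ('nvidia_opencl', 'ati_opencl'):
    for dev in root.iter(tag):
        print(tag, dev.findtext('name'),
              'device_num:', dev.findtext('device_num'),
              'opencl_device_index:', dev.findtext('opencl_device_index'))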

Some ideas I was looking at:

  1. Command the client to upload the coproc_info.xml file for analysis
  2. Mod the client to store bus id, driver index and board name in a structure for access
  3. send the bus id of a failing board to the manager so it can remove that board from the pool of available boards, i.e. when the driver signals "Unable to determine the device handle for GPU 0000:01:00.0: GPU is lost. Reboot the system to recover this GPU", the value of 0000:01:00.0 can be translated somehow to "d3" and "d3" will no longer be assigned tasks. I brought this up as a talking point here https://forum.efmer.com/index.php?topic=1394.msg8047#msg8047 and at the bottom of https://github.com/BOINC/boinc/issues/2993

EDIT-url to efmer was corrected

JStateson commented 4 years ago

There does not appear to be any unique identifying serial number on any NVidia board. I recall reading somewhere that the manufacturers decided long ago not to put a serial number in, as that might be used to prevent software from working on a replacement board unless a "fee" was paid to the SW developer. I had the idea of re-flashing the BIOS and incrementing the "date" so as to be able to identify which board had a problem. Another thought was to run a small "performance" test under direction of the gpu_detect module and have a program on the Linux or Windows remote system determine which one of the 6, 8 or 19 GPUs was the one running the test.

Ricks-Lab commented 4 years ago

For now, I am rewriting the part of the code the generates the list of GPUs. Originally, I used lshw but later I added the capability to estimate energy used, so I had to take parts from my amdgpu-utils and use lspci and driver files, but this was just added on to what was already there. I plan to make the new GPU list the core of how benchMT uses GPU compute resources. This should make further improvements much easier.

To make the association in the past, I have used the benchMT command line option to run on a specific device and use amdgpu-utils to see which card number has loading and build a devmap which I stored in the benchCFG file. I hope a more generic implementation would allow it to work with other than AMD GPUs. It will take some time for the rewrite...

Ricks-Lab commented 4 years ago

@JStateson Since I only have AMD GPUs, can you help to collect output of lspci for nvidia and intel GPUs? lspci -k -s 43:00.0 where 43:00.0 is replaced by the pcie id for your cards?

KeithMyers commented 4 years ago

There does not appear to be any unique identifying serial number on any NVidia board. I recall reading somewhere that the manufacturers decided long ago not to put a serial number in, as that might be used to prevent software from working on a replacement board unless a "fee" was paid to the SW developer. I had the idea of re-flashing the BIOS and incrementing the "date" so as to be able to identify which board had a problem. Another thought was to run a small "performance" test under direction of the gpu_detect module and have a program on the Linux or Windows remote system determine which one of the 6, 8 or 19 GPUs was the one running the test.

As far as I know . . . . EVERY Nvidia card gets an ID. EVGA has serial number stickers on the back of every card, for example. Also, every GPU in the system gets a unique GPU UUID that is a 32-character hexadecimal number.

Ricks-Lab commented 4 years ago

There does not appear to be any unique identifying serial number on any NVidia board. I recall reading somewhere that the manufacturers decided long ago not to put a serial number in, as that might be used to prevent software from working on a replacement board unless a "fee" was paid to the SW developer. I had the idea of re-flashing the BIOS and incrementing the "date" so as to be able to identify which board had a problem. Another thought was to run a small "performance" test under direction of the gpu_detect module and have a program on the Linux or Windows remote system determine which one of the 6, 8 or 19 GPUs was the one running the test.

As far as I know . . . . EVERY Nvidia card gets an ID. EVGA has serial number stickers on the back of every card, for example. Also, every GPU in the system gets a unique GPU UUID that is a 32-character hexadecimal number.

I know for AMD there is a unique_id device file that returns a hex number, but I don't think it is useful in mapping between BOINC device number and Linux card number.

Ricks-Lab commented 4 years ago

@JStateson @KeithMyers I have just posted a test version of benchMT on master. It is not completely functional and will just display a list of GPUs with some device details. Can you run and post results here?

Also, I am trying to figure out the hwmon file that can be used to read current power. Maybe the name of a file in the card's hwmon directory will be obvious. If so, please cat and send me details.

Thanks!

JStateson commented 4 years ago

@JStateson Since I only have AMD GPUs, can you help to collect output of lspci for nvidia and intel GPUs? lspci -k -s 43:00.0 where 43:00.0 is replaced by the pcie id for your cards?

jstateson@jysdualxeon:~$ lspci -k | grep VGA
01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere [Radeon RX 470/480/570/570X/580/580X] (rev ef)
03:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere [Radeon RX 470/480/570/570X/580/580X] (rev ef)
04:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere [Radeon RX 470/480/570/570X/580/580X] (rev ef)
08:01.0 VGA compatible controller: Matrox Electronics Systems Ltd. MGA G200eW WPCM450 (rev 0a)

jstateson@tb85-nvidia:~$ lspci -k | grep VGA
01:00.0 VGA compatible controller: NVIDIA Corporation GP104 [GeForce GTX 1070] (rev a1)
02:00.0 VGA compatible controller: NVIDIA Corporation Device 2182 (rev a1)
06:00.0 VGA compatible controller: NVIDIA Corporation GP104 [GeForce GTX 1070 Ti] (rev a1)

jstateson@tb85-nvidia:~$ lspci -k | grep 3D
03:00.0 3D controller: NVIDIA Corporation GP102 [P102-100] (rev a1)
04:00.0 3D controller: NVIDIA Corporation GP102 [P102-100] (rev a1)
05:00.0 3D controller: NVIDIA Corporation GP102 [P102-100] (rev a1)

jstateson@h110btc:~$ lspci -k | grep VGA
00:02.0 VGA compatible controller: Intel Corporation HD Graphics 530 (rev 06)
01:00.0 VGA compatible controller: NVIDIA Corporation GP106 [GeForce GTX 1060 6GB] (rev a1)
02:00.0 VGA compatible controller: NVIDIA Corporation GP106 [GeForce GTX 1060 3GB] (rev a1)
03:00.0 VGA compatible controller: NVIDIA Corporation GP106 [GeForce GTX 1060 3GB] (rev a1)
05:00.0 VGA compatible controller: NVIDIA Corporation GP104 [GeForce GTX 1070] (rev a1)
0a:00.0 VGA compatible controller: NVIDIA Corporation GP106 [GeForce GTX 1060 3GB] (rev a1)
0b:00.0 VGA compatible controller: NVIDIA Corporation GP106 [GeForce GTX 1060 3GB] (rev a1)
0e:00.0 VGA compatible controller: NVIDIA Corporation GP106 [GeForce GTX 1060 3GB] (rev a1)

jstateson@h110btc:~$ lspci -k | grep 3D
04:00.0 3D controller: NVIDIA Corporation GP106 [P106-100] (rev a1)
08:00.0 3D controller: NVIDIA Corporation GP106 [P106-090] (rev a1)

Will add the -s later for you.

JStateson commented 4 years ago
jstateson@h110btc:~$ lspci -k -s 00:02.0
00:02.0 VGA compatible controller: Intel Corporation HD Graphics 530 (rev 06)
        Subsystem: ASRock Incorporation HD Graphics 530
        Kernel driver in use: i915
        Kernel modules: i915
jstateson@h110btc:~$ lspci -k -s 01:00.0
01:00.0 VGA compatible controller: NVIDIA Corporation GP106 [GeForce GTX 1060 6GB] (rev a1)
        Subsystem: eVga.com. Corp. GP106 [GeForce GTX 1060 6GB]
        Kernel driver in use: nvidia
        Kernel modules: nvidiafb, nouveau, nvidia_drm, nvidia
jstateson@h110btc:~$ lspci -k -s 02:00.0
02:00.0 VGA compatible controller: NVIDIA Corporation GP106 [GeForce GTX 1060 3GB] (rev a1)
        Subsystem: Device 196e:11da
        Kernel driver in use: nvidia
        Kernel modules: nvidiafb, nouveau, nvidia_drm, nvidia
jstateson@h110btc:~$ lspci -k -s 03:00.0
03:00.0 VGA compatible controller: NVIDIA Corporation GP106 [GeForce GTX 1060 3GB] (rev a1)
        Subsystem: eVga.com. Corp. GP106 [GeForce GTX 1060 3GB]
        Kernel driver in use: nvidia
        Kernel modules: nvidiafb, nouveau, nvidia_drm, nvidia
jstateson@h110btc:~$ lspci -k -s 05:00.0
05:00.0 VGA compatible controller: NVIDIA Corporation GP104 [GeForce GTX 1070] (rev a1)
        Subsystem: ZOTAC International (MCO) Ltd. GP104 [GeForce GTX 1070]
        Kernel driver in use: nvidia
        Kernel modules: nvidiafb, nouveau, nvidia_drm, nvidia
jstateson@h110btc:~$ lspci -k -s 0a:00.0
0a:00.0 VGA compatible controller: NVIDIA Corporation GP106 [GeForce GTX 1060 3GB] (rev a1)
        Subsystem: eVga.com. Corp. GP106 [GeForce GTX 1060 3GB]
        Kernel driver in use: nvidia
        Kernel modules: nvidiafb, nouveau, nvidia_drm, nvidia
jstateson@h110btc:~$ lspci -k -s 0b:00.0
0b:00.0 VGA compatible controller: NVIDIA Corporation GP106 [GeForce GTX 1060 3GB] (rev a1)
        Subsystem: Device 196e:11da
        Kernel driver in use: nvidia
        Kernel modules: nvidiafb, nouveau, nvidia_drm, nvidia
jstateson@h110btc:~$ lspci -k -s 0e:00.0
0e:00.0 VGA compatible controller: NVIDIA Corporation GP106 [GeForce GTX 1060 3GB] (rev a1)
        Subsystem: eVga.com. Corp. GP106 [GeForce GTX 1060 3GB]
        Kernel driver in use: nvidia
        Kernel modules: nvidiafb, nouveau, nvidia_drm, nvidia
jstateson@h110btc:~$ lspci -k -s 04:00.0
04:00.0 3D controller: NVIDIA Corporation GP106 [P106-100] (rev a1)
        Subsystem: ZOTAC International (MCO) Ltd. GP106 [P106-100]
        Kernel driver in use: nvidia
        Kernel modules: nvidiafb, nouveau, nvidia_drm, nvidia
jstateson@h110btc:~$ lspci -k -s 08:00.0
08:00.0 3D controller: NVIDIA Corporation GP106 [P106-090] (rev a1)
        Subsystem: ZOTAC International (MCO) Ltd. GP106 [P106-090]
        Kernel driver in use: nvidia
        Kernel modules: nvidiafb, nouveau, nvidia_drm, nvidia
JStateson commented 4 years ago
jstateson@jysdualxeon:~$ lspci -k -s 01:00.0
01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere [Radeon RX 470/480/570/570X/580/580X] (rev ef)
        Subsystem: Gigabyte Technology Co., Ltd Radeon RX 570 Gaming 4G
        Kernel driver in use: amdgpu
        Kernel modules: amdgpu
jstateson@jysdualxeon:~$ lspci -k -s 03:00.0
03:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere [Radeon RX 470/480/570/570X/580/580X] (rev ef)
        Subsystem: Gigabyte Technology Co., Ltd Radeon RX 570 Gaming 4G
        Kernel driver in use: amdgpu
        Kernel modules: amdgpu
jstateson@jysdualxeon:~$ lspci -k -s 04:00.0
04:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere [Radeon RX 470/480/570/570X/580/580X] (rev ef)
        Subsystem: Gigabyte Technology Co., Ltd Radeon RX 570 Gaming 4G
        Kernel driver in use: amdgpu
        Kernel modules: amdgpu
jstateson@jysdualxeon:~$ lspci -k -s 08:01.0
08:01.0 VGA compatible controller: Matrox Electronics Systems Ltd. MGA G200eW WPCM450 (rev 0a)
        Subsystem: Super Micro Computer Inc MGA G200eW WPCM450
        Kernel driver in use: mgag200
        Kernel modules: mgag200
jstateson@jysdualxeon:~$
Ricks-Lab commented 4 years ago

Is the last one on the list a server vga card with no compute capabilities? 08:01.0 VGA compatible controller: Matrox Electronics Systems Ltd. MGA G200eW WPCM450 (rev 0a)

Ricks-Lab commented 4 years ago

It looks like my assumptions are correct. It would be great if you could run the latest test version of benchMT on master. It will exit after displaying GPU information.

JStateson commented 4 years ago

Yes, that was built in; no compute, unlike the Intel 530.

JStateson commented 4 years ago

Will do later, thanks!

JStateson commented 4 years ago

@JStateson @KeithMyers I have just posted a test version of benchMT on master. It is not completely functional and will just display a list of GPUs with some device details. Can you run and post results here? Also, I am trying to figure out the hwmon file that can be used to read current power. Maybe the name of a file in the card's hwmon directory will be obvious. If so, please cat and send me details. Thanks!

https://stateson.net/images/h110btc_benchMT.txt

https://stateson.net/images/tb85_benchMT.txt

https://stateson.net/images/dualxeon_benchMT.txt

The only obvious hits were in the system with the AMD cards

root@jysdualxeon:/# find . -name "hwmon"
./sys/kernel/debug/tracing/events/hwmon
./sys/class/hwmon
./sys/devices/platform/coretemp.1/hwmon
./sys/devices/platform/coretemp.0/hwmon
./sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/hwmon
./sys/devices/pci0000:00/0000:00:14.3/hwmon
./sys/devices/pci0000:00/0000:00:07.0/0000:03:00.0/hwmon
./sys/devices/pci0000:00/0000:00:09.0/0000:04:00.0/hwmon

Took a while, but I navigated to the first AMD card and got a directory listing:

jstateson@jysdualxeon:/sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/hwmon/hwmon3$ ls -l
total 0
lrwxrwxrwx 1 root root    0 Jan 19 12:21 device -> ../../../0000:01:00.0
-rw-r--r-- 1 root root 4096 Jan 19 13:08 fan1_enable
-r--r--r-- 1 root root 4096 Jan 19 12:21 fan1_input
-r--r--r-- 1 root root 4096 Jan 19 12:21 fan1_max
-r--r--r-- 1 root root 4096 Jan 19 12:21 fan1_min
-rw-r--r-- 1 root root 4096 Jan 19 13:08 fan1_target
-r--r--r-- 1 root root 4096 Jan 19 13:08 freq1_input
-r--r--r-- 1 root root 4096 Jan 19 13:08 freq1_label
-r--r--r-- 1 root root 4096 Jan 19 13:08 freq2_input
-r--r--r-- 1 root root 4096 Jan 19 13:08 freq2_label
-r--r--r-- 1 root root 4096 Jan 19 12:21 in0_input
-r--r--r-- 1 root root 4096 Jan 19 12:21 in0_label
-r--r--r-- 1 root root 4096 Jan 19 12:21 name
drwxr-xr-x 2 root root    0 Jan 19 12:53 power
-r--r--r-- 1 root root 4096 Jan 19 12:21 power1_average
-rw-r--r-- 1 root root 4096 Jan 19 12:21 power1_cap
-r--r--r-- 1 root root 4096 Jan 19 13:08 power1_cap_max
-r--r--r-- 1 root root 4096 Jan 19 13:08 power1_cap_min
-rw-r--r-- 1 root root 4096 Jan 19 13:08 pwm1
-rw-r--r-- 1 root root 4096 Jan 19 13:08 pwm1_enable
-r--r--r-- 1 root root 4096 Jan 19 13:08 pwm1_max
-r--r--r-- 1 root root 4096 Jan 19 13:08 pwm1_min
lrwxrwxrwx 1 root root    0 Jan 19 12:21 subsystem -> ../../../../../../class/hwmon
-r--r--r-- 1 root root 4096 Jan 19 12:21 temp1_crit
-r--r--r-- 1 root root 4096 Jan 19 12:21 temp1_crit_hyst
-r--r--r-- 1 root root 4096 Jan 19 12:21 temp1_input
-r--r--r-- 1 root root 4096 Jan 19 12:21 temp1_label
-rw-r--r-- 1 root root 4096 Jan 19 12:20 uevent
jstateson@jysdualxeon:/sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/hwmon/hwmon3$ cat name
amdgpu
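For reference, the hwmon convention is that power1_average reports microwatts, so a quick Python sketch to read it (the hwmon3 path here is specific to this system) would be:

# Sketch: read the current GPU power draw from an amdgpu hwmon node.
hwmon = '/sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/hwmon/hwmon3'
with open(hwmon + '/power1_average') as f:
    microwatts = int(f.read().strip())
print('power: %.1f W' % (microwatts / 1e6))   # hwmon power files are in microwatts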

Did not find anything similar for the NVidia systems:

root@h110btc:/# find . -name "hwmon"
./usr/src/linux-headers-5.0.0-36/drivers/hwmon
./usr/src/linux-headers-5.3.0-26-generic/include/config/hwmon
./usr/src/linux-headers-5.0.0-36-generic/include/config/hwmon
./usr/src/linux-headers-5.0.0-37/drivers/hwmon
./usr/src/linux-headers-5.0.0-37-generic/include/config/hwmon
./usr/src/linux-headers-5.3.0-26/drivers/hwmon
find: ‘./proc/1795/task/1795/net’: Invalid argument
find: ‘./proc/1795/net’: Invalid argument
./lib/modules/5.3.0-26-generic/kernel/drivers/hwmon
./lib/modules/5.0.0-37-generic/kernel/drivers/hwmon
find: ‘./run/user/1000/gvfs’: Permission denied
./sys/kernel/debug/tracing/events/hwmon
./sys/class/hwmon
./sys/devices/platform/coretemp.0/hwmon

Poking around on both NVidia systems shows only core (CPU) info in any hwmon folder. Maybe something needs to be installed? Unlike the dual Xeon, both of the NVidia systems are missing the Intel CPU frequency settings:

jstateson@tb85-nvidia:~/temp$ sudo ./chg_intel_freq.sh
have to enter a frequency. Available frequencies are:
cat: /sys/devices/system/cpu/cpu0/cpufreq/scaling_available_frequencies: No such file or directory

that file is missing

The Xeon shows the following:
jstateson@jysdualxeon:~/bt_bin$ sudo ./chg_freq.sh
have to enter a frequency. Available frequencies are:
3068000 3067000 2933000 2800000 2667000 2533000 2400000 2267000 2133000 2000000 1867000 1733000 1600000

So I suspect something was not installed on the NVidia systems, but I don't know what it was. I needed to be able to step the frequency down on the Xeon, as even with water cooling it overheated during the summer.

KeithMyers commented 4 years ago

Don't really see anything useful on my host with Nvidia cards.

Set specified gpu_devices: [0, 1, 2]
GPU_ITEM: uuid: a3096ba3d91646389037d9478b791f70
    pcie_id: 08:00.0
    model: NVIDIA Corporation TU104 [GeForce RTX 2080 Rev. A] (rev a1)
    vendor: NVIDIA
    driver: nvidiafb, nouveau, nvidia_drm, nvidia
    card number: 0
    BOINC Device number: -1
    card path: /sys/class/drm/card0/device
    hwmon path: None
    Compute compatible: True
    Energy compatible: False
GPU_ITEM: uuid: eea5ffb9069e48859f4a81ba5ed9f302
    pcie_id: 0a:00.0
    model: NVIDIA Corporation TU104 [GeForce RTX 2080 Rev. A] (rev a1)
    vendor: NVIDIA
    driver: nvidiafb, nouveau, nvidia_drm, nvidia
    card number: 1
    BOINC Device number: -1
    card path: /sys/class/drm/card1/device
    hwmon path: None
    Compute compatible: True
    Energy compatible: False
GPU_ITEM: uuid: 6e3be3488c1b44ae95c49e7ccb8e629e
    pcie_id: 0b:00.0
    model: NVIDIA Corporation GP104 [GeForce GTX 1080] (rev a1)
    vendor: NVIDIA
    driver: nvidiafb, nouveau, nvidia_drm, nvidia
    card number: 2
    BOINC Device number: -1
    card path: /sys/class/drm/card2/device
    hwmon path: None
    Compute compatible: True
    Energy compatible: False

JStateson commented 4 years ago

Keith: how did you get that info?

Ricks-Lab commented 4 years ago

@JStateson @KeithMyers Thanks for posting all of the details. This makes things much clearer for me. Seems like the NVIDIA implementation is so much different. I wonder if this is why Torvalds complains about them! But at least I now have an easy way to find out if a GPU can support energy measurements. Perhaps there is another way, like nvidia-smi. Do you know if there is a command line argument to give power for a given pcie_id or card number?

I have implemented a --lsgpu option for the benchMT currently on master. This will just display the GPU details and exit. It requires that clinfo is installed to get full details. It would be interesting to see your results posted. Do either of you know if OpenCL exists in parallel with CUDA, or are they installed separately?

I think the next step is to see if there is a predictable association between BOINC device number and Linux card number. I have manually mapped them by running benchMT with a specified device and monitoring the cards with another app. If you have some time to do this, please let me know your results.

JStateson commented 4 years ago

From clinfo: https://stateson.net/images/tb85_clinfo.txt

jstateson@tb85-nvidia:~/Projects/benchMT$ ./benchMT --lsgpu
{}
GPU_ITEM: uuid: 1b1fc994708748ba88a84d220199064e
      pcie_id: 01:00.0
      model: NVIDIA Corporation GP104 [GeForce GTX 1070] (rev a1)
      vendor: NVIDIA
      driver: nvidiafb, nouveau, nvidia_drm, nvidia
      openCL Device: None
      openCL Version: None
      card number: 0
      BOINC Device number: -1
      card path: /sys/class/drm/card0/device
      hwmon path: None
      Compute compatible: True
      Energy compatible: False
GPU_ITEM: uuid: 86a5915084774475b0073277c21d4e71
      pcie_id: 02:00.0
      model: NVIDIA Corporation Device 2182 (rev a1)
      vendor: NVIDIA
      driver: nvidiafb, nouveau, nvidia_drm, nvidia
      openCL Device: None
      openCL Version: None
      card number: 1
      BOINC Device number: -1
      card path: /sys/class/drm/card1/device
      hwmon path: None
      Compute compatible: True
      Energy compatible: False
GPU_ITEM: uuid: 7cf82b55d9194805ba0b469cf7dbe7de
      pcie_id: 03:00.0
      model: NVIDIA Corporation GP102 [P102-100] (rev a1)
      vendor: NVIDIA
      driver: nvidiafb, nouveau, nvidia_drm, nvidia
      openCL Device: None
      openCL Version: None
      card number: 2
      BOINC Device number: -1
      card path: /sys/class/drm/card2/device
      hwmon path: None
      Compute compatible: True
      Energy compatible: False
GPU_ITEM: uuid: 8ed4e584a7af4d3fba60024e64c23307
      pcie_id: 04:00.0
      model: NVIDIA Corporation GP102 [P102-100] (rev a1)
      vendor: NVIDIA
      driver: nvidiafb, nouveau, nvidia_drm, nvidia
      openCL Device: None
      openCL Version: None
      card number: 3
      BOINC Device number: -1
      card path: /sys/class/drm/card3/device
      hwmon path: None
      Compute compatible: True
      Energy compatible: False
GPU_ITEM: uuid: fbc482e8faa543f18bf9b491887f329a
      pcie_id: 05:00.0
      model: NVIDIA Corporation GP102 [P102-100] (rev a1)
      vendor: NVIDIA
      driver: nvidiafb, nouveau, nvidia_drm, nvidia
      openCL Device: None
      openCL Version: None
      card number: 4
      BOINC Device number: -1
      card path: /sys/class/drm/card4/device
      hwmon path: None
      Compute compatible: True
      Energy compatible: False
GPU_ITEM: uuid: 9f392b9710b34a11bd45e8654ad3dc4a
      pcie_id: 06:00.0
      model: NVIDIA Corporation GP104 [GeForce GTX 1070 Ti] (rev a1)
      vendor: NVIDIA
      driver: nvidiafb, nouveau, nvidia_drm, nvidia
      openCL Device: None
      openCL Version: None
      card number: 5
      BOINC Device number: -1
      card path: /sys/class/drm/card5/device
      hwmon path: None
      Compute compatible: True
      Energy compatible: False
KeithMyers commented 4 years ago

You can use this nvidia-smi command for polling power usage on a card:

nvidia-smi stats -i <device#> -d pwrDraw

And this is the output for my device 0:

0, pwrDraw , 1579539474276611, 175

The large number is a timestamp and the 175 is the wattage.

Or for a snapshot of a single card:

nvidia-smi -i 0 --query-gpu=power.draw --format=csv
power.draw [W]
207.88 W

If you want all the cards at once:

nvidia-smi --query-gpu=power.draw --format=csv
power.draw [W]
205.83 W
210.04 W
57.35 W

Both the CUDA and OpenCL APIs are included in the standard Nvidia drivers. Sometimes the OpenCL API is dropped from packages, but it can always be installed separately if needed:

sudo apt-get install ocl-icd-libopencl1
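A minimal Python sketch of reading that per card via subprocess (assumes nvidia-smi is on the PATH; as far as I know -i also accepts a GPU UUID or PCI bus ID instead of the index):

import subprocess

def gpu_power_watts(gpu):
    # Sketch: query the instantaneous power draw of one GPU via nvidia-smi.
    cmd = 'nvidia-smi -i {} --query-gpu=power.draw --format=csv,noheader,nounits'.format(gpu)
    return float(subprocess.check_output(cmd, shell=True).decode().strip())

print(gpu_power_watts(0))   # e.g. 207.88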

KeithMyers commented 4 years ago

Keith: how did you get that info?

That was what comes up when I ran the test benchMT in the Terminal.

Ricks-Lab commented 4 years ago

@JStateson The empty braces at the top of your last posted output indicate that OpenCL is not installed. This means checking for OpenCL to judge compute capability is not going to work. I probably need to find another way to detect CUDA capability.

@KeithMyers @JStateson I have attempted an implementation of energy metrics for Nvidia. Can you run 'benchMT --lsgpu' and post your output here? I need to learn to interpret the output first. I have also included a power read of a bad card number, meant to be an error, so I know how to manage it.

Thanks!

KeithMyers commented 4 years ago

Something isn't correct in the code.

 ./benchMT --lsgpu
{}
nsmi_items: [['42.92', '']]
Traceback (most recent call last):
  File "./benchMT", line 2516, in <module>
    main()
  File "./benchMT", line 2166, in main
    gpu_list.set_gpu_list()
  File "./benchMT", line 360, in set_gpu_list
    return self.set_lspci_gpu_list()
  File "./benchMT", line 587, in set_lspci_gpu_list
    mb_const.cmd_nvidia_smi, '9'), shell=True).decode().split('\n')
  File "/usr/lib/python3.6/subprocess.py", line 356, in check_output
    **kwargs).stdout
  File "/usr/lib/python3.6/subprocess.py", line 438, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '/usr/bin/nvidia-smi -i 9 --query-gpu=power.draw --format=csv,noheader,nounits' returned non-zero exit status 6.
KeithMyers commented 4 years ago

I see you hard-coded for GPU #9. I don't have nine GPUs, only 3 on this test machine. Oh, read the rest of your post. Intended. Never mind.

Ricks-Lab commented 4 years ago

Thanks Keith! This is exactly what I need. I should be able to complete the implementation tomorrow. Trying to get a beta release out there before the Lunar New Year holidays.

Ricks-Lab commented 4 years ago

@JStateson @KeithMyers I have posted a fully functional, but untested for Nvidia, version of benchMT. Can you try it with the ‘--lsgpu’ option?

Next step is to test it running benchmarks, but a devmap must be specified on the command line, since I don’t know how to determine the association of BOINC device and card number yet. On all of my systems, the mapping is always in reversed order. It would be good to get the mapping on other systems to see if this correlation holds up.

KeithMyers commented 4 years ago

The output of lsgpu looks correct as far as BusID assignments.

 ./benchMT --lsgpu
benchMT workdir Path [ /home/keith/Downloads/benchMT-master/workdir/ ] does not exist, making...
TestData Path [ /home/keith/Downloads/benchMT-master/testData/ ] does not exist, making...
{}
GPU_ITEM: uuid: f362d887959541bc867d846b3e7b9389
    pcie_id: 08:00.0
    model: NVIDIA Corporation TU104 [GeForce RTX 2080 Rev. A] (rev a1)
    vendor: NVIDIA
    driver: nvidiafb, nouveau, nvidia_drm, nvidia
    openCL Device: None
    openCL Version: None
    card number: 0
    BOINC Device number: -1
    card path: /sys/class/drm/card0/device
    hwmon path: None
    Compute compatible: True
    Energy compatible: True
GPU_ITEM: uuid: 62f968aecbaa4a409a7f0f99ca21321b
    pcie_id: 0a:00.0
    model: NVIDIA Corporation TU104 [GeForce RTX 2080 Rev. A] (rev a1)
    vendor: NVIDIA
    driver: nvidiafb, nouveau, nvidia_drm, nvidia
    openCL Device: None
    openCL Version: None
    card number: 1
    BOINC Device number: -1
    card path: /sys/class/drm/card1/device
    hwmon path: None
    Compute compatible: True
    Energy compatible: True
GPU_ITEM: uuid: 5b3e43038d7c4e4395f6f9b488cefc91
    pcie_id: 0b:00.0
    model: NVIDIA Corporation GP104 [GeForce GTX 1080] (rev a1)
    vendor: NVIDIA
    driver: nvidiafb, nouveau, nvidia_drm, nvidia
    openCL Device: None
    openCL Version: None
    card number: 2
    BOINC Device number: -1
    card path: /sys/class/drm/card2/device
    hwmon path: None
    Compute compatible: True
    Energy compatible: True

Need to actually run some test WU's against the cards after I change out the benchMT executable.

KeithMyers commented 4 years ago

So what is the format of the dev map?

Ricks-Lab commented 4 years ago

So what is the format of the dev map?

I’m not at my computer, so from memory it is: --devmap b:c,b:c,b:c

Where b represents BOINC device number and c represents card number for each GPU.
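In other words (a hypothetical parsing sketch, not benchMT's actual code), --devmap 0:1,1:0 would become a dict keyed by BOINC device number:

def parse_devmap(arg):
    # Sketch: '--devmap b:c,b:c' -> {boinc_device: card_number}
    devmap = {}
    for pair in arg.split(','):
        boinc_dev, card_num = pair.split(':')
        devmap[int(boinc_dev)] = int(card_num)
    return devmap

print(parse_devmap('0:1,1:0'))   # {0: 1, 1: 0}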

Ricks-Lab commented 4 years ago

I just tested all three of my systems and found that the correlation between BOINC device and card number is not consistent. Here are the mappings I found:

Eos (2 Vega64 in abnormal slots, 1st not used): 0:1,1:0 
Nexon (1 Vega20 and 1 server GPU with no compute): 0:1
Rhea (4 Fiji Nano): 0:0,1:1,2:2,3:3

The best solution for mapping may be to manually determine it and store it in the BenchCFG. Maybe if we could ever find the relevant BOINC or MB app code, we could figure it out.

KeithMyers commented 4 years ago

I'm positive the code that labels the BOINC gpu number is in gpu_detect.cpp.

//  These libraries sometimes crash,
//  and we've been unable to trap these via signal and exception handlers.
//  So we do GPU detection in a separate process (boinc --detect_gpus)
//  This process writes an XML file "coproc_info.xml" containing
//  - lists of GPU detected via CUDA and CAL
//  - lists of nvidia/amd/intel GPUs detected via OpenCL
//  - a list of other GPUs detected via OpenCL
//
//  When the process finishes, the client parses the info file.
//  Then for each vendor it "correlates" the GPUs, which includes:
//  - matching up the OpenCL and vendor-specific descriptions, if both exist
//  - finding the most capable GPU, and seeing which other GPUs
//      are similar to it in hardware and RAM.
//      Other GPUs are not used.
//  - copy these to the COPROCS structure
 COPROC c;
        // For device types other than NVIDIA, ATI or Intel GPU.
        // we put each instance into a separate other_opencls element,
        // so count=1.
        //
        c.count = 1;
        c.opencl_device_count = 1;
        c.opencl_prop = other_opencls[i];
        c.available_ram = c.opencl_prop.global_mem_size;
        c.device_num = c.opencl_prop.device_num;
        c.peak_flops = c.opencl_prop.peak_flops;
        c.have_opencl = true;
        c.opencl_device_indexes[0] = c.opencl_prop.opencl_device_index;
        c.opencl_device_ids[0] = c.opencl_prop.device_id;
        c.instance_has_opencl[0] = true;
        c.clear_usage();
        safe_strcpy(c.type, other_opencls[i].name);

Coproc_info.xml contains the list of detected gpus in the host.

KeithMyers commented 4 years ago

./benchMT --lsgpu --devmap 0:0,1:1,2:2
{}
GPU_ITEM: uuid: 729c5147b42a456e9da0e87bd3da053a
    pcie_id: 08:00.0
    model: NVIDIA Corporation TU104 [GeForce RTX 2080 Rev. A] (rev a1)
    vendor: NVIDIA
    driver: nvidiafb, nouveau, nvidia_drm, nvidia
    openCL Device: None
    openCL Version: None
    card number: 0
    BOINC Device number: -1
    card path: /sys/class/drm/card0/device
    hwmon path: None
    Compute compatible: True
    Energy compatible: True
GPU_ITEM: uuid: 886bf6b5f8df41528f8c1abb4061436c
    pcie_id: 0a:00.0
    model: NVIDIA Corporation TU104 [GeForce RTX 2080 Rev. A] (rev a1)
    vendor: NVIDIA
    driver: nvidiafb, nouveau, nvidia_drm, nvidia
    openCL Device: None
    openCL Version: None
    card number: 1
    BOINC Device number: -1
    card path: /sys/class/drm/card1/device
    hwmon path: None
    Compute compatible: True
    Energy compatible: True
GPU_ITEM: uuid: 6f4990a868324600ade9505717b9f628
    pcie_id: 0b:00.0
    model: NVIDIA Corporation GP104 [GeForce GTX 1080] (rev a1)
    vendor: NVIDIA
    driver: nvidiafb, nouveau, nvidia_drm, nvidia
    openCL Device: None
    openCL Version: None
    card number: 2
    BOINC Device number: -1
    card path: /sys/class/drm/card2/device
    hwmon path: None
    Compute compatible: True
    Energy compatible: True

KeithMyers commented 4 years ago

But the GPU UUID is not even close to what Nvidia X Server Settings lists for the card UUIDs. It doesn't correlate at all.

Ricks-Lab commented 4 years ago

But the GPU UUID is not even close to what Nvidia X Server Settings lists for the card UUIDs. It doesn't correlate at all.

The uuid used by benchMT is generated by benchMT to ensure it is unique, and it is used as a key in the dictionary of GPUs.

Ricks-Lab commented 4 years ago

I just checked the coproc_info.xml on my system, and I don't see anything that links the device number to the card number or PCIe ID.

   <ati_opencl>
      <name>Vega 20</name>
      <vendor>Advanced Micro Devices, Inc.</vendor>
      <vendor_id>4098</vendor_id>
      <available>1</available>
      <half_fp_config>0</half_fp_config>
      <single_fp_config>191</single_fp_config>
      <double_fp_config>63</double_fp_config>
      <endian_little>1</endian_little>
      <execution_capabilities>1</execution_capabilities>
      <extensions>cl_khr_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_fp16 cl_khr_gl_sharing cl_amd_device_attribute_query cl_amd_media_ops cl_amd_media_ops2 cl_khr_image2d_from_buffer cl_khr_subgroups cl_khr_depth_images cl_amd_copy_buffer_p2p cl_amd_assembly_program </extensions>
      <global_mem_size>17163091968</global_mem_size>
      <local_mem_size>65536</local_mem_size>
      <max_clock_frequency>1801</max_clock_frequency>
      <max_compute_units>60</max_compute_units>
      <nv_compute_capability_major>0</nv_compute_capability_major>
      <nv_compute_capability_minor>0</nv_compute_capability_minor>
      <amd_simd_per_compute_unit>4</amd_simd_per_compute_unit>
      <amd_simd_width>16</amd_simd_width>
      <amd_simd_instruction_width>1</amd_simd_instruction_width>
      <opencl_platform_version>OpenCL 2.1 AMD-APP (3052.0)</opencl_platform_version>
      <opencl_device_version>OpenCL 2.0 </opencl_device_version>
      <opencl_driver_version>3052.0 (HSA1.1,LC)</opencl_driver_version>
      <device_num>0</device_num>
      <peak_flops>13831680000000.000000</peak_flops>
      <opencl_available_ram>17163091968.000000</opencl_available_ram>
      <opencl_device_index>0</opencl_device_index>
      <warn_bad_cuda>0</warn_bad_cuda>
   </ati_opencl>
<warning>NVIDIA: libcuda.so: cannot open shared object file: No such file or directory</warning>
<warning>ATI: libaticalrt.so: cannot open shared object file: No such file or directory</warning>
    </coprocs>
KeithMyers commented 4 years ago

The BOINC enumeration of cards is by capability. That is why I included the snippet of code from gpu_detect.cpp. That piece of code is how BOINC rates the capabilities of the installed gpus. The best crunching card is always going to be gpu:0, then the second best is gpu:1 and so on.

I'm not sure if that is the exact code, however. Somewhere Richard stated the order of capability was Compute Capability, Memory, GFLOPS rating. But that can get skewed from reality when a card from the previous generation with a lesser CC has a much higher GFLOPS rating compared to the most recent cards of higher CC capability but much lower GFLOPS rating because it is at the bottom of the product stack.

Ricks-Lab commented 4 years ago

The BOINC enumeration of cards is by capability. That is why I included the snippet of code from gpu_detect.cpp. That piece of code is how BOINC rates the capabilities of the installed gpus. The best crunching card is always going to be gpu:0, then the second best is gpu:1 and so on.

I'm not sure if that is the exact code, however. Somewhere Richard stated the order of capability was Compute Capability, Memory, GFLOPS rating. But that can get skewed from reality when a card from the previous generation with a lesser CC has a much higher GFLOPS rating compared to the most recent cards of higher CC capability but much lower GFLOPS rating because it is at the bottom of the product stack.

Can you send a link to where the file is? I assume this is part of the BOINC client code? I am concerned that even if we know that's what they do, if there is no linkage to card number or PCIe ID, then there may be no reliable way to automate it.

KeithMyers commented 4 years ago

It is scattered in all of the gpu_x.cpp files, as in gpu_intel.cpp, gpu_nvidia.cpp, gpu_opencl.cpp. The code for gpu_nvidia.cpp is here: https://github.com/BOINC/boinc/blob/master/client/gpu_nvidia.cpp#L136

// return 1/-1/0 if device 1 is more/less/same capable than device 2.
// factors (decreasing priority):
// - compute capability
// - software version
// - available memory
// - speed
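Expressed in Python, that comparator amounts to sorting on a descending priority tuple; a rough sketch with hypothetical field names (not the actual BOINC code):

def boinc_order(gpus):
    # Sketch: 'best' card first, using the priority from the gpu_nvidia.cpp comment:
    # compute capability, then software (driver) version, then available memory, then speed.
    key = lambda g: (g['compute_capability'], g['driver_version'],
                     g['available_ram'], g['peak_flops'])
    return sorted(gpus, key=key, reverse=True)

# The first element would be BOINC device 0, the next device 1, and so on.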

Ricks-Lab commented 4 years ago

@JStateson @KeithMyers I have a beta version of this major rewrite on master. I have tested on my systems and all looks good, but I have no nvidia GPUs, so I really need your help to test it out.

The main new feature is energy calculation for Nvidia, but to use it, you need a devmap. As discussed in this thread, we have not yet found a way to automate the mapping, but it is easy to generate manually. Here is my process:

  1. Make sure one MB application is un-commented in the benchMT CFG file.
  2. Copy one of the std_signals WUs into the WU_test directory; make sure it is the only WU file in that directory.
  3. Start a GPU monitor program and monitor loading of the cards.
  4. Run benchMT --gpu_device 0 and observe which card number has loading (see the polling sketch at the end of this comment).
  5. Repeat for all gpu_device values.
  6. Enter the mapping in the CFG file. There is an example for reference.

After the mapping is complete, run benchMT with the --energy option and post your results here. I suggest removing the std_signal WU from the WU_test directory and replacing it with one from the WU_test/safe directory.

I suggest running with --debug so you can see the power and energy readings and maybe compare with nvidia-smi results.
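For the Nvidia side of steps 4 and 5, a minimal polling sketch (assumes nvidia-smi is on the PATH; run it while benchMT --gpu_device N is working and watch which bus ID shows load):

import subprocess
import time

# Sketch: watch which physical card shows load while one BOINC device is busy.
cmd = 'nvidia-smi --query-gpu=index,pci.bus_id,utilization.gpu --format=csv,noheader'
for _ in range(10):
    print(subprocess.check_output(cmd, shell=True).decode().strip())
    print('---')
    time.sleep(5)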

JStateson commented 4 years ago

Been busy working on my BoincTasks temperature projects. I wanted to show wattage used as an option, as a first step toward trying to identify which card on the motherboard corresponds to a problem showing up on the BoincTasks display. Anyway, I got the following information that might be helpful:

  1. output from nvidia-smi: gpus number 1..9 no sorting with bus id
  2. output from clinfo: looks like the sorting is done in clinfo, as the output matches what is shown when BOINC boots up. I had thought that BOINC did the sorting but it seems clinfo does. (Correction: just looked at another system and clinfo did not match the BOINC order.)
  3. output from boinc

What I hope to accomplish: be able to look at a display of a problem work unit as shown by BoincTasks for any remote system running Linux, see the temps and wattage, be able to identify the card and, somehow, if the cards have identical names, find some identifying serial number or code programmatically so as to know which board has the problem. I have my own version of BOINC, "MSboinc", that I can build for Windows or Linux easily, and I will be adding these tools to it and, eventually, to my BoincTasks history reader. I want to eventually replace performance reports that say "d3" with something meaningful like "gtx-1660Ti".

jstateson@h110btc:~/Projects/BoincTasks$ nvidia-smi -L

GPU 0: GeForce GTX 1060 6GB (UUID: GPU-a2089043-23bd-3481-efb2-f3cbbce5906a)
GPU 1: GeForce GTX 1060 3GB (UUID: GPU-4b22e301-6b76-c4ad-f962-b40d3060dd20)
GPU 2: GeForce GTX 1060 3GB (UUID: GPU-c8eb4c00-ec6c-01de-d198-262fe9b93cb7)
GPU 3: P106-100 (UUID: GPU-df40a4fd-908f-f7cf-13aa-2654367aef88)
GPU 4: GeForce GTX 1070 (UUID: GPU-f3d9b16e-7878-e14b-3e43-71a37769e93a)
GPU 5: P106-090 (UUID: GPU-69fe3af3-2dfb-9cb1-3fc6-5650f3953f16)
GPU 6: GeForce GTX 1060 3GB (UUID: GPU-6c5723e9-6e00-dbd5-c890-ce278a20661e)
GPU 7: GeForce GTX 1060 3GB (UUID: GPU-a6c40c5f-d334-766a-6fa2-acfb8e572e88)
GPU 8: GeForce GTX 1060 3GB (UUID: GPU-d1137596-9cfa-a466-dcbb-0583a095bcdf)

jstateson@h110btc:~/Projects$ clinfo | grep "Device Topology" Device Topology (NV) PCI-E, 05:00.0 Device Topology (NV) PCI-E, 01:00.0 Device Topology (NV) PCI-E, 04:00.0 Device Topology (NV) PCI-E, 02:00.0 Device Topology (NV) PCI-E, 03:00.0 Device Topology (NV) PCI-E, 0a:00.0 Device Topology (NV) PCI-E, 0b:00.0 Device Topology (NV) PCI-E, 0e:00.0 Device Topology (NV) PCI-E, 08:00.0

jstateson@h110btc:~/Projects$ clinfo | grep "Device Name" Device Name GeForce GTX 1070 Device Name GeForce GTX 1060 6GB Device Name P106-100 Device Name GeForce GTX 1060 3GB Device Name GeForce GTX 1060 3GB Device Name GeForce GTX 1060 3GB Device Name GeForce GTX 1060 3GB Device Name GeForce GTX 1060 3GB Device Name P106-090 Device Name Intel(R) Gen9 HD Graphics NEO

jstateson@h110btc:~/Projects/BoincTasks/SystemdService$ nvidia-smi -q -d PIDS | grep "GPU 0"
GPU 00000000:01:00.0
GPU 00000000:02:00.0
GPU 00000000:03:00.0
GPU 00000000:04:00.0
GPU 00000000:05:00.0
GPU 00000000:08:00.0
GPU 00000000:0A:00.0
GPU 00000000:0B:00.0
GPU 00000000:0E:00.0

============ below from boinc event messages h110btc ============
5 CUDA: NVIDIA GPU 0: GeForce GTX 1070 (driver version 440.48, CUDA version 10.2, compute capability 6.1, 4096MB, 3972MB available, 6561 GFLOPS peak)
6 CUDA: NVIDIA GPU 1: GeForce GTX 1060 6GB (driver version 440.48, CUDA version 10.2, compute capability 6.1, 4096MB, 3974MB available, 4698 GFLOPS peak)
7 CUDA: NVIDIA GPU 2: P106-100 (driver version 440.48, CUDA version 10.2, compute capability 6.1, 4096MB, 3974MB available, 4374 GFLOPS peak)
8 CUDA: NVIDIA GPU 3: GeForce GTX 1060 3GB (driver version 440.48, CUDA version 10.2, compute capability 6.1, 3019MB, 2943MB available, 3936 GFLOPS peak)
9 CUDA: NVIDIA GPU 4: GeForce GTX 1060 3GB (driver version 440.48, CUDA version 10.2, compute capability 6.1, 3019MB, 2943MB available, 3936 GFLOPS peak)
10 CUDA: NVIDIA GPU 5: GeForce GTX 1060 3GB (driver version 440.48, CUDA version 10.2, compute capability 6.1, 3019MB, 2943MB available, 3936 GFLOPS peak)
11 CUDA: NVIDIA GPU 6: GeForce GTX 1060 3GB (driver version 440.48, CUDA version 10.2, compute capability 6.1, 3019MB, 2943MB available, 3936 GFLOPS peak)
12 CUDA: NVIDIA GPU 7: GeForce GTX 1060 3GB (driver version 440.48, CUDA version 10.2, compute capability 6.1, 3019MB, 2943MB available, 3936 GFLOPS peak)
13 CUDA: NVIDIA GPU 8: P106-090 (driver version 440.48, CUDA version 10.2, compute capability 6.1, 3022MB, 2965MB available, 1960 GFLOPS peak)
14 OpenCL: NVIDIA GPU 0: GeForce GTX 1070 (driver version 440.48.02, device version OpenCL 1.2 CUDA, 8120MB, 3972MB available, 6561 GFLOPS peak)
15 OpenCL: NVIDIA GPU 1: GeForce GTX 1060 6GB (driver version 440.48.02, device version OpenCL 1.2 CUDA, 6078MB, 3974MB available, 4698 GFLOPS peak)
16 OpenCL: NVIDIA GPU 2: P106-100 (driver version 440.48.02, device version OpenCL 1.2 CUDA, 6081MB, 3974MB available, 4374 GFLOPS peak)
17 OpenCL: NVIDIA GPU 3: GeForce GTX 1060 3GB (driver version 440.48.02, device version OpenCL 1.2 CUDA, 3019MB, 2943MB available, 3936 GFLOPS peak)
18 OpenCL: NVIDIA GPU 4: GeForce GTX 1060 3GB (driver version 440.48.02, device version OpenCL 1.2 CUDA, 3019MB, 2943MB available, 3936 GFLOPS peak)
19 OpenCL: NVIDIA GPU 5: GeForce GTX 1060 3GB (driver version 440.48.02, device version OpenCL 1.2 CUDA, 3019MB, 2943MB available, 3936 GFLOPS peak)
20 OpenCL: NVIDIA GPU 6: GeForce GTX 1060 3GB (driver version 440.48.02, device version OpenCL 1.2 CUDA, 3019MB, 2943MB available, 3936 GFLOPS peak)
21 OpenCL: NVIDIA GPU 7: GeForce GTX 1060 3GB (driver version 440.48.02, device version OpenCL 1.2 CUDA, 3019MB, 2943MB available, 3936 GFLOPS peak)
22 OpenCL: NVIDIA GPU 8: P106-090 (driver version 440.48.02, device version OpenCL 1.2 CUDA, 3022MB, 2965MB available, 1960 GFLOPS peak)
23 OpenCL: Intel GPU 0: Intel(R) Gen9 HD Graphics NEO (driver version 19.45.14764, device version OpenCL 2.1 NEO, 25449MB, 25449MB available, 221 GFLOPS peak)

from coproc_info.xml, and note that it matches clinfo (see the bus ID conversion sketch below)

jstateson@h110btc:/var/lib/boinc$ grep -i "bus" coproc_info.xml
   <bus_id>5</bus_id>
   <bus_id>1</bus_id>
   <bus_id>4</bus_id>
   <bus_id>2</bus_id>
   <bus_id>3</bus_id>
   <bus_id>10</bus_id>
   <bus_id>11</bus_id>
   <bus_id>14</bus_id>
   <bus_id>8</bus_id>
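
Worth noting when matching these: the <bus_id> values in coproc_info.xml are decimal, while lspci/clinfo print the PCIe bus in hex, so the 10, 11 and 14 here line up with 0a, 0b and 0e above. A minimal conversion sketch, assuming the device/function part is 00.0 as in all the listings above:

    # Convert the decimal <bus_id> values from coproc_info.xml into the hex
    # bus numbers used by lspci/clinfo (e.g. 10 -> 0a, 14 -> 0e).
    import re

    with open("/var/lib/boinc/coproc_info.xml") as xml:
        bus_ids = [int(b) for b in re.findall(r"<bus_id>(\d+)</bus_id>", xml.read())]

    for bus in bus_ids:
        print(f"bus_id {bus:>2} -> {bus:02x}:00.0")   # device/function assumed 00.0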

from clinfo on a tb85 system

jstateson@tb85-nvidia:~$ clinfo | grep "Device Name"
  Device Name                                     Intel(R) Xeon(R) CPU E3-1230 v3 @ 3.30GHz
  Device Name                                     GeForce GTX 1070
  Device Name                                     GeForce GTX 1660 Ti
  Device Name                                     P102-100
  Device Name                                     P102-100
  Device Name                                     P102-100
  Device Name                                     GeForce GTX 1070 Ti

boinc shows a different ordering: the 1660 Ti, then the 1070 Ti, then the P102s, then the 1070.
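
Coming back to the identification goal above: since several of these cards have identical names, one hedged way to get a programmatic identifier per board is to ask nvidia-smi for the UUID and serial together with the PCIe bus ID. The query fields below are standard nvidia-smi fields, though GeForce boards often report the serial as N/A:

    # Build a per-card identity table so BOINC/clinfo orderings can be matched
    # back to a physical board via PCIe bus ID or UUID.
    import csv
    import io
    import subprocess

    out = subprocess.check_output(
        ["nvidia-smi", "--query-gpu=index,name,uuid,serial,pci.bus_id",
         "--format=csv,noheader"], text=True)

    for index, name, uuid, serial, bus_id in csv.reader(io.StringIO(out), skipinitialspace=True):
        print(f"{index}: {name:<22} {bus_id}  {uuid}  serial={serial}")
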
Ricks-Lab commented 4 years ago

from clinfo https://stateson.net/images/tb85_clinfo.txt

@JStateson I noticed that this file has information on the nvidia compute platform, but my script is not picking it up. Can you post the output of clinfo --raw?

Ricks-Lab commented 4 years ago

@JStateson @KeithMyers I just made a change to benchMT to hopefully pick up Nvidia GPUs, based on what I think the clinfo --raw output would look like. Can you check with benchMT --lsgpu?
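
A minimal sketch, assuming clinfo --raw output like the dump further below, of how Nvidia devices and their PCIe IDs could be pulled from it (this is illustrative, not the actual benchMT code); the bus value is decimal, and the function digit is assumed to be 0:

    # Parse `clinfo --raw` for NVIDIA devices and reconstruct their PCIe IDs
    # from CL_DEVICE_PCI_BUS_ID_NV / CL_DEVICE_PCI_SLOT_ID_NV.
    import re
    import subprocess
    from collections import defaultdict

    raw = subprocess.check_output(["clinfo", "--raw"], text=True)
    devices = defaultdict(dict)

    for line in raw.splitlines():
        match = re.match(r"\[NV/(\d+)\]\s+(\S+)\s+(.*)", line)
        if match:
            dev, key, value = match.groups()
            devices[int(dev)][key] = value.strip()

    for dev, props in sorted(devices.items()):
        bus = int(props.get("CL_DEVICE_PCI_BUS_ID_NV", 0))
        slot = int(props.get("CL_DEVICE_PCI_SLOT_ID_NV", 0))
        pcie_id = f"{bus:02x}:{slot:02x}.0"   # function assumed to be 0
        print(f"[NV/{dev}] {props.get('CL_DEVICE_NAME')}  pcie_id={pcie_id}")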

KeithMyers commented 4 years ago
keith@Serenity:~/Downloads/benchMT-master$ ./benchMT --lsgpu
benchMT workdir Path [ /home/keith/Downloads/benchMT-master/workdir/ ] does not exist, making...
TestData Path [ /home/keith/Downloads/benchMT-master/testData/ ] does not exist, making...
GPU_ITEM: uuid: 627ddcf8f20443faa5d215bc8473e5cc
    pcie_id: 08:00.0
    model: NVIDIA Corporation TU104 [GeForce RTX 2080 Rev. A] (rev a1)
    vendor: NVIDIA
    driver: nvidiafb, nouveau, nvidia_drm, nvidia
    openCL Device: None
    openCL Version: None
    card number: 0
    BOINC Device number: None
    card path: /sys/class/drm/card0/device
    hwmon path: None
    Compute compatible: True
    Energy compatible: True
GPU_ITEM: uuid: 8343a2ba785a4dfebaaf628530f73435
    pcie_id: 0a:00.0
    model: NVIDIA Corporation TU104 [GeForce RTX 2080 Rev. A] (rev a1)
    vendor: NVIDIA
    driver: nvidiafb, nouveau, nvidia_drm, nvidia
    openCL Device: None
    openCL Version: None
    card number: 1
    BOINC Device number: None
    card path: /sys/class/drm/card1/device
    hwmon path: None
    Compute compatible: True
    Energy compatible: True
GPU_ITEM: uuid: d50e742a546c427793cdf3f774c3675c
    pcie_id: 0b:00.0
    model: NVIDIA Corporation GP104 [GeForce GTX 1080] (rev a1)
    vendor: NVIDIA
    driver: nvidiafb, nouveau, nvidia_drm, nvidia
    openCL Device: None
    openCL Version: None
    card number: 2
    BOINC Device number: None
    card path: /sys/class/drm/card2/device
    hwmon path: None
    Compute compatible: True
    Energy compatible: True
keith@Serenity:~/Downloads/benchMT-master$ 
Ricks-Lab commented 4 years ago

@KeithMyers Was this with the current benchMT on master? I made a change a few hours ago.

Also, can you post the output of clinfo --raw?

KeithMyers commented 4 years ago

Yes, this was with the new master updated ten minutes ago.

keith@Serenity:~$ clinfo --raw
#PLATFORMS                                        1
  CL_PLATFORM_NAME                                NVIDIA CUDA
  CL_PLATFORM_VENDOR                              NVIDIA Corporation
  CL_PLATFORM_VERSION                             OpenCL 1.2 CUDA 10.2.115
  CL_PLATFORM_PROFILE                             FULL_PROFILE
  CL_PLATFORM_EXTENSIONS                          cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64 cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_nv_copy_opts cl_nv_create_buffer cl_khr_int64_base_atomics cl_khr_int64_extended_atomics
  CL_PLATFORM_ICD_SUFFIX_KHR                      NV

[NV/*]   CL_PLATFORM_NAME                                NVIDIA CUDA
[NV/*] #DEVICES                                          3
[NV/0]   CL_DEVICE_NAME                                  GeForce RTX 2080
[NV/0]   CL_DEVICE_VENDOR                                NVIDIA Corporation
[NV/0]   CL_DEVICE_VENDOR_ID                             0x10de
[NV/0]   CL_DEVICE_VERSION                               OpenCL 1.2 CUDA
[NV/0]   CL_DRIVER_VERSION                               440.48.02
[NV/0]   CL_DEVICE_OPENCL_C_VERSION                      OpenCL C 1.2 
[NV/0]   CL_DEVICE_TYPE                                  CL_DEVICE_TYPE_GPU
[NV/0]   CL_DEVICE_PCI_BUS_ID_NV                         8
[NV/0]   CL_DEVICE_PCI_SLOT_ID_NV                        0
[NV/0]   CL_DEVICE_PROFILE                               FULL_PROFILE
[NV/0]   CL_DEVICE_AVAILABLE                             CL_TRUE
[NV/0]   CL_DEVICE_COMPILER_AVAILABLE                    CL_TRUE
[NV/0]   CL_DEVICE_LINKER_AVAILABLE                      CL_TRUE
[NV/0]   CL_DEVICE_MAX_COMPUTE_UNITS                     46
[NV/0]   CL_DEVICE_MAX_CLOCK_FREQUENCY                   1800
[NV/0]   CL_DEVICE_COMPUTE_CAPABILITY_MAJOR_NV           7
[NV/0]   CL_DEVICE_COMPUTE_CAPABILITY_MINOR_NV           5
[NV/0]   CL_DEVICE_PARTITION_MAX_SUB_DEVICES             1
[NV/0]   CL_DEVICE_PARTITION_PROPERTIES                  CL_NONE
[NV/0]   CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS              3
[NV/0]   CL_DEVICE_MAX_WORK_ITEM_SIZES                   1024 1024 64
[NV/0]   CL_DEVICE_MAX_WORK_GROUP_SIZE                   1024
[NV/0]   CL_KERNEL_PREFERRED_WORK_GROUP_SIZE_MULTIPLE    32
[NV/0]   CL_DEVICE_WARP_SIZE_NV                          32
[NV/0]   CL_DEVICE_PREFERRED_VECTOR_WIDTH_CHAR           1
[NV/0]   CL_DEVICE_NATIVE_VECTOR_WIDTH_CHAR              1
[NV/0]   CL_DEVICE_PREFERRED_VECTOR_WIDTH_SHORT          1
[NV/0]   CL_DEVICE_NATIVE_VECTOR_WIDTH_SHORT             1
[NV/0]   CL_DEVICE_PREFERRED_VECTOR_WIDTH_INT            1
[NV/0]   CL_DEVICE_NATIVE_VECTOR_WIDTH_INT               1
[NV/0]   CL_DEVICE_PREFERRED_VECTOR_WIDTH_LONG           1
[NV/0]   CL_DEVICE_NATIVE_VECTOR_WIDTH_LONG              1
[NV/0]   CL_DEVICE_PREFERRED_VECTOR_WIDTH_HALF           0
[NV/0]   CL_DEVICE_NATIVE_VECTOR_WIDTH_HALF              0
[NV/0]   CL_DEVICE_PREFERRED_VECTOR_WIDTH_FLOAT          1
[NV/0]   CL_DEVICE_NATIVE_VECTOR_WIDTH_FLOAT             1
[NV/0]   CL_DEVICE_PREFERRED_VECTOR_WIDTH_DOUBLE         1
[NV/0]   CL_DEVICE_NATIVE_VECTOR_WIDTH_DOUBLE            1
[NV/0]   CL_DEVICE_SINGLE_FP_CONFIG                      CL_FP_DENORM | CL_FP_INF_NAN | CL_FP_ROUND_TO_NEAREST | CL_FP_ROUND_TO_ZERO | CL_FP_ROUND_TO_INF | CL_FP_FMA | CL_FP_CORRECTLY_ROUNDED_DIVIDE_SQRT
[NV/0]   CL_DEVICE_DOUBLE_FP_CONFIG                      CL_FP_DENORM | CL_FP_INF_NAN | CL_FP_ROUND_TO_NEAREST | CL_FP_ROUND_TO_ZERO | CL_FP_ROUND_TO_INF | CL_FP_FMA
[NV/0]   CL_DEVICE_ADDRESS_BITS                          64
[NV/0]   CL_DEVICE_ENDIAN_LITTLE                         CL_TRUE
[NV/0]   CL_DEVICE_GLOBAL_MEM_SIZE                       8370061312
[NV/0]   CL_DEVICE_ERROR_CORRECTION_SUPPORT              CL_FALSE
[NV/0]   CL_DEVICE_MAX_MEM_ALLOC_SIZE                    2092515328
[NV/0]   CL_DEVICE_HOST_UNIFIED_MEMORY                   CL_FALSE
[NV/0]   CL_DEVICE_INTEGRATED_MEMORY_NV                  CL_FALSE
[NV/0]   CL_DEVICE_MIN_DATA_TYPE_ALIGN_SIZE              128
[NV/0]   CL_DEVICE_MEM_BASE_ADDR_ALIGN                   4096
[NV/0]   CL_DEVICE_GLOBAL_MEM_CACHE_TYPE                 CL_READ_WRITE_CACHE
[NV/0]   CL_DEVICE_GLOBAL_MEM_CACHE_SIZE                 1507328
[NV/0]   CL_DEVICE_GLOBAL_MEM_CACHELINE_SIZE             128
[NV/0]   CL_DEVICE_IMAGE_SUPPORT                         CL_TRUE
[NV/0]   CL_DEVICE_MAX_SAMPLERS                          32
[NV/0]   CL_DEVICE_IMAGE_MAX_BUFFER_SIZE                 268435456
[NV/0]   CL_DEVICE_IMAGE_MAX_ARRAY_SIZE                  2048
[NV/0]   CL_DEVICE_IMAGE2D_MAX_HEIGHT                    32768
[NV/0]   CL_DEVICE_IMAGE2D_MAX_WIDTH                     32768
[NV/0]   CL_DEVICE_IMAGE3D_MAX_HEIGHT                    16384
[NV/0]   CL_DEVICE_IMAGE3D_MAX_WIDTH                     16384
[NV/0]   CL_DEVICE_IMAGE3D_MAX_DEPTH                     16384
[NV/0]   CL_DEVICE_MAX_READ_IMAGE_ARGS                   256
[NV/0]   CL_DEVICE_MAX_WRITE_IMAGE_ARGS                  32
[NV/0]   CL_DEVICE_LOCAL_MEM_TYPE                        CL_LOCAL
[NV/0]   CL_DEVICE_LOCAL_MEM_SIZE                        49152
[NV/0]   CL_DEVICE_REGISTERS_PER_BLOCK_NV                65536
[NV/0]   CL_DEVICE_MAX_CONSTANT_ARGS                     9
[NV/0]   CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE              65536
[NV/0]   CL_DEVICE_MAX_PARAMETER_SIZE                    4352
[NV/0]   CL_DEVICE_QUEUE_PROPERTIES                      CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE | CL_QUEUE_PROFILING_ENABLE
[NV/0]   CL_DEVICE_PREFERRED_INTEROP_USER_SYNC           CL_FALSE
[NV/0]   CL_DEVICE_PROFILING_TIMER_RESOLUTION            1000
[NV/0]   CL_DEVICE_EXECUTION_CAPABILITIES                CL_EXEC_KERNEL
[NV/0]   CL_DEVICE_KERNEL_EXEC_TIMEOUT_NV                CL_TRUE
[NV/0]   CL_DEVICE_GPU_OVERLAP_NV                        CL_TRUE
[NV/0]   CL_DEVICE_ATTRIBUTE_ASYNC_ENGINE_COUNT_NV       3
[NV/0]   CL_DEVICE_PRINTF_BUFFER_SIZE                    1048576
[NV/0]   CL_DEVICE_BUILT_IN_KERNELS                      
[NV/0]   CL_DEVICE_EXTENSIONS                            cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64 cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_nv_copy_opts cl_nv_create_buffer cl_khr_int64_base_atomics cl_khr_int64_extended_atomics

[NV/1]   CL_DEVICE_NAME                                  GeForce RTX 2080
[NV/1]   CL_DEVICE_VENDOR                                NVIDIA Corporation
[NV/1]   CL_DEVICE_VENDOR_ID                             0x10de
[NV/1]   CL_DEVICE_VERSION                               OpenCL 1.2 CUDA
[NV/1]   CL_DRIVER_VERSION                               440.48.02
[NV/1]   CL_DEVICE_OPENCL_C_VERSION                      OpenCL C 1.2 
[NV/1]   CL_DEVICE_TYPE                                  CL_DEVICE_TYPE_GPU
[NV/1]   CL_DEVICE_PCI_BUS_ID_NV                         10
[NV/1]   CL_DEVICE_PCI_SLOT_ID_NV                        0
[NV/1]   CL_DEVICE_PROFILE                               FULL_PROFILE
[NV/1]   CL_DEVICE_AVAILABLE                             CL_TRUE
[NV/1]   CL_DEVICE_COMPILER_AVAILABLE                    CL_TRUE
[NV/1]   CL_DEVICE_LINKER_AVAILABLE                      CL_TRUE
[NV/1]   CL_DEVICE_MAX_COMPUTE_UNITS                     46
[NV/1]   CL_DEVICE_MAX_CLOCK_FREQUENCY                   1800
[NV/1]   CL_DEVICE_COMPUTE_CAPABILITY_MAJOR_NV           7
[NV/1]   CL_DEVICE_COMPUTE_CAPABILITY_MINOR_NV           5
[NV/1]   CL_DEVICE_PARTITION_MAX_SUB_DEVICES             1
[NV/1]   CL_DEVICE_PARTITION_PROPERTIES                  CL_NONE
[NV/1]   CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS              3
[NV/1]   CL_DEVICE_MAX_WORK_ITEM_SIZES                   1024 1024 64
[NV/1]   CL_DEVICE_MAX_WORK_GROUP_SIZE                   1024
[NV/1]   CL_KERNEL_PREFERRED_WORK_GROUP_SIZE_MULTIPLE    32
[NV/1]   CL_DEVICE_WARP_SIZE_NV                          32
[NV/1]   CL_DEVICE_PREFERRED_VECTOR_WIDTH_CHAR           1
[NV/1]   CL_DEVICE_NATIVE_VECTOR_WIDTH_CHAR              1
[NV/1]   CL_DEVICE_PREFERRED_VECTOR_WIDTH_SHORT          1
[NV/1]   CL_DEVICE_NATIVE_VECTOR_WIDTH_SHORT             1
[NV/1]   CL_DEVICE_PREFERRED_VECTOR_WIDTH_INT            1
[NV/1]   CL_DEVICE_NATIVE_VECTOR_WIDTH_INT               1
[NV/1]   CL_DEVICE_PREFERRED_VECTOR_WIDTH_LONG           1
[NV/1]   CL_DEVICE_NATIVE_VECTOR_WIDTH_LONG              1
[NV/1]   CL_DEVICE_PREFERRED_VECTOR_WIDTH_HALF           0
[NV/1]   CL_DEVICE_NATIVE_VECTOR_WIDTH_HALF              0
[NV/1]   CL_DEVICE_PREFERRED_VECTOR_WIDTH_FLOAT          1
[NV/1]   CL_DEVICE_NATIVE_VECTOR_WIDTH_FLOAT             1
[NV/1]   CL_DEVICE_PREFERRED_VECTOR_WIDTH_DOUBLE         1
[NV/1]   CL_DEVICE_NATIVE_VECTOR_WIDTH_DOUBLE            1
[NV/1]   CL_DEVICE_SINGLE_FP_CONFIG                      CL_FP_DENORM | CL_FP_INF_NAN | CL_FP_ROUND_TO_NEAREST | CL_FP_ROUND_TO_ZERO | CL_FP_ROUND_TO_INF | CL_FP_FMA | CL_FP_CORRECTLY_ROUNDED_DIVIDE_SQRT
[NV/1]   CL_DEVICE_DOUBLE_FP_CONFIG                      CL_FP_DENORM | CL_FP_INF_NAN | CL_FP_ROUND_TO_NEAREST | CL_FP_ROUND_TO_ZERO | CL_FP_ROUND_TO_INF | CL_FP_FMA
[NV/1]   CL_DEVICE_ADDRESS_BITS                          64
[NV/1]   CL_DEVICE_ENDIAN_LITTLE                         CL_TRUE
[NV/1]   CL_DEVICE_GLOBAL_MEM_SIZE                       8366784512
[NV/1]   CL_DEVICE_ERROR_CORRECTION_SUPPORT              CL_FALSE
[NV/1]   CL_DEVICE_MAX_MEM_ALLOC_SIZE                    2091696128
[NV/1]   CL_DEVICE_HOST_UNIFIED_MEMORY                   CL_FALSE
[NV/1]   CL_DEVICE_INTEGRATED_MEMORY_NV                  CL_FALSE
[NV/1]   CL_DEVICE_MIN_DATA_TYPE_ALIGN_SIZE              128
[NV/1]   CL_DEVICE_MEM_BASE_ADDR_ALIGN                   4096
[NV/1]   CL_DEVICE_GLOBAL_MEM_CACHE_TYPE                 CL_READ_WRITE_CACHE
[NV/1]   CL_DEVICE_GLOBAL_MEM_CACHE_SIZE                 1507328
[NV/1]   CL_DEVICE_GLOBAL_MEM_CACHELINE_SIZE             128
[NV/1]   CL_DEVICE_IMAGE_SUPPORT                         CL_TRUE
[NV/1]   CL_DEVICE_MAX_SAMPLERS                          32
[NV/1]   CL_DEVICE_IMAGE_MAX_BUFFER_SIZE                 268435456
[NV/1]   CL_DEVICE_IMAGE_MAX_ARRAY_SIZE                  2048
[NV/1]   CL_DEVICE_IMAGE2D_MAX_HEIGHT                    32768
[NV/1]   CL_DEVICE_IMAGE2D_MAX_WIDTH                     32768
[NV/1]   CL_DEVICE_IMAGE3D_MAX_HEIGHT                    16384
[NV/1]   CL_DEVICE_IMAGE3D_MAX_WIDTH                     16384
[NV/1]   CL_DEVICE_IMAGE3D_MAX_DEPTH                     16384
[NV/1]   CL_DEVICE_MAX_READ_IMAGE_ARGS                   256
[NV/1]   CL_DEVICE_MAX_WRITE_IMAGE_ARGS                  32
[NV/1]   CL_DEVICE_LOCAL_MEM_TYPE                        CL_LOCAL
[NV/1]   CL_DEVICE_LOCAL_MEM_SIZE                        49152
[NV/1]   CL_DEVICE_REGISTERS_PER_BLOCK_NV                65536
[NV/1]   CL_DEVICE_MAX_CONSTANT_ARGS                     9
[NV/1]   CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE              65536
[NV/1]   CL_DEVICE_MAX_PARAMETER_SIZE                    4352
[NV/1]   CL_DEVICE_QUEUE_PROPERTIES                      CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE | CL_QUEUE_PROFILING_ENABLE
[NV/1]   CL_DEVICE_PREFERRED_INTEROP_USER_SYNC           CL_FALSE
[NV/1]   CL_DEVICE_PROFILING_TIMER_RESOLUTION            1000
[NV/1]   CL_DEVICE_EXECUTION_CAPABILITIES                CL_EXEC_KERNEL
[NV/1]   CL_DEVICE_KERNEL_EXEC_TIMEOUT_NV                CL_TRUE
[NV/1]   CL_DEVICE_GPU_OVERLAP_NV                        CL_TRUE
[NV/1]   CL_DEVICE_ATTRIBUTE_ASYNC_ENGINE_COUNT_NV       3
[NV/1]   CL_DEVICE_PRINTF_BUFFER_SIZE                    1048576
[NV/1]   CL_DEVICE_BUILT_IN_KERNELS                      
[NV/1]   CL_DEVICE_EXTENSIONS                            cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64 cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_nv_copy_opts cl_nv_create_buffer cl_khr_int64_base_atomics cl_khr_int64_extended_atomics

[NV/2]   CL_DEVICE_NAME                                  GeForce GTX 1080
[NV/2]   CL_DEVICE_VENDOR                                NVIDIA Corporation
[NV/2]   CL_DEVICE_VENDOR_ID                             0x10de
[NV/2]   CL_DEVICE_VERSION                               OpenCL 1.2 CUDA
[NV/2]   CL_DRIVER_VERSION                               440.48.02
[NV/2]   CL_DEVICE_OPENCL_C_VERSION                      OpenCL C 1.2 
[NV/2]   CL_DEVICE_TYPE                                  CL_DEVICE_TYPE_GPU
[NV/2]   CL_DEVICE_PCI_BUS_ID_NV                         11
[NV/2]   CL_DEVICE_PCI_SLOT_ID_NV                        0
[NV/2]   CL_DEVICE_PROFILE                               FULL_PROFILE
[NV/2]   CL_DEVICE_AVAILABLE                             CL_TRUE
[NV/2]   CL_DEVICE_COMPILER_AVAILABLE                    CL_TRUE
[NV/2]   CL_DEVICE_LINKER_AVAILABLE                      CL_TRUE
[NV/2]   CL_DEVICE_MAX_COMPUTE_UNITS                     20
[NV/2]   CL_DEVICE_MAX_CLOCK_FREQUENCY                   1860
[NV/2]   CL_DEVICE_COMPUTE_CAPABILITY_MAJOR_NV           6
[NV/2]   CL_DEVICE_COMPUTE_CAPABILITY_MINOR_NV           1
[NV/2]   CL_DEVICE_PARTITION_MAX_SUB_DEVICES             1
[NV/2]   CL_DEVICE_PARTITION_PROPERTIES                  CL_NONE
[NV/2]   CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS              3
[NV/2]   CL_DEVICE_MAX_WORK_ITEM_SIZES                   1024 1024 64
[NV/2]   CL_DEVICE_MAX_WORK_GROUP_SIZE                   1024
[NV/2]   CL_KERNEL_PREFERRED_WORK_GROUP_SIZE_MULTIPLE    32
[NV/2]   CL_DEVICE_WARP_SIZE_NV                          32
[NV/2]   CL_DEVICE_PREFERRED_VECTOR_WIDTH_CHAR           1
[NV/2]   CL_DEVICE_NATIVE_VECTOR_WIDTH_CHAR              1
[NV/2]   CL_DEVICE_PREFERRED_VECTOR_WIDTH_SHORT          1
[NV/2]   CL_DEVICE_NATIVE_VECTOR_WIDTH_SHORT             1
[NV/2]   CL_DEVICE_PREFERRED_VECTOR_WIDTH_INT            1
[NV/2]   CL_DEVICE_NATIVE_VECTOR_WIDTH_INT               1
[NV/2]   CL_DEVICE_PREFERRED_VECTOR_WIDTH_LONG           1
[NV/2]   CL_DEVICE_NATIVE_VECTOR_WIDTH_LONG              1
[NV/2]   CL_DEVICE_PREFERRED_VECTOR_WIDTH_HALF           0
[NV/2]   CL_DEVICE_NATIVE_VECTOR_WIDTH_HALF              0
[NV/2]   CL_DEVICE_PREFERRED_VECTOR_WIDTH_FLOAT          1
[NV/2]   CL_DEVICE_NATIVE_VECTOR_WIDTH_FLOAT             1
[NV/2]   CL_DEVICE_PREFERRED_VECTOR_WIDTH_DOUBLE         1
[NV/2]   CL_DEVICE_NATIVE_VECTOR_WIDTH_DOUBLE            1
[NV/2]   CL_DEVICE_SINGLE_FP_CONFIG                      CL_FP_DENORM | CL_FP_INF_NAN | CL_FP_ROUND_TO_NEAREST | CL_FP_ROUND_TO_ZERO | CL_FP_ROUND_TO_INF | CL_FP_FMA | CL_FP_CORRECTLY_ROUNDED_DIVIDE_SQRT
[NV/2]   CL_DEVICE_DOUBLE_FP_CONFIG                      CL_FP_DENORM | CL_FP_INF_NAN | CL_FP_ROUND_TO_NEAREST | CL_FP_ROUND_TO_ZERO | CL_FP_ROUND_TO_INF | CL_FP_FMA
[NV/2]   CL_DEVICE_ADDRESS_BITS                          64
[NV/2]   CL_DEVICE_ENDIAN_LITTLE                         CL_TRUE
[NV/2]   CL_DEVICE_GLOBAL_MEM_SIZE                       8513978368
[NV/2]   CL_DEVICE_ERROR_CORRECTION_SUPPORT              CL_FALSE
[NV/2]   CL_DEVICE_MAX_MEM_ALLOC_SIZE                    2128494592
[NV/2]   CL_DEVICE_HOST_UNIFIED_MEMORY                   CL_FALSE
[NV/2]   CL_DEVICE_INTEGRATED_MEMORY_NV                  CL_FALSE
[NV/2]   CL_DEVICE_MIN_DATA_TYPE_ALIGN_SIZE              128
[NV/2]   CL_DEVICE_MEM_BASE_ADDR_ALIGN                   4096
[NV/2]   CL_DEVICE_GLOBAL_MEM_CACHE_TYPE                 CL_READ_WRITE_CACHE
[NV/2]   CL_DEVICE_GLOBAL_MEM_CACHE_SIZE                 983040
[NV/2]   CL_DEVICE_GLOBAL_MEM_CACHELINE_SIZE             128
[NV/2]   CL_DEVICE_IMAGE_SUPPORT                         CL_TRUE
[NV/2]   CL_DEVICE_MAX_SAMPLERS                          32
[NV/2]   CL_DEVICE_IMAGE_MAX_BUFFER_SIZE                 268435456
[NV/2]   CL_DEVICE_IMAGE_MAX_ARRAY_SIZE                  2048
[NV/2]   CL_DEVICE_IMAGE2D_MAX_HEIGHT                    32768
[NV/2]   CL_DEVICE_IMAGE2D_MAX_WIDTH                     16384
[NV/2]   CL_DEVICE_IMAGE3D_MAX_HEIGHT                    16384
[NV/2]   CL_DEVICE_IMAGE3D_MAX_WIDTH                     16384
[NV/2]   CL_DEVICE_IMAGE3D_MAX_DEPTH                     16384
[NV/2]   CL_DEVICE_MAX_READ_IMAGE_ARGS                   256
[NV/2]   CL_DEVICE_MAX_WRITE_IMAGE_ARGS                  16
[NV/2]   CL_DEVICE_LOCAL_MEM_TYPE                        CL_LOCAL
[NV/2]   CL_DEVICE_LOCAL_MEM_SIZE                        49152
[NV/2]   CL_DEVICE_REGISTERS_PER_BLOCK_NV                65536
[NV/2]   CL_DEVICE_MAX_CONSTANT_ARGS                     9
[NV/2]   CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE              65536
[NV/2]   CL_DEVICE_MAX_PARAMETER_SIZE                    4352
[NV/2]   CL_DEVICE_QUEUE_PROPERTIES                      CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE | CL_QUEUE_PROFILING_ENABLE
[NV/2]   CL_DEVICE_PREFERRED_INTEROP_USER_SYNC           CL_FALSE
[NV/2]   CL_DEVICE_PROFILING_TIMER_RESOLUTION            1000
[NV/2]   CL_DEVICE_EXECUTION_CAPABILITIES                CL_EXEC_KERNEL
[NV/2]   CL_DEVICE_KERNEL_EXEC_TIMEOUT_NV                CL_TRUE
[NV/2]   CL_DEVICE_GPU_OVERLAP_NV                        CL_TRUE
[NV/2]   CL_DEVICE_ATTRIBUTE_ASYNC_ENGINE_COUNT_NV       2
[NV/2]   CL_DEVICE_PRINTF_BUFFER_SIZE                    1048576
[NV/2]   CL_DEVICE_BUILT_IN_KERNELS                      
[NV/2]   CL_DEVICE_EXTENSIONS                            cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64 cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_nv_copy_opts cl_nv_create_buffer cl_khr_int64_base_atomics cl_khr_int64_extended_atomics

[OCLICD/*]   CL_ICDL_NAME                                    OpenCL ICD Loader
[OCLICD/*]   CL_ICDL_VENDOR                                  OCL Icd free software
[OCLICD/*]   CL_ICDL_VERSION                                 2.2.11
[OCLICD/*]   CL_ICDL_OCL_VERSION                             OpenCL 2.1
keith@Serenity:~$