microsoft / BitBLAS

BitBLAS is a library to support mixed-precision matrix multiplications, especially for quantized LLM deployment.
MIT License

GH200 Support #60

sidereior closed this issue 6 days ago

sidereior commented 1 week ago

Hi all,

After building from source and resolving some dependency issues, it appears that BitBLAS does not include the GH200 in its list of available targets.

Here is the list of available targets which was observed:

+-------+-----------------------------------+
| Index | Target |
+-------+-----------------------------------+
| 1 | nvidia/geforce-gtx-765m |
| 2 | nvidia/geforce-gtx-460 |
| 3 | nvidia/geforce-rtx-2060 |
| 4 | nvidia/quadro-k510m |
| 5 | nvidia/nvs-4200m |
| 6 | nvidia/geforce-gtx-860m-sm-30 |
| 7 | nvidia/quadro-rtx-5000 |
| 8 | nvidia/geforce-gtx-880m |
| 9 | nvidia/geforce-rtx-3090 |
| 10 | nvidia/geforce-gtx-750-ti |
| 11 | nvidia/geforce-gtx-950m |
| 12 | nvidia/geforce-gt-640m |
| 13 | nvidia/geforce-gtx-titan-z |
| 14 | nvidia/quadro-p5000 |
| 15 | nvidia/geforce-gt-640m-le |
| 16 | nvidia/geforce-gtx-770m |
| 17 | nvidia/tegra-x1 |
| 18 | nvidia/p520 |
| 19 | nvidia/geforce-gt-720m |
| 20 | nvidia/quadro-k620m |
| 21 | nvidia/geforce-gt-630 |
| 22 | nvidia/quadro-p3000 |
| 23 | nvidia/geforce-800m |
| 24 | nvidia/geforce-gtx-980 |
| 25 | nvidia/quadro-k2000d |
| 26 | nvidia/geforce-gtx-660m |
| 27 | nvidia/geforce-gtx-560-ti |
| 28 | nvidia/geforce-gt-415m |
| 29 | nvidia/quadro-m2200 |
| 30 | nvidia/geforce-gtx-1070 |
| 31 | nvidia/geforce-gt-520m |
| 32 | nvidia/geforce-gtx-480m |
| 33 | nvidia/geforce-gt-430 |
| 34 | nvidia/tesla-p100 |
| 35 | nvidia/quadro-m6000-24gb |
| 36 | nvidia/quadro-plex-7000 |
| 37 | nvidia/geforce-gtx-675mx |
| 38 | nvidia/geforce-rtx-3070 |
| 39 | nvidia/geforce-gtx-690 |
| 40 | nvidia/nvs-5400m |
| 41 | nvidia/geforce-gtx-465 |
| 42 | nvidia/jetson-tx2 |
| 43 | nvidia/geforce-gtx-760 |
| 44 | nvidia/quadro-k4100m |
| 45 | nvidia/nvidia-nvs-510 |
| 46 | nvidia/rtx-4000 |
| 47 | nvidia/geforce-910m |
| 48 | nvidia/geforce-gtx-580 |
| 49 | nvidia/tesla-p4 |
| 50 | nvidia/quadro-k2100m |
| 51 | nvidia/geforce-gt-730-ddr3,128bit |
| 52 | nvidia/geforce-gtx-650 |
| 53 | nvidia/quadro-m3000m |
| 54 | nvidia/geforce-gtx-680mx |
| 55 | nvidia/nvidia-h100 |
| 56 | nvidia/geforce-gt-645m |
| 57 | nvidia/quadro-p500 |
| 58 | nvidia/geforce-gtx-470 |
| 59 | nvidia/geforce-gt-520mx |
| 60 | nvidia/quadro-m1000m |
| 61 | nvidia/geforce-gt-750m |
| 62 | nvidia/geforce-gtx-980-ti |
| 63 | nvidia/nvidia-v100 |
| 64 | nvidia/quadro-gp100 |
| 65 | nvidia/tesla-m60 |
| 66 | nvidia/tesla-k80 |
| 67 | nvidia/nvidia-a10g |
| 68 | nvidia/geforce-gt-445m |
| 69 | nvidia/geforce-gtx-960m |
| 70 | nvidia/geforce-gt-740 |
| 71 | nvidia/geforce-gtx-titan |
| 72 | nvidia/quadro-p5200 |
| 73 | nvidia/geforce-gtx-485m |
| 74 | nvidia/geforce-gtx-780m |
| 75 | nvidia/geforce-gt-550m |
| 76 | nvidia/quadro-gv100 |
| 77 | nvidia/geforce-gt-525m |
| 78 | nvidia/geforce-gtx-850m |
| 79 | nvidia/geforce-gt-640-gddr5 |
| 80 | nvidia/geforce-820m |
| 81 | nvidia/tesla-m40 |
| 82 | nvidia/geforce-710m |
| 83 | nvidia/quadro-k500m |
| 84 | nvidia/geforce-gtx-670m |
| 85 | nvidia/quadro-m500m |
| 86 | nvidia/quadro-k5000 |
| 87 | nvidia/nvidia-a100 |
| 88 | nvidia/geforce-gt-620 |
| 89 | nvidia/quadro-p3200 |
| 90 | nvidia/quadro-m5000 |
| 91 | nvidia/geforce-gtx-650-ti-boost |
| 92 | nvidia/geforce-gt-640-gddr3 |
| 93 | nvidia/quadro-m620 |
| 94 | nvidia/geforce-gtx-560m |
| 95 | nvidia/quadro-k5200 |
| 96 | nvidia/geforce-gtx-950 |
| 97 | nvidia/quadro-k420 |
| 98 | nvidia/geforce-gtx-1070-ti |
| 99 | nvidia/quadro-k620 |
| 100 | nvidia/geforce-gtx-770 |
| 101 | nvidia/nvidia-a40 |
| 102 | nvidia/quadro-k5200m |
| 103 | nvidia/quadro-m5500m |
| 104 | nvidia/geforce-rtx-2080-ti |
| 105 | nvidia/geforce-gt-755m |
| 106 | nvidia/geforce-gtx-590 |
| 107 | nvidia/quadro-k610m |
| 108 | nvidia/geforce-rtx-2070 |
| 109 | nvidia/geforce-gtx-965m |
| 110 | nvidia/geforce-gtx-660 |
| 111 | nvidia/quadro-rtx-8000 |
| 112 | nvidia/rtx-3000 |
| 113 | nvidia/tesla-k40 |
| 114 | nvidia/geforce-gtx-860m-sm-50 |
| 115 | nvidia/geforce-gtx-480 |
| 116 | nvidia/nvidia-nvs-315 |
| 117 | nvidia/geforce-gt-555m |
| 118 | nvidia/geforce-930m |
| 119 | nvidia/quadro-k1200 |
| 120 | nvidia/quadro-rtx-6000 |
| 121 | nvidia/geforce-gt-630m |
| 122 | nvidia/geforce-gt-635m |
| 123 | nvidia/quadro-m5000m |
| 124 | nvidia/geforce-gtx-675m |
| 125 | nvidia/geforce-gtx-970m |
| 126 | nvidia/t2000 |
| 127 | nvidia/geforce-gt-740m |
| 128 | nvidia/geforce-gtx-1080 |
| 129 | nvidia/geforce-gtx-1050 |
| 130 | nvidia/nvidia-titan-xp |
| 131 | nvidia/nvidia-titan-v |
| 132 | nvidia/quadro-rtx-4000 |
| 133 | nvidia/jetson-tx1 |
| 134 | nvidia/geforce-gt-435m |
| 135 | nvidia/geforce-gt-730 |
| 136 | nvidia/tesla-k20 |
| 137 | nvidia/geforce-gt-705 |
| 138 | nvidia/quadro-p4000 |
| 139 | nvidia/geforce-gt-540m |
| 140 | nvidia/geforce-gtx-680m |
| 141 | nvidia/geforce-gtx-550-ti |
| 142 | nvidia/gtx1080ti |
| 143 | nvidia/geforce-rtx-3080 |
| 144 | nvidia/tesla-c2050 |
| 145 | nvidia/quadro-p2000 |
| 146 | nvidia/geforce-gt-620m |
| 147 | nvidia/quadro-p6000 |
| 148 | nvidia/nvidia-nvs-810 |
| 149 | nvidia/geforce-gtx-570m |
| 150 | nvidia/geforce-gtx-titan-x |
| 151 | nvidia/geforce-gtx-960 |
| 152 | nvidia/geforce-rtx-4070-ti |
| 153 | nvidia/jetson-nano |
| 154 | nvidia/geforce-gt-610 |
| 155 | nvidia/rtx-a6000 |
| 156 | nvidia/quadro-m1200 |
| 157 | nvidia/geforce-gt-420m |
| 158 | nvidia/geforce-gtx-780 |
| 159 | nvidia/geforce-gtx-460m |
| 160 | nvidia/geforce-gtx-1060 |
| 161 | nvidia/nvidia-a16 |
| 162 | nvidia/quadro-k6000m |
| 163 | nvidia/geforce-rtx-3090-ti |
| 164 | nvidia/tesla-p40 |
| 165 | nvidia/geforce-940m |
| 166 | nvidia/geforce-gts-450 |
| 167 | nvidia/geforce-gtx-670 |
| 168 | nvidia/p620 |
| 169 | nvidia/quadro-k3100m |
| 170 | nvidia/nvs-5200m |
| 171 | nvidia/geforce-gtx-780-ti |
| 172 | nvidia/geforce-gt-745m |
| 173 | nvidia/geforce-610m |
| 174 | nvidia/geforce-rtx-3060 |
| 175 | nvidia/quadro-p600 |
| 176 | nvidia/quadro-k1100m |
| 177 | nvidia/quadro-m2000m |
| 178 | nvidia/quadro-k2000 |
| 179 | nvidia/quadro-k4000 |
| 180 | nvidia/geforce-410m |
| 181 | nvidia/quadro-p400 |
| 182 | nvidia/geforce-gtx-980m |
| 183 | nvidia/quadro-m520 |
| 184 | nvidia/quadro-m4000m |
| 185 | nvidia/geforce-gt-650m |
| 186 | nvidia/nvidia-titan-rtx |
| 187 | nvidia/geforce-gt-625m |
| 188 | nvidia/geforce-rtx-4090 |
| 189 | nvidia/geforce-gtx-660-ti |
| 190 | nvidia/geforce-gtx-870m |
| 191 | nvidia/quadro-k5100m |
| 192 | nvidia/geforce-gt-730m |
| 193 | nvidia/geforce-705m |
| 194 | nvidia/quadro-p4200 |
| 195 | nvidia/tesla-c2070 |
| 196 | nvidia/geforce-920m |
| 197 | nvidia/quadro-k6000 |
| 198 | nvidia/quadro-k2200m |
| 199 | nvidia/geforce-gtx-760m |
| 200 | nvidia/quadro-p2200 |
| 201 | nvidia/geforce-gt-720 |
| 202 | nvidia/nvidia-a2 |
| 203 | nvidia/quadro-k4200m |
| 204 | nvidia/nvidia-t4 |
| 205 | nvidia/geforce-gtx-580m |
| 206 | nvidia/geforce-gtx-970 |
| 207 | nvidia/geforce-rtx-2080 |
| 208 | nvidia/geforce-gt-440 |
| 209 | nvidia/geforce-rtx-3070-ti |
| 210 | nvidia/nvidia-nvs-310 |
| 211 | nvidia/geforce-gtx-650-ti |
| 212 | nvidia/geforce-gt-520 |
| 213 | nvidia/quadro-m2000 |
| 214 | nvidia/geforce-840m |
| 215 | nvidia/t1000 |
| 216 | nvidia/geforce-gtx-470m |
| 217 | nvidia/tesla-c2075 |
| 218 | nvidia/quadro-m4000 |
| 219 | nvidia/rtx-5000 |
| 220 | nvidia/tesla-k10 |
| 221 | nvidia/geforce-gtx-680 |
| 222 | nvidia/nvidia-a30 |
| 223 | nvidia/geforce-gtx-1080-ti |
| 224 | nvidia/geforce-gtx-titan-black |
| 225 | nvidia/geforce-rtx-3080-ti |
| 226 | nvidia/geforce-gtx-750 |
| 227 | nvidia/geforce-830m |
| 228 | nvidia/quadro-k4200 |
| 229 | nvidia/nvidia-a10 |
| 230 | nvidia/quadro-p620 |
| 231 | nvidia/geforce-gtx-570 |
| 232 | nvidia/quadro-m600m |
| 233 | nvidia/quadro-410 |
| 234 | nvidia/quadro-m6000 |
| 235 | nvidia/quadro-k2200 |
| 236 | nvidia/geforce-gt-735m |
| 237 | nvidia/geforce-gtx-670mx |
| 238 | nvidia/quadro-p1000 |
| 239 | nvidia/quadro-k600 |
| 240 | nvidia/nvidia-titan-x |
+-------+-----------------------------------+

Thanks for the help!

LeiWang1999 commented 1 week ago

Hi @sidereior, the registered targets you listed provide extra information for search-space generation (for example, the shared memory capacity and the L2 cache capacity). If your GPU does not match a registered target, a default cuda target is used instead, which carries some default configurations. Therefore, you can still use BitBLAS even if the target is not registered.
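
To make the fallback behaviour concrete, here is a minimal sketch of the idea in plain Python. It is not BitBLAS's actual detection code: `REGISTERED_TARGETS`, `normalize`, `pick_target`, and the 0.9 similarity cutoff are all hypothetical names and values chosen for illustration, and the matching uses the standard-library `difflib` rather than whatever BitBLAS uses internally. It only shows the pattern of "use a registered tag if the GPU name matches one closely, otherwise fall back to a generic cuda target".

```python
import difflib

# Illustrative subset of the registered tags from the list in this issue.
REGISTERED_TARGETS = [
    "nvidia/nvidia-h100",
    "nvidia/nvidia-a100",
    "nvidia/nvidia-v100",
    "nvidia/geforce-rtx-4090",
    "nvidia/geforce-rtx-3090",
]

# Generic fallback used when no registered tag matches (assumed name).
DEFAULT_TARGET = "cuda"


def normalize(gpu_name: str) -> str:
    """Turn a device name such as 'NVIDIA H100' into a tag-like string."""
    return "nvidia/" + gpu_name.strip().lower().replace(" ", "-")


def pick_target(gpu_name: str, registered=REGISTERED_TARGETS, cutoff: float = 0.9) -> str:
    """Return the closest registered tag, or the default target if none is close enough."""
    candidate = normalize(gpu_name)
    matches = difflib.get_close_matches(candidate, registered, n=1, cutoff=cutoff)
    return matches[0] if matches else DEFAULT_TARGET


if __name__ == "__main__":
    for name in ("NVIDIA H100", "GeForce RTX 4090", "NVIDIA GH200 480GB"):
        print(name, "->", pick_target(name))
```

In this sketch a GH200 device name is not close enough to any registered tag, so the generic target is chosen. The practical consequence described above is that tuning then runs with default hardware parameters instead of GPU-specific ones such as shared memory and L2 cache capacity.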