fpgaminer / GPTQ-triton

GPTQ inference Triton kernel
Apache License 2.0
284 stars, 23 forks

WIP: Fix autotune not using device #19

Closed: Qubitium closed this 1 year ago

Qubitium commented 1 year ago

This PR attempts to fix Triton model loading on multi-GPU setups, where devices are addressed as cuda:0, cuda:1, etc. Please note that this PR is not complete and does not resolve the underlying issue.