WIP: Fix autotune not using device

fpgaminer / GPTQ-triton

GPTQ inference Triton kernel

Apache License 2.0

284 stars 23 forks

Closed by Qubitium 1 year ago

Qubitium commented 1 year ago

Trying to fix Triton model loading on multi-GPU setups (cuda:0, cuda:1, etc.). Please note that this PR is not complete and does not resolve the underlying issue.
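For context, here is a minimal sketch of one common workaround for this kind of problem. It is not the change made in this PR. It assumes the failure comes from the kernel being tuned or launched on the current CUDA device rather than on the device that holds the tensors. `run_on_tensor_device` and `current_device_probe` are hypothetical names used only for illustration; the probe stands in for a real Triton kernel launch.

```python
import contextlib

import torch


def run_on_tensor_device(fn, x, *args, **kwargs):
    """Call fn(x, ...) with x's GPU selected as the current CUDA device.

    Hypothetical helper: it assumes the wrapped call picks up whatever
    CUDA device is current at launch time.
    """
    if x.is_cuda:
        # torch.cuda.device switches the current device and restores the
        # previous one when the block exits.
        ctx = torch.cuda.device(x.device)
    else:
        # CPU tensors: there is no CUDA device to switch to.
        ctx = contextlib.nullcontext()
    with ctx:
        return fn(x, *args, **kwargs)


def current_device_probe(x):
    """Stand-in for a kernel launch: report the device a launch would see."""
    return torch.cuda.current_device() if x.is_cuda else None
```

With a tensor on `cuda:1`, `run_on_tensor_device(current_device_probe, x)` should report index 1 even when the process default is `cuda:0`. Whether this is enough for the autotuner in this repo is untested here.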