H100 Support

spcl / QuaRot

Code for the NeurIPS 2024 paper: QuaRot, end-to-end 4-bit inference of large language models.

Apache License 2.0

278 stars, 20 forks

Does the code in this repo support the H100? I get the following error when I try to run it on an H100:

  File "/home/carlguo/QuaRot/quarot/nn/linear.py", line 50, in forward
    x = quarot.matmul(x, self.weight)
  File "/home/carlguo/QuaRot/quarot/__init__.py", line 41, in matmul
    return quarot._CUDA.matmul(A, B).view(*A_shape_excl_last, *B_shape_excl_last)
RuntimeError: CUDA error: no kernel image is available for execution on the device
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
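
For context, "no kernel image is available" generally means the compiled extension contains no machine code for the GPU's compute capability (the H100 is compute capability 9.0, i.e. sm_90). The sketch below is a small diagnostic and not part of this repo: `arch_tag`, `can_run_sass`, and `report_device` are hypothetical helper names, and the check is a simplification that ignores PTX entries, which the driver may be able to JIT-compile. Note that `torch.cuda.get_arch_list()` reports what PyTorch itself was built for, not what the `quarot._CUDA` extension was built for, so it is only a hint.

```python
def arch_tag(capability):
    """Format a (major, minor) compute capability as an 'sm_XY' tag."""
    major, minor = capability
    return f"sm_{major}{minor}"


def can_run_sass(capability, compiled_archs):
    """Return True if any compiled 'sm_XY' binary can run on this device.

    CUDA binaries are compatible within a major version: code built for
    X.y runs on X.z when z >= y, but never across major versions.
    """
    major, minor = capability
    for tag in compiled_archs:
        digits = tag[3:] if tag.startswith("sm_") else ""
        if len(digits) < 2 or not digits.isdigit():
            continue  # skip PTX ('compute_XY') and suffixed tags such as 'sm_90a'
        if int(digits[:-1]) == major and int(digits[-1]) <= minor:
            return True
    return False


def report_device():
    """Print the current GPU's capability next to PyTorch's own arch list."""
    import torch  # imported lazily so the helpers above work without PyTorch

    if not torch.cuda.is_available():
        print("No CUDA device visible")
        return
    cap = torch.cuda.get_device_capability()
    archs = torch.cuda.get_arch_list()
    print("device:", torch.cuda.get_device_name(), arch_tag(cap))
    print("PyTorch built for:", archs)
    print("native binary available in PyTorch build:", can_run_sass(cap, archs))
```

Calling `report_device()` on the failing machine shows which capability the device reports. If the extension was compiled only for older architectures such as sm_80, rebuilding it with sm_90 included is the usual remedy. Whether this repo's build script picks up `TORCH_CUDA_ARCH_LIST` or hard-codes its architectures is an assumption that needs checking against its `setup.py`.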

H100 Support #48