pytorch-labs / gpt-fast

Simple and efficient pytorch-native transformer text generation in <1000 LOC of python.

What is `torch.ops.aten._convert_weight_to_int4pack` ? #58

Closed · vgoklani closed this 5 months ago

vgoklani commented 6 months ago

I'm using `torch.__version__` = `2.1.0a0+32f93b1`, which doesn't have this op:

`AttributeError: '_OpNamespace' 'aten' object has no attribute '_convert_weight_to_int4pack'`

What exactly does this op do, and is it defined elsewhere?

Unfortunately, upgrading to the latest torch dev build breaks FlashAttention-2.

yifuwang commented 6 months ago

One possible reason for this issue is a CPU-only build. Could you check `torch.cuda.is_available()` just to be sure?
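A quick way to run that check, and to confirm whether the op itself is registered, is the minimal diagnostic below. The `hasattr` lookup works because a missing op raises `AttributeError` on the `aten` namespace, exactly as in the traceback above:

```python
import torch

print(torch.__version__)          # build string, e.g. 2.1.0a0+32f93b1
print(torch.cuda.is_available())  # False would indicate a CPU-only build

# A missing op raises AttributeError on the namespace (as in the
# traceback above), so hasattr doubles as an availability check.
print(hasattr(torch.ops.aten, "_convert_weight_to_int4pack"))
```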

vgoklani commented 6 months ago

Thanks for the response:

`torch.cuda.is_available()` = `True`

No problems training either.

Also note I'm using this Docker image: `nvcr.io/nvidia/pytorch:23.10-py3`

yifuwang commented 6 months ago

Ah, you need the PyTorch nightly build for this repo; the `2.1.0a0` build in that container predates this op.
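For reference, a typical nightly install (assuming CUDA 12.1; swap the index URL for your CUDA version) is `pip3 install --pre torch --index-url https://download.pytorch.org/whl/nightly/cu121`. Note that installing this inside the NGC container will replace its bundled torch build.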

ZipECHO commented 6 months ago

Have you solved this problem? I hit the same error when quantizing Llama 7B to int4.

HDCharles commented 5 months ago

You should use the nightly version of torch, or at least the recent 2.2 branch cut; it's a newish op that was added for int4 support.
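As for what the op does: it repacks an already-quantized 4-bit weight into the tiled layout that the fused int4 weight-only matmul kernel consumes. Below is a minimal sketch modeled on how this repo's quantize.py calls it; the expected input dtype and the shapes have shifted across torch versions, so the int32 storage and sizes here are illustrative assumptions, not a stable contract:

```python
import torch

# Illustrative sizes (assumptions): an n x k weight whose 4-bit values
# (0..15) are stored as int32, as in the 2.2-era API.
n, k, inner_k_tiles = 4096, 4096, 8
w_q = torch.randint(0, 16, (n, k), dtype=torch.int32, device="cuda")

# Repack the quantized weight into the tiled layout that the fused
# int4 weight-only matmul kernel expects.
w_int4pack = torch.ops.aten._convert_weight_to_int4pack(w_q, inner_k_tiles)

# The packed tensor is later consumed, together with per-group scales
# and zero points, by torch.ops.aten._weight_int4pack_mm at inference.
print(w_int4pack.shape, w_int4pack.dtype)
```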