pytorch-labs / gpt-fast

Simple and efficient pytorch-native transformer text generation in <1000 LOC of python.
BSD 3-Clause "New" or "Revised" License

AttributeError: torch._inductor.config.fx_graph_cache does not exist #51

Open chinmay29 opened 6 months ago

chinmay29 commented 6 months ago

I quantized the model to int8, and then running generate.py on the quantized checkpoint gave this error:

ubuntu@ip-172-31-19-240:~/gpt-fast$ python quantize.py --checkpoint_path checkpoints/$MODEL_REPO/model.pth --mode int8
Loading model ...
/opt/conda/lib/python3.10/site-packages/torch/_utils.py:831: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
  return self.fget.__get__(instance, owner)()
Quantizing model weights for int8 weight-only symmetric per-channel quantization
Writing quantized weights to checkpoints/openlm-research/open_llama_7b/model_int8.pth
Quantization complete took 24.35 seconds

ubuntu@ip-172-31-19-240:~/gpt-fast$ python generate.py --compile --checkpoint_path checkpoints/$MODEL_REPO/model_int8.pth
Traceback (most recent call last):
  File "/home/ubuntu/gpt-fast/generate.py", line 18, in <module>
    torch._inductor.config.fx_graph_cache = True # Experimental feature to reduce compilation times, will be on by default in future
  File "/opt/conda/lib/python3.10/site-packages/torch/_dynamo/config_utils.py", line 72, in __setattr__
    raise AttributeError(f"{self.__name__}.{name} does not exist")
AttributeError: torch._inductor.config.fx_graph_cache does not exist

System:

Welcome to Ubuntu 20.04.6 LTS (GNU/Linux 5.15.0-1049-aws x86_64)

thakkarparth007 commented 6 months ago

Did you run with pytorch nightly?
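
If switching builds is not possible right away, one possible workaround is to set the option only when the installed build accepts it. The sketch below is an illustration, not something taken from the repo: `StrictConfig` is a hypothetical stand-in for a config object that rejects unknown names (as the traceback shows `torch._inductor.config` doing), and `set_if_supported` is a made-up helper name.

```python
class StrictConfig:
    """Hypothetical stand-in for a config object that rejects unknown option names."""

    def __init__(self, known):
        # Bypass our own __setattr__ so the backing dict can be created.
        object.__setattr__(self, "_values", dict(known))

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        try:
            return self._values[name]
        except KeyError:
            raise AttributeError(f"{name} does not exist") from None

    def __setattr__(self, name, value):
        # Mirror the behaviour seen in the traceback: unknown names raise.
        if name not in self._values:
            raise AttributeError(f"{name} does not exist")
        self._values[name] = value


def set_if_supported(config, name, value):
    """Try to set config.<name>; return False instead of raising if it is unknown."""
    try:
        setattr(config, name, value)
    except AttributeError:
        return False
    return True


# An "older" config without the option, and a "newer" one that has it.
older = StrictConfig({"other_option": True})
newer = StrictConfig({"fx_graph_cache": False})

applied_on_older = set_if_supported(older, "fx_graph_cache", True)
applied_on_newer = set_if_supported(newer, "fx_graph_cache", True)
print(applied_on_older, applied_on_newer, newer.fx_graph_cache)
```

In generate.py the equivalent call would presumably be `set_if_supported(torch._inductor.config, "fx_graph_cache", True)`. According to the comment on that line, the option only reduces compilation times, so skipping it should not change the generated output. Other parts of the script may still depend on a newer build, so this is a stopgap at best.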