AttributeError: 'LlamaForCausalLM' object has no attribute 'setup_caches'

ohashi3399 commented 7 months ago

Thanks for your valuable work implementing these tricks! I hit the error in the title when running the following code. Is everyone using the setup_caches method? I suspect I am using it the wrong way. My environment is below:

transformers==4.35.2

torch==2.0.1+cu117


model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",
    trust_remote_code=True,
    load_in_4bit=True,
    torch_dtype="auto",
    use_cache=True,
)

model = torch.compile(model, mode="reduce-overhead")

with torch.device(model.device):
    model.setup_caches(
        max_batch_size=1,
        max_seq_length=512,
    )


Thanks in advance.

ohashi3399 commented 7 months ago

I confirmed that model = torch.compile(model, mode="reduce-overhead") degrades generation quality. I removed it.

Chillee commented 7 months ago

You can't use an arbitrary model with generate - you need to use the model in model.py.
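
For anyone else hitting this, here is a minimal sketch of why the call fails and how a guard could make the mismatch obvious earlier. The two classes are hypothetical stand-ins written for this illustration (they are not the real LlamaForCausalLM or the class in model.py), and setup_caches_or_explain is a made-up helper name. The only assumption taken from the thread is that the expected model exposes setup_caches(max_batch_size, max_seq_length) and the loaded model does not.

```python
class HFStyleModel:
    """Hypothetical stand-in for a model class that lacks setup_caches."""

    def forward(self, tokens):
        return tokens


class GptFastStyleModel:
    """Hypothetical stand-in for a model that defines setup_caches."""

    def __init__(self):
        self.max_batch_size = None
        self.max_seq_length = None

    def setup_caches(self, max_batch_size, max_seq_length):
        # A real model would allocate its KV caches here; this stand-in
        # only records the requested sizes.
        self.max_batch_size = max_batch_size
        self.max_seq_length = max_seq_length


def setup_caches_or_explain(model, max_batch_size, max_seq_length):
    """Call model.setup_caches, or raise an error that names the real problem."""
    setup = getattr(model, "setup_caches", None)
    if not callable(setup):
        raise TypeError(
            f"{type(model).__name__} does not define setup_caches; "
            "the calling code expects a model class that provides it."
        )
    setup(max_batch_size=max_batch_size, max_seq_length=max_seq_length)
    return model


# Works: the stand-in defines the method.
ok = setup_caches_or_explain(GptFastStyleModel(), max_batch_size=1, max_seq_length=512)

# Fails with a descriptive TypeError instead of a bare AttributeError.
try:
    setup_caches_or_explain(HFStyleModel(), max_batch_size=1, max_seq_length=512)
except TypeError as err:
    message = str(err)
```

The guard does not make an arbitrary model compatible. It only reports the mismatch up front, so the fix is still to load the model class that the generation code was written for.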

ohashi3399 commented 7 months ago

I got it, Chillee-san! Your advice solved my mistake. I appreciate your help.

pytorch-labs / gpt-fast

AttributeError: 'LlamaForCausalLM' object has no attribute 'setup_caches' #11