Closed: ohashi3399 closed this 7 months ago
I confirmed that model = torch.compile(model, mode="reduce-overhead") degrades generation quality, so I removed it.
You can't use an arbitrary model with generate; you need to use the model in model.py.
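To illustrate the point above, here is a minimal sketch of an early check that a model exposes the setup_caches method before it is handed to generation code. Only the setup_caches name and its max_batch_size / max_seq_length arguments come from this thread; ensure_cache_api, PlainModel, and CachedModel are hypothetical names invented for the example, and the stand-in classes do not reproduce what model.py actually does.

```python
def ensure_cache_api(model):
    """Raise early if `model` lacks the setup_caches method discussed in this thread."""
    if not callable(getattr(model, "setup_caches", None)):
        raise TypeError(
            f"{type(model).__name__} has no setup_caches(); "
            "use the model class defined in model.py instead"
        )
    return model


class PlainModel:
    """Hypothetical stand-in for an arbitrary model without cache support."""


class CachedModel:
    """Hypothetical stand-in for a model that exposes setup_caches."""

    def __init__(self):
        self.cache_shape = None

    def setup_caches(self, max_batch_size, max_seq_length):
        # Only record the requested dimensions; this stand-in allocates nothing.
        self.cache_shape = (max_batch_size, max_seq_length)


# A model with the expected method passes the check and can set up its caches.
model = ensure_cache_api(CachedModel())
model.setup_caches(max_batch_size=1, max_seq_length=512)

# An arbitrary model without the method is rejected before generation starts.
try:
    ensure_cache_api(PlainModel())
    plain_ok = True
except TypeError:
    plain_ok = False
```

A check like this only turns a missing method into a clearer error message; it does not make an arbitrary model compatible with generate.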
I got it, Chillee-san! Your advice solved my mistake. I appreciate your help.
Thanks for your valuable work implementing these tricks! I hit the error shown in the title when running the following code. Does everyone use the setup_caches method? I suspect I am using it the wrong way. My environment is as follows:

model = torch.compile(model, mode="reduce-overhead")
with torch.device(model.device):
    model.setup_caches(max_batch_size=1, max_seq_length=512)