sverneka opened this issue 1 year ago
@stas00 Can you please help me with this? Thanks!
I'm not quite sure why you're tagging me here as I am not part of this project and I have no idea what code you're talking about.
If it's a `transformers` question, please ask at https://github.com/huggingface/transformers/issues and give the full context of the issue.
Thank you.
The same issue occurs while fine-tuning Flan-T5 with LoRA and bnb int-8 on a summarisation dataset using one A100 40GB: inference takes a long time, while training is very fast. Any solution? Thank you!
This doesn't seem like an issue to me. Have you tried running inference after the training is done, and adjusting the parameters?
I have seen the same problem, and I got warning messages like:

```
Invalidate trace cache @ step 0: expected module 2, but got module 0
```
I tried running the code as-is: it trains, and at the end of each epoch it runs inference on the test set. Inference took too long, and GPU memory and utilization were maxed out on a p4dn.24x instance (8× A100 40GB). Surprisingly, training was much faster than inference! Any idea how to fix this? Thanks!
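For what it's worth, part of the gap is expected: a training step scores every target token in a single teacher-forced forward pass, while autoregressive `generate` runs one decoder pass per emitted token. A toy sketch of the pass counts (illustrative only, not the actual Flan-T5 code; `summary_len` is a hypothetical value):

```python
# Toy illustration: why autoregressive generation can be much slower
# than training on the same sequences. This only counts decoder
# forward passes; it is not the actual Flan-T5/LoRA code.

def forward_passes_training(target_len: int) -> int:
    # Teacher forcing: the loss for every target position is computed
    # in one forward pass over the whole sequence.
    return 1

def forward_passes_generation(target_len: int) -> int:
    # Autoregressive decoding: each new token requires another
    # decoder forward pass conditioned on the tokens so far.
    return target_len

if __name__ == "__main__":
    summary_len = 128  # hypothetical summary length in tokens
    print(forward_passes_training(summary_len))
    print(forward_passes_generation(summary_len))
```

So a 128-token summary costs on the order of 128 decoder passes at inference time versus one at training time, before any additional slowdown from int-8 overhead or the trace-cache invalidation warning above.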