Closed: dataai1205 closed this issue 1 week ago
Fine-tuned Llama 3.1 8B using LoRA, but the model's inference speed got extremely slow. How can I fix this? What are the possible reasons?
Please check the model size, the datatype it is loaded in, and the model architecture. Common causes are loading the fine-tuned model in a higher-precision dtype (e.g. fp32 instead of bf16/fp16) and leaving the LoRA adapter unmerged, so every forward pass pays for the extra low-rank matmuls.
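If it helps, here is a minimal sketch assuming a Hugging Face transformers + peft setup; the model id and adapter path are placeholders. It loads the base model in bf16 and merges the LoRA adapter so inference runs on plain merged weights:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Llama-3.1-8B"   # placeholder base model id
adapter_dir = "./lora-adapter"        # placeholder LoRA adapter directory

# Load the base model in bf16; loading in fp32 roughly doubles memory
# traffic and can noticeably slow down generation.
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(base_id)

# Attach the LoRA adapter, then merge it into the base weights so the
# forward pass no longer computes the adapter's extra matmuls.
model = PeftModel.from_pretrained(base, adapter_dir)
model = model.merge_and_unload()
model.eval()

inputs = tokenizer("Hello", return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```

After `merge_and_unload()` you can also save the merged model with `model.save_pretrained(...)` and serve it like any plain checkpoint, so inference speed should match the original base model.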
Thanks, solved it.