After fine-tuning Llama 3.1 8B, inference becomes slow

meta-llama / llama-models

Status: Closed (dataai1205 closed this issue 1 week ago)

dataai1205 commented 1 week ago

I fine-tuned Llama 3.1 8B using LoRA, but the model's inference speed has become extremely slow. How can I fix this? What are the possible reasons?

aitechguy0105 commented 1 week ago

Please check the model size, the data type the weights are loaded in, and the model architecture.
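To make the size, data type, and architecture points concrete, here is a rough sketch in plain Python. It is not taken from this thread or from any library: the helper names (`weight_memory_gib`, `merge_lora`) are hypothetical, the 8-billion parameter count is a round-number assumption, and the 2x2 layer is a toy. The first part estimates how much memory the weights alone need per data type (a float32 load needs twice the memory of bfloat16, which may force offloading on a smaller GPU). The second part shows the standard LoRA merge, W' = W + (alpha / r) * B A, which folds the adapter into the base weight so the layer is a single matrix multiply again.

```python
# Bytes per parameter for common weight data types.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "bfloat16": 2, "int8": 1}


def weight_memory_gib(n_params: int, dtype: str) -> float:
    """Approximate memory needed for the weights alone, in GiB."""
    return n_params * BYTES_PER_PARAM[dtype] / 2**30


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(w, lora_b, lora_a, alpha: float, r: int):
    """Fold a LoRA update into the base weight: W' = W + (alpha / r) * B @ A."""
    scale = alpha / r
    delta = matmul(lora_b, lora_a)
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


# Assumption: roughly 8 billion parameters (a round number, not an exact count).
N_PARAMS = 8_000_000_000
for dtype in ("float32", "bfloat16"):
    print(f"{dtype}: {weight_memory_gib(N_PARAMS, dtype):.1f} GiB")

# Toy 2x2 layer with a rank-1 adapter (all values are illustrative).
w = [[1.0, 0.0], [0.0, 1.0]]
lora_b = [[1.0], [2.0]]   # shape (2, r)
lora_a = [[0.5, 0.5]]     # shape (r, 2)
merged = merge_lora(w, lora_b, lora_a, alpha=2.0, r=1)
print(merged)
```

If the weights turn out to be in float32, or the adapter is still applied as separate layers at inference time, those are plausible places to look first. Whether either was the cause here is not stated in the thread.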

dataai1205 commented 1 week ago

Thanks, solved it.