How to batch inference for lora

QwenLM / Qwen2

Qwen2 is the large language model series developed by Qwen team, Alibaba Cloud.

7.43k stars 454 forks source link

How to batch inference for lora #380

Closed may012345 closed 2 months ago

may012345 commented 4 months ago

How to batch inference for lora

jklj077 commented 4 months ago

After loading the model, it should be similar to normal batch inference with transformers. Please also refer to https://github.com/QwenLM/Qwen1.5/issues/282#issuecomment-2051246019.

github-actions[bot] commented 2 months ago

This issue has been automatically marked as inactive due to lack of recent activity. Should you believe it remains unresolved and warrants attention, kindly leave a comment on this thread.