Closed: may012345 closed this issue 2 months ago
After loading the model with the LoRA adapter, batch inference should work much like normal batch inference with transformers. Please also refer to https://github.com/QwenLM/Qwen1.5/issues/282#issuecomment-2051246019.
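For readers who want a starting point, here is a minimal sketch of what that could look like. It is not taken from the linked comment. It assumes the adapter is loaded with peft's `PeftModel.from_pretrained` on top of a transformers causal LM; the base model id and the adapter path are placeholders, and `chunked`, `batch_generate`, and `load_lora_model` are hypothetical helper names introduced only for this example.

```python
from typing import Iterator, List


def chunked(items: List[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items` with at most `batch_size` elements."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def load_lora_model(base_id: str, adapter_path: str):
    """Load a base causal LM and attach a LoRA adapter (both arguments are placeholders)."""
    # Imported lazily so the helpers in this file can be used without these packages.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    # device_map="auto" requires the `accelerate` package to be installed.
    base = AutoModelForCausalLM.from_pretrained(
        base_id, torch_dtype="auto", device_map="auto"
    )
    model = PeftModel.from_pretrained(base, adapter_path).eval()
    return model, tokenizer


def batch_generate(model, tokenizer, prompts: List[str],
                   batch_size: int = 4, max_new_tokens: int = 128) -> List[str]:
    """Generate one completion per prompt, processing the prompts in padded batches."""
    # Decoder-only models should be left-padded so generation continues from real tokens.
    tokenizer.padding_side = "left"
    if tokenizer.pad_token is None:
        tokenizer.pad_token = tokenizer.eos_token

    outputs: List[str] = []
    for batch in chunked(prompts, batch_size):
        inputs = tokenizer(batch, return_tensors="pt", padding=True).to(model.device)
        generated = model.generate(**inputs, max_new_tokens=max_new_tokens)
        # With left padding every row shares the same prompt length, so slice it off.
        new_tokens = generated[:, inputs["input_ids"].shape[1]:]
        outputs.extend(tokenizer.batch_decode(new_tokens, skip_special_tokens=True))
    return outputs


# Example usage (assumed model id and hypothetical adapter directory):
#   model, tokenizer = load_lora_model("Qwen/Qwen1.5-7B-Chat", "path/to/lora-adapter")
#   print(batch_generate(model, tokenizer, ["Hello!", "What is LoRA?"]))
```

The sketch passes raw prompt strings for brevity. For a chat model you would probably want to format each conversation first with `tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)` and pass the resulting strings as `prompts`.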
This issue has been automatically marked as inactive due to lack of recent activity. Should you believe it remains unresolved and warrants attention, kindly leave a comment on this thread.
How to do batch inference with LoRA?