hpcaitech / EnergonAI

Large-scale model inference.
Apache License 2.0

How to use dynamic batch features #199

Open hudengjunai opened 1 year ago

hudengjunai commented 1 year ago

Hello, I have launched OPT-125M inference and sent requests to the server with Locust, but no matter how I configure max_batch_size, the InferenceEngine always runs with batch_size = 1. How can I use the dynamic batch feature in Batch_server_manager?
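For context, a Locust load test along these lines might look like the sketch below. The endpoint path, payload fields, and port are assumptions modeled loosely on EnergonAI's OPT FastAPI example, not details taken from this issue:

```python
# Hypothetical Locust load test; the "/generation" endpoint, the payload
# fields, and the port are assumptions, not confirmed by this issue.
from locust import HttpUser, task, between

class GenerationUser(HttpUser):
    wait_time = between(0.1, 0.5)

    @task
    def generate(self):
        # Keeping max_tokens identical across concurrent requests makes
        # them batchable (see the maintainer's reply below).
        self.client.post("/generation", json={
            "prompt": "Hello, my dog is cute",
            "max_tokens": 64,
        })
```

Run with e.g. `locust -f locustfile.py --host http://127.0.0.1:7070 -u 32 -r 8` (host and port are likewise assumptions).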

ver217 commented 1 year ago

It will form a batch only when a sequence of inputs can be batched, e.g. when they have the same number of generation steps.
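To illustrate the rule described above, here is a minimal sketch (not EnergonAI's actual implementation) of grouping queued requests so that only requests with the same number of generation steps land in the same batch:

```python
# Toy sketch of the batching condition: requests are grouped by their
# requested generation steps, then each group is split into chunks of at
# most max_batch_size. This is an illustration, not EnergonAI's code.
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class Request:
    prompt: str
    max_tokens: int  # requested generation steps

def make_batches(queue: list[Request], max_batch_size: int) -> list[list[Request]]:
    groups: dict[int, list[Request]] = defaultdict(list)
    for req in queue:
        groups[req.max_tokens].append(req)
    batches: list[list[Request]] = []
    for reqs in groups.values():
        # Split each group into chunks no larger than max_batch_size.
        for i in range(0, len(reqs), max_batch_size):
            batches.append(reqs[i:i + max_batch_size])
    return batches
```

Under this rule, if every Locust user requests a different number of generation steps, each request ends up in its own group and the engine effectively runs with batch_size = 1, which matches the behavior reported above.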