hpcaitech/EnergonAI — Large-scale model inference. Apache License 2.0 · 630 stars · 90 forks
[opt] executor update making batch policy #133
Closed (ver217 closed this 2 years ago)

ver217 commented 2 years ago:
Make sure requests are served in FIFO order. Requests whose decode steps are less than or equal to those of the queue head can be grouped into the same batch.
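The policy above can be sketched as follows. This is a minimal illustration, not EnergonAI's actual executor code: the `Request` dataclass and `make_batch` helper are hypothetical names, assuming each request carries a `decode_steps` count. The batch always starts at the queue head and extends only through consecutive requests whose decode steps do not exceed the head's, so FIFO order is never violated.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Request:
    req_id: int
    decode_steps: int  # decode steps this request still needs


def make_batch(queue: deque) -> list:
    """Form a batch from the front of a FIFO queue (hypothetical sketch).

    The head is always included; consecutive followers join only if
    their decode steps are <= the head's, so no request overtakes an
    earlier one and the queue stays strictly FIFO.
    """
    if not queue:
        return []
    head = queue.popleft()
    batch = [head]
    # Stop at the first request that would need more decode steps
    # than the head; it waits for a later batch.
    while queue and queue[0].decode_steps <= head.decode_steps:
        batch.append(queue.popleft())
    return batch
```

For example, with queued decode steps `[5, 3, 4, 7, 2]`, the batch is the first three requests (each needs at most 5 steps); the request needing 7 steps blocks the rest, preserving arrival order.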