hpcaitech / EnergonAI

Large-scale model inference.
Apache License 2.0
630 stars 90 forks

[engine] Async engine and pipeline based on RPC #157

Closed (ver217 closed 2 years ago)

ver217 commented 2 years ago

Implement #151

ver217 commented 2 years ago

Locust test results for OPT-30B on 4x A100:

Old: [image]

New: [image]