bentoml / BentoML

The easiest way to serve AI apps and models - Build reliable Inference APIs, LLM apps, Multi-model chains, RAG service, and much more!
https://bentoml.com
Apache License 2.0

Exceptions with internal BenchmarkClient while load testing AWS EC2 instance #1631

Closed Matthieu-Tinycoaching closed 2 years ago

Matthieu-Tinycoaching commented 3 years ago

Hello,

I am load testing an AWS EC2 instance running a single BentoML service containerized in a Docker image, and I often get disconnections or server errors. Below are the exceptions I got using the internal BenchmarkClient:

╒════════════════════════════════════════════════╤═════════╕
│ exceptions                                     │   count │
╞════════════════════════════════════════════════╪═════════╡
│ ClientOSError(104, 'Connection reset by peer') │     274 │
├────────────────────────────────────────────────┼─────────┤
│ ServerDisconnectedError('Server disconnected') │      12 │
╘════════════════════════════════════════════════╧═════════╛

Does anyone have advice on what these exceptions correspond to?
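
For reference, the two exception names in the table match aiohttp's client exception classes (`ClientOSError`, `ServerDisconnectedError`), and 104 is the Linux errno for `ECONNRESET`, meaning the remote end reset the TCP connection. A minimal sketch for decoding such a code with the Python standard library follows; the helper name `describe_errno` is hypothetical, and the numeric value of `ECONNRESET` differs between platforms:

```python
import errno
import os


def describe_errno(code: int) -> str:
    """Return the symbolic name and OS message for an errno value."""
    # errno.errorcode maps numeric codes to names such as "ECONNRESET".
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{name}: {os.strerror(code)}"


# Use the symbolic constant rather than a hard-coded 104,
# because the number is only guaranteed to be 104 on Linux.
print(describe_errno(errno.ECONNRESET))
```

This only names the error; it does not say whether the reset came from the service itself or from something in front of it.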

Environment:

stale[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

parano commented 2 years ago

Hi @Matthieu-Tinycoaching, our team has decided to deprecate the BenchmarkClient in favor of Locust: https://docs.locust.io/en/stable/api.html

Matthieu-Tinycoaching commented 2 years ago

Hi @parano nice!

How can I effectively load test multiple EC2 instances behind a load balancer, then?

When I tested two EC2 instances behind a load balancer with Locust, I got the same RPS as with a single instance.

parano commented 2 years ago

@Matthieu-Tinycoaching it sounds like there may be an issue with the load balancer setup. Could you confirm both replicas are receiving requests?

Matthieu-Tinycoaching commented 2 years ago

@parano yes, both replicas received requests. I checked this by sending requests through the AWS Network Load Balancer (NLB) and looking at the logs on each instance.
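
The log check described above can be made quantitative. The following is a rough sketch only, assuming the per-instance logs have been merged into lines that each start with an instance ID; that log format, the IDs, and the function name `request_share` are all hypothetical and not taken from BentoML or AWS:

```python
from collections import Counter


def request_share(log_lines):
    """Return the fraction of requests handled by each instance.

    Assumes (hypothetically) that every non-empty line begins with
    an instance ID followed by whitespace.
    """
    counts = Counter(
        line.split(maxsplit=1)[0] for line in log_lines if line.strip()
    )
    total = sum(counts.values())
    return {instance: n / total for instance, n in counts.items()}


# Made-up sample lines, for illustration only.
sample = [
    "i-aaa POST /predict 200",
    "i-bbb POST /predict 200",
    "i-aaa POST /predict 200",
    "i-bbb POST /predict 200",
]
print(request_share(sample))
```

A strongly uneven split would point back at the load balancer setup, while an even split suggests looking elsewhere for the limit on RPS.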