aws-samples / bedrock-access-gateway

OpenAI-Compatible RESTful APIs for Amazon Bedrock
MIT No Attribution
119 stars 22 forks source link

Issue with concurrent requests on AWS Fargate #22

Open eliran89c opened 1 week ago

eliran89c commented 1 week ago

Describe the Bug I am encountering an issue where concurrent requests are being processed sequentially rather than simultaneously when deployed on AWS Fargate. I suspect the problem is that boto3 runs synchronously, and its calls are blocking.

API Details

To Reproduce Steps to reproduce the behavior:

  1. Deploy the service on AWS Fargate following the standard setup procedures.
  2. Send multiple concurrent requests (e.g., 10 concurrent requests) to the API.
  3. Observe that the requests are processed sequentially instead of concurrently.

Expected Behavior I expected that when sending multiple concurrent requests to the API, all requests would be handled simultaneously or at least as many as the server can handle

daixba commented 1 week ago

Concurrency and asynchronous call is natively supportted by FastAPI, I did a quick test with 2 concurrency requests (with long response) and I can see both are streaming in parallel, I didn't test via code though.

You can probably try below:

  1. Try fewer requests (like 2 requests) first and see if the issue still exists.
  2. Try to test in local (The code can run locally)
  3. Try to increase the capacity of Fargate (By default, it has only 1 core, I would expect it may not support larger concurrent requests) and retest
eliran89c commented 1 week ago

Hi @daixba, I forgot to mention that I'm not streaming the response With streaming, it works better, but it is still not perfect (I monitor the health-check endpoint, and it times out from time to time)

But without streaming, the API is waiting for each request to finish before being able to handle other requests

Concurrency and asynchronous call is natively supported by FastAPI

I agree; This is why I think the problem with boto3

eliran89c commented 1 week ago

@daixba when I run boto3 with asyncio it's working as expected https://github.com/aws-samples/bedrock-access-gateway/pull/23