aws / containers-roadmap

This is the public roadmap for AWS container services (ECS, ECR, Fargate, and EKS).
https://aws.amazon.com/about-aws/whats-new/containers/

[Fargate] [request]: compute optimized options #1030

Open lifeofguenter opened 4 years ago

lifeofguenter commented 4 years ago

In conjunction with https://github.com/aws/containers-roadmap/issues/715 it would be great to be able to explicitly opt in to compute-optimized underlying Fargate hosts.

This could also happen implicitly when a certain CPU/memory combination is selected. Currently, certain combinations perform roughly 3-5x worse than others. For certain workloads we would like more predictable performance when scaling vertically.

See also: https://github.com/aws/containers-roadmap/issues/164#issuecomment-562456500

ravishtiwari commented 2 years ago

It would be really nice to have this feature where users are able to choose or customize different options. It would make it more suitable for additional workloads that can't consistently run on Fargate atm.

:+1:

zachcasper commented 2 years ago

Hey folks, Zach here from the Fargate PM team. I'm working on Fargate's direction around performance. I'm very interested in knowing a few more details here. A few questions:

  1. Is this feature more about processing consistency at the same performance level, or do you have workloads which are CPU bound and could benefit from higher-performing processors?
  2. What are some use cases and workloads which could benefit from this?
  3. If you are CPU bound, is it I/O, integer, or floating point performance?

Thanks!

lifeofguenter commented 2 years ago

Is this feature more about processing consistency at the same performance level, or do you have workloads which are CPU bound and could benefit from higher-performing processors?

Both. Back then, as described in the initial post, we were seeing high fluctuation in performance depending on when we launched a Fargate task and on the amount of RAM allocated.

With high-RPS services, canary deployments, or multiple deployments per day, you want a very predictable outcome in your stats when deploying. If the numbers (response times, number of req/s that a task can handle) fluctuate heavily, developers get a misleading signal about whether a deployment is botched or not.
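A rough way to check whether such fluctuation lines up with different underlying hosts is to log the CPU model each task sees at startup. The sketch below assumes a Linux container where `/proc/cpuinfo` is readable; the helper names `parse_cpu_model` and `read_cpu_model` are made up for illustration. On some architectures the `model name` field is absent, in which case the function returns `None`.

```python
from typing import Optional


def parse_cpu_model(cpuinfo_text: str) -> Optional[str]:
    """Return the first 'model name' value from /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "model name":
            return value.strip()
    return None  # field not present (e.g. on some non-x86 kernels)


def read_cpu_model(path: str = "/proc/cpuinfo") -> Optional[str]:
    """Read the CPU model on Linux; returns None if the file is unavailable."""
    try:
        with open(path, encoding="utf-8") as f:
            return parse_cpu_model(f.read())
    except OSError:
        return None


if __name__ == "__main__":
    # Log this once per task and compare it against per-task latency stats.
    print(read_cpu_model() or "unknown")
```

Comparing this value across tasks that perform differently would show whether the variance correlates with the host CPU generation or comes from somewhere else.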

What are some use cases and workloads which could benefit from this?

Most important is consistency, and also performance that is on par with current-gen CPUs. In a few cases we did processing (encryption, zip, image processing, etc.) where higher single-thread performance (solved with current-gen CPUs) results in lower processing time and therefore lower response times. These are usually tasks that by design cannot scale out, hence single-thread performance is important.

If you are CPU bound, is it I/O, integer, or floating point performance?

No idea, but definitely not an I/O issue.
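For anyone trying to narrow down the integer vs. floating-point question, a crude single-thread timing script run on several tasks can give a first hint. This is only a sketch: the function names and loop bodies are arbitrary choices, and in Python the interpreter overhead dominates, so the numbers are useful for comparing tasks against each other rather than as absolute measurements.

```python
import time
from typing import Callable


def time_integer_work(iterations: int = 2_000_000) -> float:
    """Seconds spent in an integer-only loop (multiply, add, mask)."""
    acc = 1
    start = time.perf_counter()
    for i in range(iterations):
        acc = (acc * 1_103_515_245 + i) & 0xFFFFFFFF
    return time.perf_counter() - start


def time_float_work(iterations: int = 2_000_000) -> float:
    """Seconds spent in a floating-point-only loop (multiply, add)."""
    acc = 1.0
    start = time.perf_counter()
    for _ in range(iterations):
        acc = acc * 1.0000001 + 0.5
    return time.perf_counter() - start


def best_of(fn: Callable[[], float], runs: int = 5) -> float:
    """Take the fastest of several runs to reduce scheduling noise."""
    return min(fn() for _ in range(runs))


if __name__ == "__main__":
    print(f"integer: {best_of(time_integer_work):.3f}s")
    print(f"float:   {best_of(time_float_work):.3f}s")
```

If the same script shows a large spread between tasks with the same CPU/memory settings, that points at host variance rather than at the application.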