aws / containers-roadmap

This is the public roadmap for AWS container services (ECS, ECR, Fargate, and EKS).
https://aws.amazon.com/about-aws/whats-new/containers/

[Fargate] [request]: Provide the ability to use ebpf on fargate instances. #1027

Open KnoxAnderson opened 4 years ago

KnoxAnderson commented 4 years ago

Tell us about your request

Provide the ability to leverage eBPF for security and monitoring use cases on Fargate.

Which service(s) is this request for?

ECS or EKS running on Fargate.

Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?

An eBPF program is "attached" to a designated code path in the kernel. When the code path is traversed, any attached eBPF programs are executed. Given its origin, eBPF is especially suited to writing network programs and it's possible to write programs that attach to a network socket to filter traffic, to classify traffic, and to run network classifier actions.

We'd want to attach eBPF programs to the following static tracepoints:

This allows the collection of -

Are you currently working around this issue?

We are currently working around this issue by using ptrace, which was exposed in Fargate 1.4, but eBPF would be a more stable, cross-platform approach.

thomasdullien commented 4 years ago

On our end: we are using eBPF both to collect performance data and for general metrics, observability, and debugging mystery issues that arise. Proper support for eBPF in Fargate would be great!

bhiggins commented 4 years ago

We'd really like to see uprobe attachment supported as well. We instrument processes using eBPF attached to uprobes and we'd love to see this work with Fargate!

brendangregg commented 3 years ago

I've been asked a number of times to provide tests to show that eBPF observability works in a given environment, including for containers. Here is my smallest set of tests (each testing something different):

bpftrace -e 'BEGIN { printf("hello world\n"); }'
bpftrace -e 'tracepoint:syscalls:sys_enter_openat { printf("%s %s\n", comm, str(args->filename)); }'
bpftrace -e 'kprobe:vfs_read { @start[tid] = nsecs; @bsize[arg2]++; } kretprobe:vfs_read /@start[tid]/ { @ns[comm] = hist(nsecs - @start[tid]); delete(@start[tid]); }'
bpftrace -e 'ur:/bin/bash:readline { printf("got: %s\n", str(retval)); }'
bpftrace -e 'profile:hz:9 { @[kstack, ustack] = count(); }'

For more tests, I'd try the tools in https://github.com/iovisor/bcc and https://github.com/iovisor/bpftrace. Note that the bcc tools are evolving into the versions in the libbpf-tools directory, which produce C binaries with no dependencies (no LLVM) as they contain embedded BPF bytecode. Also note that I'm discussing observability only here.

Making everything work in containers securely (and, once the bpf and perf_event_open syscalls are allowed, avoiding memory reads of other container info) will be quite some work.
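Editorial sketch: the five one-liners above can be wrapped in a small runner for checking a given environment. This is a minimal sketch, not part of the original comment. The names `SMOKE_TESTS`, `run_check`, and `run_all` are hypothetical, and the pass criterion is an assumption: a bpftrace program that is still running when the timeout fires is taken to have attached successfully, while an early non-zero exit is treated as a failure.

```python
import subprocess

# The five one-liners quoted above. Raw strings keep "\n" literal so that
# bpftrace, not Python, interprets the escape.
SMOKE_TESTS = {
    "BEGIN probe": r'BEGIN { printf("hello world\n"); }',
    "tracepoint": r'tracepoint:syscalls:sys_enter_openat { printf("%s %s\n", comm, str(args->filename)); }',
    "kprobe/kretprobe": r'kprobe:vfs_read { @start[tid] = nsecs; @bsize[arg2]++; } kretprobe:vfs_read /@start[tid]/ { @ns[comm] = hist(nsecs - @start[tid]); delete(@start[tid]); }',
    "uretprobe": r'ur:/bin/bash:readline { printf("got: %s\n", str(retval)); }',
    "profile": r'profile:hz:9 { @[kstack, ustack] = count(); }',
}


def run_check(argv, timeout=5.0):
    """Run one command and classify the outcome (assumed criterion, see lead-in)."""
    try:
        proc = subprocess.run(
            argv,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except FileNotFoundError:
        return "missing"  # the binary is not installed
    except subprocess.TimeoutExpired:
        return "pass"  # still running at the deadline: assumed to have attached
    return "pass" if proc.returncode == 0 else "fail"


def run_all(timeout=5.0):
    """Run every smoke test through bpftrace and map test name to outcome."""
    return {
        name: run_check(["bpftrace", "-e", prog], timeout)
        for name, prog in SMOKE_TESTS.items()
    }
```

Calling `run_all()` as root inside the container returns a dict of outcomes per probe type. A "pass" here only indicates that the program loaded and attached; it does not confirm that events were delivered, which for the uretprobe test would also require someone to be typing into a bash session.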

soo-o commented 1 year ago

Hi folks, please pardon the extended delay in response for this issue. This is something we're actively working on (we've updated the status to "Coming Soon" rather than "Proposed"). In an incoming release, Fargate will make it possible to add specific Linux capabilities to the task containers, including CAP_BPF and CAP_PERFMON. Fargate will exclusively support BPF CO-RE applications.
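Editorial sketch: once capabilities can be added to a task, a process can confirm what it actually received by decoding the `CapEff` mask in `/proc/self/status`. The bit numbers below come from the Linux capability list (CAP_SYS_ADMIN is 21, CAP_PERFMON is 38, CAP_BPF is 39; the latter two exist from Linux 5.8). The helper names `decode_caps` and `effective_caps` are hypothetical, and nothing here reflects how Fargate itself will expose the feature.

```python
# Capability bit positions from the Linux capability list.
CAP_BITS = {"CAP_SYS_ADMIN": 21, "CAP_PERFMON": 38, "CAP_BPF": 39}


def decode_caps(cap_eff_hex):
    """Decode a CapEff hex mask into {capability name: present?}."""
    mask = int(cap_eff_hex, 16)
    return {name: bool((mask >> bit) & 1) for name, bit in CAP_BITS.items()}


def effective_caps(status_path="/proc/self/status"):
    """Read the effective capability mask of the current process."""
    with open(status_path) as f:
        for line in f:
            if line.startswith("CapEff:"):
                return decode_caps(line.split()[1])
    raise RuntimeError("CapEff line not found in " + status_path)
```

For example, `decode_caps("000000c000000000")` reports CAP_BPF and CAP_PERFMON as present and CAP_SYS_ADMIN as absent, which is the combination the comment above describes.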

inge4pres commented 1 year ago

In an incoming release, Fargate will make it possible to add specific Linux capabilities to the task containers, including CAP_BPF and CAP_PERFMON.

Thanks for the update @soo-o 🙏🏼 Do you already know the full list of capabilities that will be allowed?

I believe that since https://github.com/aws/containers-roadmap/issues/1000 stays open, and given the security considerations on Fargate, CAP_SYS_ADMIN is off the table?

NyanHelsing commented 1 year ago

I'd be super stoked to see this functionality landed

fntlnz commented 1 year ago

@soo-o just saw that this is coming - can't wait! If y'all need someone to test it at some point I'm down for it.

lmb commented 1 year ago

Fargate will exclusively support BPF CO-RE applications.

@soo-o could you explain what this means? CO-RE can be done in user space (libbpf, cilium/ebpf) and in kernel space. Does this imply that CO-RE will have to be done in the kernel?
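Editorial sketch: independent of how the question above is answered, user-space CO-RE loaders such as libbpf and cilium/ebpf normally read the running kernel's type information from `/sys/kernel/btf/vmlinux`, so checking for that file is a quick first test of whether CO-RE programs can be relocated in a given environment. The function name `has_kernel_btf` is hypothetical; the check only looks for the BTF magic number (0xeB9F) at the start of the file and does not validate the rest of the data.

```python
BTF_MAGIC = 0xEB9F  # magic value at the start of the BTF header


def has_kernel_btf(path="/sys/kernel/btf/vmlinux"):
    """Return True if `path` exists and begins with the BTF magic number."""
    try:
        with open(path, "rb") as f:
            head = f.read(2)
    except OSError:
        return False  # missing file, or not readable from this container
    if len(head) < 2:
        return False
    # Accept either byte order, since BTF is written in the kernel's endianness.
    return BTF_MAGIC in (
        int.from_bytes(head, "little"),
        int.from_bytes(head, "big"),
    )
```

A `False` result means a CO-RE application would need BTF supplied some other way, which is a separate question from whether the `bpf` syscall is permitted at all.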

jgoeres commented 1 year ago

Is there an ETA for this feature? We are highly interested in it since we plan to use Cilium to support network policies and want to run some of our workloads on Fargate.

rajeshkundwani commented 1 year ago

Is there an ETA for enabling eBPF on Fargate? This is currently blocking us from moving to Graviton for the task integrated with our security tool, as the only way to read kernel-level events on Graviton is eBPF (please correct me if this understanding is wrong).

reskin89 commented 5 months ago

Is there any update to this?

imreACTmd commented 3 months ago

Checking in on the progress of this

marcofranssen commented 3 months ago

I see it has been 4 years since this issue was opened. With Cilium being the de facto CNI these days, this is quite essential for customers who want to use Cilium in EKS clusters that also use Fargate. @AWS, any status update on this?

sjoukedv commented 3 months ago

I would say just forget about this being implemented and switch to EC2, so you can use Cilium (have it replace kube-proxy and the aws-vpc-cni). EC2 has more advantages like EFS CSI driver dynamic provisioning, faster container startup times (making HPA actually useful), ability to use DaemonSets, etc.

NyanHelsing commented 2 months ago

Maybe only tangentially related, but if Windows supported something like eBPF, maybe this CrowdStrike thing wouldn't have happened. Just a thought: maybe it's worth prioritizing wider eBPF support, including in Fargate.

t0mbombadil commented 2 months ago

The lack of this feature is actually blocking my company's plans to use the Elastic Agent integration (https://www.elastic.co/docs/current/en/integrations/cloud_defend), and probably the use of other security tools for inspecting events on Kubernetes containers.

Together with the lack of working AWS GuardDuty Runtime Monitoring on Fargate-backed EKS, for our use case (and probably many others) this leaves Fargate EKS clusters without any security-oriented runtime monitoring.