kubearmor / KubeArmor

Runtime Security Enforcement System. Workload hardening/sandboxing and implementing least-permissive policies made easy leveraging LSMs (BPF-LSM, AppArmor).
https://kubearmor.io/
Apache License 2.0

DNS Visibility with KubeArmor #1219

Open nyrahul opened 1 year ago

nyrahul commented 1 year ago

Visibility into the domains accessed from pods is important. Attackers use techniques such as DGAs (Domain Generation Algorithms) to disguise their connections to remote C&C (command and control) servers. Getting visibility into which domains are accessed, and then applying network rules that allow only specific domains, could help SecOps teams contain such attacks.
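As a rough illustration of the "allow specific domains only" idea, here is a minimal Python sketch of an allow-list check on looked-up names. The function name `is_allowed` and the example domains are assumptions made for illustration; this is not KubeArmor's policy syntax or implementation.

```python
def is_allowed(domain: str, allowed_suffixes: set) -> bool:
    """Return True if domain equals, or is a subdomain of, an allow-listed name."""
    name = domain.rstrip(".").lower()
    labels = name.split(".")
    # Compare whole labels only: "a.b.c", then "b.c", then "c".
    # This keeps "notkubearmor.io" from matching "kubearmor.io".
    return any(".".join(labels[i:]) in allowed_suffixes for i in range(len(labels)))


# Hypothetical allow-list and lookups, for illustration only.
allowed = {"kubearmor.io", "svc.cluster.local"}
for looked_up in (
    "docs.kubearmor.io",
    "kube-dns.kube-system.svc.cluster.local",
    "xkqzjwpv.example",  # stand-in for a DGA-style name
):
    print(looked_up, is_allowed(looked_up, allowed))
```

Matching on label boundaries rather than raw string suffixes avoids accidentally allowing look-alike names.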

Ref

Requirements:

Tasks:

Deliverables:

Swapnil-2502 commented 1 year ago

Does an applicant need prior contributions to KubeArmor to apply for the LFX Mentorship? I ask because I will be investing a lot of time in reading documentation to understand the issue.

nyrahul commented 1 year ago

> Does an applicant need prior contributions to KubeArmor to apply for the LFX Mentorship? I ask because I will be investing a lot of time in reading documentation to understand the issue.

It is not a strict requirement, but it certainly helps.

mahesh-hegde commented 1 year ago

If I am not wrong, the most practical way to do this is to analyze network packets, right?

Apart from plaintext DNS, what about DNS over HTTPS? From the top Google search result:

> With DoH, both the DNS queries and DNS responses are transmitted over HTTPS and use port 443, making the traffic virtually indistinguishable from any other HTTPS web traffic.

nyrahul commented 1 year ago

> If I am not wrong, the most practical way to do this is to analyze network packets, right?
>
> Apart from plaintext DNS, what about DNS over HTTPS? From the top Google search result:
>
> With DoH, both the DNS queries and DNS responses are transmitted over HTTPS and use port 443, making the traffic virtually indistinguishable from any other HTTPS web traffic.

DoH, DoT, and other encrypted DNS transports are out of scope. Those techniques are usually used for north-south traffic. The aim of this project is DNS visibility for k8s pods, whose DNS traffic is mostly unencrypted DNS over UDP. Please let me know if you think DoH is used in k8s deployments.

nikhilchauhangithub commented 1 year ago

We could use a sidecar container that intercepts DNS traffic, or a network tap that captures all DNS traffic.

nyrahul commented 1 year ago

> We could use a sidecar container that intercepts DNS traffic, or a network tap that captures all DNS traffic.

A sidecar would be overkill in this case, especially since we only need to capture and decode the DNS flow, not enforce on (redirect/drop) the traffic.

The intention is to use eBPF natively to decode, in-kernel, the UDP packets sent to a specific destination port (port 53) and present the looked-up domains as telemetry events.
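To make the decoding step concrete, here is a minimal userspace sketch in Python of extracting the looked-up domain from a raw, unencrypted DNS query payload, following the RFC 1035 wire format. It only illustrates the parsing logic; it is not eBPF code and not KubeArmor's implementation, and the function name `parse_question` and the sample packet are assumptions.

```python
import struct

DNS_HEADER_LEN = 12  # fixed-size DNS header (RFC 1035, section 4.1.1)


def parse_question(payload: bytes) -> tuple:
    """Return (domain, qtype) for the first question in a raw DNS message."""
    if len(payload) < DNS_HEADER_LEN:
        raise ValueError("payload shorter than a DNS header")
    # QDCOUNT is the 16-bit field at byte offset 4 of the header.
    (qdcount,) = struct.unpack_from("!H", payload, 4)
    if qdcount == 0:
        raise ValueError("message carries no question")

    labels = []
    offset = DNS_HEADER_LEN
    while True:
        if offset >= len(payload):
            raise ValueError("truncated name")
        length = payload[offset]
        offset += 1
        if length == 0:  # a zero-length label terminates the name
            break
        if length & 0xC0:  # top bits set: compression pointer or reserved form
            raise ValueError("unexpected compression in question name")
        if offset + length > len(payload):
            raise ValueError("truncated label")
        labels.append(payload[offset:offset + length].decode("ascii", "replace"))
        offset += length

    if offset + 4 > len(payload):
        raise ValueError("truncated question")
    qtype, _qclass = struct.unpack_from("!HH", payload, offset)
    return ".".join(labels), qtype


# Hand-built query for "kubearmor.io", type A (1), class IN (1).
query = (
    struct.pack("!HHHHHH", 0x1234, 0x0100, 1, 0, 0, 0)
    + b"\x09kubearmor\x02io\x00"
    + struct.pack("!HH", 1, 1)
)
print(parse_question(query))
```

An in-kernel version would need bounded loops and explicit bounds checks to satisfy the eBPF verifier, which is why the sketch validates every offset before reading.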

nyrahul commented 1 year ago

> How will the collected DNS lookup data be handled to ensure privacy and security?

We do not have "user information" at the level where KubeArmor will be decoding the DNS traffic. The DNS traffic originated by the workloads would be intercepted. It won't be possible to couple user information with the DNS traffic information. Regarding security, the bulk of DNS traffic flows unencrypted today. Encrypted transports such as DoH and DoT address this problem (DNSSEC authenticates responses but does not encrypt them). However, these technologies haven't found their way into k8s environments as of now (or at least I am not aware of it).

> Will capturing DNS visibility impact the resource consumption of KubeArmor?

Yes, it will. Visibility always comes with a performance impact. The question is how to keep that impact limited. Fortunately, KubeArmor is in a good position to keep the impact low by doing most of the processing in-kernel, and this could be our differentiator.

h4shk4t commented 1 year ago

Hello @nyrahul,

My name is Ashutosh Srivastava and I am an undergraduate from IIT Roorkee. I am very interested in this project. I have a background in Information Security and DevOps. As a part of InfoSecIITR (ranked 4th in India), I have participated in and won many CTFs and security competitions such as CSAW CTF, CSAW ESC, ImaginaryCTF, etc. I specialise in Cloud Security and adversarial attacks against machine learning. I have previously worked on creating botnets (for educational purposes) with C&C servers as well. As a developer, I also have experience creating applications on Kubernetes and Go (working on an automatic A&D CTF platform, similar to CTFd but for A&D CTFs), where my main work was creating an abstraction of pods to prevent security breaches resulting from container jailbreaks.

As far as I understand the given problem, we need to implement eBPF-based DNS tracing on the UDP port to capture all the unencrypted (encrypted as well?) packets (DNS queries and responses) and emit DNS queries via telemetry events. We would also have to handle all these telemetry events with the help of KubeArmor and flag all suspicious queries. And we have to do all this while having a minimal impact on performance.

nyrahul commented 1 year ago

> As far as I understand the given problem, we need to implement eBPF-based DNS tracing on the UDP port to capture all the unencrypted (encrypted as well?) packets (DNS queries and responses) and emit DNS queries via telemetry events.

No encrypted traffic will be handled; otherwise, the stated goal is correct. Parse DNS queries/responses and emit them as telemetry events.
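As a sketch of what "parse and emit" could look like, the Python snippet below reads the fixed DNS header, tells queries from responses via the QR bit, and serialises a small JSON record. The event field names and the `pod` parameter are hypothetical and do not reflect KubeArmor's actual telemetry schema.

```python
import json
import struct


def dns_event(payload: bytes, pod: str) -> str:
    """Summarise a DNS header as a JSON telemetry record (hypothetical schema)."""
    if len(payload) < 12:
        raise ValueError("payload shorter than a DNS header")
    # First four 16-bit header fields: ID, flags, QDCOUNT, ANCOUNT.
    txid, flags, qdcount, ancount = struct.unpack_from("!HHHH", payload, 0)
    event = {
        "pod": pod,
        "id": txid,
        "kind": "response" if flags & 0x8000 else "query",  # QR bit
        "rcode": flags & 0x000F,  # 0 = NOERROR, 3 = NXDOMAIN
        "questions": qdcount,
        "answers": ancount,
    }
    return json.dumps(event, sort_keys=True)


# Header of an NXDOMAIN response: QR=1, RD=1, RA=1, RCODE=3.
response_header = struct.pack("!HHHHHH", 0x1234, 0x8183, 1, 0, 0, 0)
print(dns_event(response_header, "default/nginx"))
```

Pairing a query and its response by the transaction ID would let a consumer report lookups that failed, such as bursts of NXDOMAIN answers.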

> We would also have to handle all these telemetry events with the help of KubeArmor and flag all suspicious queries. And we have to do all this while having a minimal impact on performance.

Flagging suspicious events is not part of this task. Generally, users achieve flagging by redirecting KubeArmor telemetry events to a SIEM or other threat intelligence platform. Yep, performance is a major factor. The intention is to achieve the best performance through in-kernel processing with eBPF.

vishalrajofficial commented 1 year ago

I am Vishal Raj, a final-year CSE student. I'm excited to contribute to the KubeArmor project by implementing DNS visibility for egress traffic from pods. Let's work together to enhance KubeArmor and provide valuable DNS visibility to SecOps teams, helping them detect and contain potential attacks effectively!

clemenkok commented 1 year ago

Clemen here - 3rd year MEng Electronic and Information Engineering at Imperial College. This whole aspect of performance optimisation when doing egress DNS query monitoring with eBPF is extremely interesting. Just re-uploaded my cover letter with more detail on my technical considerations and approach. Would love to work on this as part of the LFX mentorship or on other areas of the project involving eBPF. I've a background in DevOps, SysAdmin and Cybersecurity. Happy to chat more if you'd like.

jmcgrath207 commented 1 year ago

I am interested in applying for this project since I want to grow my skill set as an open-source contributor by working on larger projects and with eBPF.

Currently, I have an open-source project: a label-based DNS operator that lets you rewrite DNS queries at the pod level without node configuration. So I do have some familiarity with this domain.

https://github.com/jmcgrath207/par

As for metrics, I've also worked on a Prometheus exporter for ephemeral storage metrics. https://github.com/jmcgrath207/k8s-ephemeral-storage-metrics

harisudarsan1 commented 8 months ago

@nyrahul Is it okay to work on this issue? I have almost finished it. I'm able to get the DNS socket fd; only retrieving the query and emitting it as a telemetry event remains.

nyrahul commented 8 months ago

Absolutely, please raise a draft PR; that would help us understand the approach taken and ensure early feedback. Thank you for your interest.

harisudarsan1 commented 8 months ago

@nyrahul I created a draft PR. I am able to retrieve the iov_base char array, but I am not able to get the query from it, and I need some help with that. I have implemented DNS visibility using kprobes by modifying the code in system_monitor.c. Is that okay, or should I have a separate BPF program specifically for DNS visibility?

DelusionalOptimist commented 6 months ago

Future: