gw-cs-sd / sd-2017-team-ddos

sd-2017-team-ddos created by GitHub Classroom
Week 30: Finishing the naive Load Balancer #5

Open mrdude opened 7 years ago

mrdude commented 7 years ago

@twood02

What have I been doing?

I've mostly been working on fixing bugs in my naive load balancer implementation. This "load balancer" just forwards all connections to the first backend. As of commit 0d7865, this is mostly done. Athena will happily accept and forward Vegeta's connections for a while. After about 20k packets, Vegeta starts reporting that its connections are being rejected. I dunno what is going on yet; I'm still scanning the pcaps in Wireshark.

Milestones

What am I doing this week?

Once the naive LB implementation works, I'm going to move on to implementing pluggable load balancers. By the end of this week, I want to have implementations for Naive, Round Robin, and Least Connections done.
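A minimal sketch of what "pluggable" could look like, written in Python purely for illustration. The class and method names (`Balancer`, `pick`, `release`) are hypothetical and are not Athena's actual interface; the only thing taken from the plan above is the three strategies.

```python
from abc import ABC, abstractmethod
from itertools import cycle


class Balancer(ABC):
    """Hypothetical interface: chooses a backend for each new connection."""

    def __init__(self, backends):
        if not backends:
            raise ValueError("need at least one backend")
        self.backends = list(backends)

    @abstractmethod
    def pick(self):
        """Return the backend that should receive the next connection."""

    def release(self, backend):
        """Called when a connection to `backend` closes. No-op by default."""


class Naive(Balancer):
    """Always forwards to the first backend."""

    def pick(self):
        return self.backends[0]


class RoundRobin(Balancer):
    """Hands out backends in order, wrapping around at the end."""

    def __init__(self, backends):
        super().__init__(backends)
        self._next = cycle(self.backends)

    def pick(self):
        return next(self._next)


class LeastConnections(Balancer):
    """Picks the backend with the fewest open connections (first wins ties)."""

    def __init__(self, backends):
        super().__init__(backends)
        self.active = {b: 0 for b in self.backends}

    def pick(self):
        backend = min(self.backends, key=lambda b: self.active[b])
        self.active[backend] += 1
        return backend

    def release(self, backend):
        # Never let the count go negative if a close is seen twice.
        if self.active[backend] > 0:
            self.active[backend] -= 1
```

With this shape, the forwarding path only ever calls `pick()` on a new connection and `release()` on a close, so swapping algorithms is a one-line change. Note that Least Connections is only as accurate as the close detection feeding `release()`.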

Once I have all of these algorithms implemented, I can compare their performance using Vegeta's reported stats. This will be good to have in my presentation; I can create a graph comparing 95th-percentile latency for each algorithm.
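For reference, a rough sketch of what the 95th-percentile number means, using the nearest-rank definition. The real numbers would come from Vegeta's stats; the helper names here (`percentile`, `compare_p95`) are hypothetical, and other tools may interpolate between samples rather than use nearest rank.

```python
def percentile(samples, p):
    """Nearest-rank percentile for an integer p in 1..100.

    Returns the smallest sample such that at least p% of all samples
    are less than or equal to it.
    """
    if not samples:
        raise ValueError("no samples")
    if not 1 <= p <= 100:
        raise ValueError("p must be an integer in 1..100")
    ordered = sorted(samples)
    rank = (p * len(ordered) + 99) // 100  # ceil(p * n / 100) in integer math
    return ordered[rank - 1]


def compare_p95(latencies_by_algorithm):
    """Map each algorithm name to the p95 of its latency samples."""
    return {name: percentile(samples, 95)
            for name, samples in latencies_by_algorithm.items()}
```

The dictionary returned by `compare_p95` has one entry per algorithm, which is the shape a bar chart for the presentation would need.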

Other Misc TODO items

Potential Roadblocks

In the current implementation, Athena assumes that it has the same IP as its load balancer backends. Because of this, Athena doesn't have to know ARP; Athena just blindly forwards any non-TCP packets it gets, and the networking stack in the backend's kernel handles everything else.

Ideally, one would be able to run Athena on a server as a reverse proxy for a cluster of web servers. This would require Athena to: 1) respond to ARP requests for its IP, and 2) read incoming ARP packets so that Athena can patch the Ethernet headers (as well as the TCP and IP headers) while routing.

I'd really rather not write code to understand ARP right now; considering how long it took me to iron out the bugs in TCP replay, I don't have time to get Athena to understand another protocol (even one as relatively straightforward as ARP). For the time being, Athena is going to assume that it shares an IP with its backends. If I still have time after implementing the rest of my intended milestone features, I'll add ARP support to Athena.
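If ARP support does get added later, the "respond to ARP requests for its IP" half might look roughly like the sketch below. It is in Python for illustration only and handles just the 28-byte ARP payload for Ethernet/IPv4; the function name `make_arp_reply` and the MAC/IP values in the usage note are assumptions, and the Ethernet framing around the payload is left out.

```python
import socket
import struct

# ARP payload for Ethernet/IPv4:
# htype, ptype, hlen, plen, oper, sender MAC, sender IP, target MAC, target IP
ARP_FORMAT = "!HHBBH6s4s6s4s"
ARP_LEN = struct.calcsize(ARP_FORMAT)  # 28 bytes
ARP_REQUEST, ARP_REPLY = 1, 2


def make_arp_reply(request, my_mac, my_ip):
    """Build an ARP reply payload for `request`.

    Returns None unless `request` is an Ethernet/IPv4 ARP request asking
    for `my_ip`. `my_mac` is 6 raw bytes; `my_ip` is a dotted-quad string.
    """
    if len(request) < ARP_LEN:
        return None
    htype, ptype, hlen, plen, oper, sha, spa, tha, tpa = struct.unpack(
        ARP_FORMAT, request[:ARP_LEN])
    if (htype, ptype, hlen, plen, oper) != (1, 0x0800, 6, 4, ARP_REQUEST):
        return None
    if tpa != socket.inet_aton(my_ip):
        return None  # someone else's address; stay quiet
    # Reply: we become the sender, the original sender becomes the target.
    return struct.pack(ARP_FORMAT, 1, 0x0800, 6, 4, ARP_REPLY,
                       my_mac, tpa, sha, spa)
```

For example, given a request from 11.0.0.29 asking for a made-up Athena address of 11.0.0.28, the reply carries Athena's MAC as the sender and 11.0.0.29's MAC and IP as the target. The second requirement, learning backend MACs so Ethernet headers can be patched, would reuse the same parsing to record each `(spa, sha)` pair.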

mrdude commented 7 years ago

Here are the pcaps I've been looking at: nn27, nn29. Nimbnode27 has an IP of 11.0.0.27 and hosts the backend webservers at 11.0.0.27:81, :85, :90, :95, and :100. Nimbnode29 has an IP of 11.0.0.29 and sends the client requests. Athena runs on Nimbnode28 and routes all packets that pass between nn27 and nn29.

I've been looking at them in Wireshark. Everything seems to be working nicely until this point:

[Image: Wireshark screencap]

Wireshark notes that TCP port numbers are beginning to be reused; I suspect that this is causing Athena to get confused about connection states.

twood02 commented 7 years ago

Yes, I was going to ask about this -- at some point ports will be reused, so if you aren't clearing out old connections, that is likely to be an issue. Detecting the close of a connection can be a bit tricky (ordering of FIN/RST isn't always consistent). For your purposes it may be fine to just detect when a port is being reused (new SYN) and recognize that this means you need to reset your state machine for that connection.
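A rough sketch of that idea, in Python for illustration only: track state per client endpoint, and treat any bare SYN as the start of a fresh connection regardless of what state the old one was left in. The `ConnTable` class and its three states are hypothetical and much simpler than a real TCP state machine; the flag bit values are the standard TCP header ones.

```python
from enum import Enum

# TCP header flag bits
FIN, SYN, RST, ACK = 0x01, 0x02, 0x04, 0x10


class State(Enum):
    SYN_SEEN = 1
    ESTABLISHED = 2
    CLOSING = 3


class ConnTable:
    """Hypothetical per-connection state, keyed by client (ip, port)."""

    def __init__(self):
        self.conns = {}

    def on_client_packet(self, client_ip, client_port, flags):
        """Update and return the state for this endpoint (None if untracked)."""
        key = (client_ip, client_port)
        if flags & SYN and not flags & ACK:
            # New SYN: either a brand-new connection or a reused port.
            # Either way, discard whatever state was there and start over.
            self.conns[key] = State.SYN_SEEN
        elif key not in self.conns:
            return None
        elif flags & RST:
            del self.conns[key]
            return None
        elif flags & FIN:
            self.conns[key] = State.CLOSING
        elif flags & ACK and self.conns[key] is State.SYN_SEEN:
            self.conns[key] = State.ESTABLISHED
        return self.conns[key]
```

The point of the first branch is that it does not depend on having seen a clean FIN/RST sequence: a connection stuck in `CLOSING` (or any other state) is simply overwritten when the client reuses the port.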