System Design Interviews

Mirodil commented 3 years ago

Create video lessons

System Design Basics
Load Balancing
- Introduction
- Where can a load balancer be placed in a system?
- Types of Load balancers
- Load Balancing Strategies and Load Types
- Hardware load balancers
- Software load balancers
- Algorithms/Traffic Routing Approaches Leveraged by Load Balancers
  - Round Robin and Weighted Round Robin
  - Least connections
  - Random
  - Hash
Caching
- Introduction
- Caching
- Types of Cache
- Updating data in cache
- Cache Eviction
Key Characteristics of Distributed Systems
Load Balancing
Caching
Data Partitioning
Indexes
Proxies
Redundancy and Replication
SQL vs. NoSQL
CAP Theorem
PACELC Theorem (New)
Consistent Hashing (New)
Long-Polling vs WebSockets vs Server-Sent Events
Bloom Filters (New)
Quorum (New)
Leader and Follower (New)
Heartbeat (New)
Checksum (New)
Designing a URL Shortening service like TinyURL
Designing Pastebin
Designing Instagram
Designing Dropbox
Designing Facebook Messenger
Designing Twitter
Designing YouTube or Netflix
Designing Typeahead/Auto Suggestion
Designing an API Rate Limiter
Designing Twitter Search
Designing a Web Crawler
Designing Facebook’s Newsfeed
Designing Yelp or Nearby Friends
Designing Uber backend
Designing Ticketmaster
Designing TikTok/Video Sharing service

Mirodil commented 3 years ago

DNS Load Balancing

What is Domain Name System (DNS) based load balancing?

Load balancing is the practice of distributing traffic across more than one server to improve performance and availability. Organizations use different forms of load balancing to speed up both websites and private networks. Without load balancing, most Internet applications and websites would not handle traffic effectively or function correctly.

DNS is often referred to as the Internet's phonebook because it translates website domains (like google.com or nytimes.com) into IP addresses. An IP address is a long numerical label servers use to identify websites and any device connected to the Internet. By translating domain names to IP addresses — a process called DNS resolution — DNS saves people from memorizing long sequences of numbers to access websites and applications.

In DNS resolution, an Internet user's browser contacts a DNS server to request the destination website's correct IP address. The act of requesting an IP address from a domain is called a DNS query.

DNS-based load balancing is a specific type of load balancing that uses the DNS to distribute traffic across several servers. It does this by providing different IP addresses in response to DNS queries. Load balancers can use various methods or rules for choosing which IP address to share in response to a DNS query.

One of the most common DNS load balancing techniques is called round-robin DNS.

What is round-robin DNS?

Round-robin DNS has the same goal as other types of DNS-based load balancing: improving a site's performance and reliability by distributing traffic. However, as opposed to using a specialized software-based or hardware-based load balancer, round-robin DNS performs load balancing using a type of DNS server called an authoritative nameserver.

Authoritative nameservers hold DNS records called A records or AAAA records, which contain a domain’s name and its matching IP address. When a client submits a DNS query, the query's goal is to find the A (or AAAA) record. A domain will have a single A record tied to a single IP address in a basic setup, meaning a DNS query will always return the same IP address.

However, in round-robin DNS, domains have multiple A records, each tied to a different IP address. As DNS queries come in, IP addresses rotate in a round-robin fashion, spreading the requests across the associated servers.

How does round-robin DNS work?

If a domain has five IP addresses in round-robin DNS, IP address #1 is returned for only one in every five queries: the 1st, the 6th, the 11th, and so on. Because each IP address corresponds to a different server, this setup reduces each server's workload, making it less likely to become overwhelmed by requests.
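The rotation can be sketched as a small simulation. This is only a toy model, not how any particular nameserver is implemented; the `RoundRobinResolver` class, the domain, and the addresses (taken from the 192.0.2.0/24 documentation range) are assumptions made for the example.

```python
from itertools import cycle


class RoundRobinResolver:
    """Toy model of an authoritative nameserver with several A records."""

    def __init__(self, records):
        # records maps a domain name to its list of IP addresses (A records).
        self._rotations = {name: cycle(ips) for name, ips in records.items()}

    def resolve(self, name):
        # Each query returns the next address in the rotation.
        return next(self._rotations[name])


# Hypothetical domain with five A records.
resolver = RoundRobinResolver(
    {"example.com": [f"192.0.2.{i}" for i in range(1, 6)]}
)
answers = [resolver.resolve("example.com") for _ in range(6)]
print(answers)
```

With five records, the sixth query wraps around to the first address again. Real nameservers often return the whole record set in a rotated order rather than a single address; the sketch returns one address per query to keep the model simple.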

To understand how round-robin DNS works, compare the act of visiting a website to sending a company a piece of mail. Suppose the company uses a PO box to receive customer mail, but they receive more mail than their singular PO box can handle. To help solve this problem, the company could purchase more PO boxes.

For the multiple PO box strategy to work, the company must ensure that no one box overflows with mail. That means that the PO box address that appears when customers look up the company's mailing address would need to alternate sequentially. Customer #1 would see PO box address #1; then, customer #2 would see PO box address #2. This method would help reduce the burden on the individual PO boxes by increasing overall capacity. Without the additional mailboxes, a single PO box could easily overflow, delaying incoming mail.

Like in the PO box example, round-robin DNS protects servers from becoming overwhelmed with requests, avoiding any delays in processing them.

What other types of DNS-based load balancing are there?

While the round-robin approach is popular, it is not the only method for routing traffic. Most load balancers allow domain owners to choose from several traffic routing rules.

One example of a DNS-based load balancing configuration is a weighted algorithm, where servers are assigned relative weights based on their capacity to handle traffic. Traffic is then assigned proportionately. For example, if server A has twice the capacity of server B, the load balancer would give server A twice as much traffic as server B. It would do this by returning server A's IP address twice as often as server B's in response to DNS queries. Weighted round-robin and weighted least connection are examples of this type of load balancing algorithm.
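A minimal sketch of the weighted case, assuming integer weights and made-up addresses: each IP is repeated in the rotation as many times as its weight, so server A (weight 2) is returned twice for every answer that names server B (weight 1). The `weighted_rotation` helper is hypothetical and exists only for this illustration.

```python
from itertools import cycle


def weighted_rotation(weights):
    """Build a repeating schedule in which each IP appears `weight` times."""
    schedule = [ip for ip, weight in weights.items() for _ in range(weight)]
    return cycle(schedule)


# Hypothetical addresses: server A has twice the capacity of server B.
rotation = weighted_rotation({"192.0.2.10": 2, "192.0.2.20": 1})
answers = [next(rotation) for _ in range(6)]
print(answers)
```

This naive schedule sends consecutive answers to the heavier server. Implementations that interleave the servers more evenly exist, but the proportions stay the same.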

Many of the DNS-based load balancing approaches are dynamic, meaning that the load balancers consider server health and server response times when assigning requests. Dynamic algorithms can take many forms. "Least connection" is one type of dynamic load balancing algorithm. In the least connection configuration, server monitoring determines which server currently has the fewest open connections and then assigns incoming traffic to that server by providing its IP address in response to DNS queries.
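The least connection rule reduces to picking the minimum from a monitoring snapshot. The sketch below assumes the connection counts are already available as a dictionary; the function name, the addresses, and the counts are invented for the example.

```python
def pick_least_connections(open_connections):
    """Return the IP of the server with the fewest open connections."""
    return min(open_connections, key=open_connections.get)


# Hypothetical monitoring snapshot: IP address -> open connection count.
snapshot = {"192.0.2.10": 42, "192.0.2.20": 17, "192.0.2.30": 63}
chosen = pick_least_connections(snapshot)
snapshot[chosen] += 1  # the new client is expected to open a connection
print(chosen)
```

In a real deployment the counts would come from server monitoring rather than being updated by the resolver itself.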

Geo-location is another widely used dynamic algorithm. In this configuration, the load balancer assigns requests from a region to a defined server or server set. For example, all requests coming from France might go to ‘server F,’ and all requests coming from Spain might go to ‘server S.’
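As a rough illustration, geo-location routing can be modeled as a lookup table keyed by the client's country. The sketch assumes the country code has already been determined (that lookup is not shown), and the addresses, the `REGION_SERVERS` table, and the fallback server are all hypothetical.

```python
# Hypothetical mapping from the client's country code to a server IP.
REGION_SERVERS = {
    "FR": "192.0.2.70",  # 'server F'
    "ES": "192.0.2.80",  # 'server S'
}
DEFAULT_SERVER = "192.0.2.1"  # assumed fallback for unmapped regions


def resolve_by_region(country_code):
    """Return the server IP assigned to the client's region."""
    return REGION_SERVERS.get(country_code.upper(), DEFAULT_SERVER)


print(resolve_by_region("FR"))
print(resolve_by_region("es"))
```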

A proximity-based algorithm accomplishes something similar. In this configuration, load balancers dynamically assign traffic to the server closest to the user.
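One way to picture a proximity rule is to choose the server with the smallest great-circle distance to the client. This is a sketch under the assumption that "closest" means geographic distance and that client coordinates are known; the server list, coordinates, and function names are made up for the example.

```python
from math import asin, cos, radians, sin, sqrt


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (*a, *b))
    h = (
        sin((lat2 - lat1) / 2) ** 2
        + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * 6371 * asin(sqrt(h))  # 6371 km: mean Earth radius


# Hypothetical server locations: IP -> (latitude, longitude).
SERVERS = {
    "192.0.2.70": (48.86, 2.35),   # Paris
    "192.0.2.80": (40.42, -3.70),  # Madrid
}


def nearest_server(client_location):
    """Return the IP of the server geographically closest to the client."""
    return min(SERVERS, key=lambda ip: haversine_km(SERVERS[ip], client_location))


print(nearest_server((45.76, 4.84)))  # a client near Lyon
```

Geographic distance is only a stand-in here; a load balancer could just as well define proximity by measured network latency.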

Dynamic algorithms follow slightly different rules but ultimately do the same thing: monitor server health and optimize how traffic is assigned.

Mirodil commented 3 years ago

CDN (Content Delivery Network)

Mirodil commented 3 years ago

Top 25 questions: https://medium.com/javarevisited/25-software-design-interview-questions-to-crack-any-programming-and-technical-interviews-4b8237942db0
