LZDQ / notes


Operating Systems & Distributed Systems #7

Open LZDQ opened 4 days ago

LZDQ commented 4 days ago

L05

Time

Wall clock

Cristian's algorithm accuracy: $\pm$RTT/2 – assumption: one-way delay $\simeq$ RTT/2, but packets could be queued asymmetrically
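A minimal sketch of the estimate (function and variable names are my own):

```python
def cristian_estimate(t_send: float, t_recv: float, server_time: float) -> float:
    """Estimate local time from one Cristian-style exchange.

    t_send/t_recv: local timestamps around the request.
    server_time:   the server's clock reading from the reply.
    Assumes the one-way delay is RTT/2 (breaks under queuing asymmetry).
    """
    rtt = t_recv - t_send
    return server_time + rtt / 2
```

For example, if the round trip took 10 time units and the server reported 100, the client estimates the server's clock is now at 105.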

NTP: uses UDP to avoid the handshake's extra RTT; records four timestamps ($t_1$ client send, $t_2$ server receive, $t_3$ server send, $t_4$ client receive):

offset $= \dfrac{(t_2-t_1) + (t_3-t_4)}{2}$; client time += offset
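The offset formula (and the standard round-trip delay alongside it) can be sketched as:

```python
def ntp_offset(t1: float, t2: float, t3: float, t4: float) -> float:
    """Clock offset from the four NTP timestamps.

    t1: client sends request    t2: server receives request
    t3: server sends reply      t4: client receives reply
    """
    return ((t2 - t1) + (t3 - t4)) / 2

def ntp_delay(t1: float, t2: float, t3: float, t4: float) -> float:
    """Round-trip delay, excluding the server's processing time."""
    return (t4 - t1) - (t3 - t2)
```

Example: if a slow client records `t1=0, t4=4` while the server records `t2=10, t3=12`, the offset is $((10-0)+(12-4))/2 = 9$, so the client adds 9 to its clock.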

Lamport time


with total order: multiply the timestamp by the number of processes $n$, then add the process ID (ties between equal timestamps are broken by ID)
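The clock update rules and the total-order tie-break above can be sketched as (class name mine):

```python
class LamportClock:
    def __init__(self, pid: int, n_procs: int):
        self.pid = pid
        self.n = n_procs
        self.time = 0

    def tick(self) -> int:
        """Local event or message send: increment the clock."""
        self.time += 1
        return self.time

    def receive(self, msg_time: int) -> int:
        """On receive: take the max of local and message clock, then +1."""
        self.time = max(self.time, msg_time) + 1
        return self.time

    def total_order_key(self) -> int:
        """Scale by n and add pid so equal timestamps never collide."""
        return self.time * self.n + self.pid
```

With two processes, if P0 ticks and sends, P1's receive jumps to 2, and the keys `1*2+0 = 2` and `2*2+1 = 5` give a strict total order.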

vector time

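A minimal vector-clock sketch (names mine), including the happened-before test that Lamport clocks cannot express:

```python
class VectorClock:
    def __init__(self, pid: int, n: int):
        self.pid = pid
        self.vc = [0] * n

    def tick(self) -> list[int]:
        """Local event or send: increment own component."""
        self.vc[self.pid] += 1
        return list(self.vc)

    def receive(self, other: list[int]) -> None:
        """On receive: elementwise max with the message, then increment own."""
        self.vc = [max(a, b) for a, b in zip(self.vc, other)]
        self.vc[self.pid] += 1

def happened_before(u: list[int], v: list[int]) -> bool:
    """u -> v iff u <= v componentwise and u != v."""
    return all(a <= b for a, b in zip(u, v)) and u != v
```

Two events are concurrent exactly when neither vector happened-before the other.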
LZDQ commented 4 days ago

L06

| Algorithm | Type | Message Complexity | Advantages | Disadvantages |
|---|---|---|---|---|
| Centralized Mutex | Centralized | $3$ | Simple implementation; low message overhead | Single point of failure; coordinator bottleneck |
| Bully Algorithm | Centralized | $O(n^2)$ | Handles coordinator failure; selects highest-priority node | High message overhead; assumes no message loss |
| Fully Decentralized | Decentralized | $2m + m$ | No central coordinator; ensures majority-vote consistency | High message count; vulnerable to node failures |
| Lamport Mutual Exclusion | Decentralized | $3(n-1)$ | Simple logical clocks; no single point of failure | High message overhead; message reordering affects accuracy |
| Ricart & Agrawala | Decentralized | $2(n-1)$ | Fewer messages than Lamport; handles fairness well | Still depends on logical clock accuracy; unbounded wait time |
| Token Ring Algorithm | Token-based | $O(1)$ | Low message complexity; fairness guaranteed | Token loss requires regeneration; requires logical ring |
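The token-ring row is the simplest to illustrate. A toy simulation of one token circulation (function name and setup are my own, not from the course):

```python
def token_ring_schedule(n: int, wants_cs: set[int], start: int = 0) -> list[int]:
    """Simulate one full circulation of the token around a logical ring
    of n nodes, starting at `start`. Returns the order in which the
    requesting nodes enter the critical section. Each hop is a single
    message, which is why per-entry complexity is O(1)."""
    order = []
    for hop in range(n):
        node = (start + hop) % n
        if node in wants_cs:
            order.append(node)  # node holds the token: enter CS, then pass it on
    return order
```

Fairness falls out directly: every requester gets the critical section exactly once per circulation, in ring order.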

Comparisons:

LZDQ commented 4 days ago

L07

Fail-Stop: no malicious machines, but machines may crash. Byzantine: machines may be malicious (behave arbitrarily).


Replication Models

  1. Primary-Backup

    • A primary node handles writes and forwards updates to backups.
    • Sync: Backups must acknowledge before the client receives confirmation.
    • Async: Updates sent later, risking stale reads.
  2. Quorum-Based Replication

    • Defines a set of N replicas.
    • Write quorum (W): Updates must be acknowledged by at least W replicas.
    • Read quorum (R): Reads must query at least R replicas.
    • Ensures consistency when $W + R > N$: every read quorum overlaps every write quorum in at least one replica.
  3. Replicated State Machine (RSM)

    • All nodes execute the same commands in the same order, ensuring consistent state.
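The quorum overlap argument from item 2 can be sketched with versioned replicas (names and the version-number scheme are my own simplification):

```python
from dataclasses import dataclass

@dataclass
class Replica:
    version: int = 0
    value: object = None

def quorum_write(replicas: list, targets: list[int], value, version: int) -> None:
    """Write (version, value) to a write quorum of at least W replicas."""
    for i in targets:
        replicas[i] = Replica(version, value)

def quorum_read(replicas: list, targets: list[int]):
    """Read from a read quorum of at least R replicas; newest version wins.
    With W + R > N the read quorum intersects the write quorum, so at
    least one queried replica holds the latest write."""
    newest = max((replicas[i] for i in targets), key=lambda r: r.version)
    return newest.value
```

With $N=5$, $W=3$, $R=3$: writing to replicas {0,1,2} and reading from {2,3,4} still returns the new value, because replica 2 is in both quorums.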

Distributed Consensus

  1. Impossibility of Consensus
    • Two Generals Problem: Messages may be lost; no deterministic consensus.
    • FLP Theorem: Deterministic consensus is impossible in an asynchronous system with even one faulty process, even with reliable message delivery.
  2. Consensus Properties
    • Termination: All correct processes eventually decide.
    • Agreement: All correct processes agree on the same value.
    • Integrity: Decided values are valid.

Paxos consensus protocol

async: packets could be delayed/lost

Paxos is guaranteed safe. Consensus is a stable property: once reached it is never violated; the agreed value is not changed.

Paxos is not guaranteed live. Consensus is reached if “a large enough subnetwork...is non-faulty for a long enough time.” Otherwise Paxos might never terminate.

In CAP terms, Paxos chooses C and P, giving up A (availability).


Proposer: suggests values. Acceptor: receives values but doesn't suggest any. TODO: learn this
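The proposer/acceptor interaction can be sketched as single-decree Paxos (a simplification with my own names; no persistence, networking, or learner role):

```python
class Acceptor:
    def __init__(self):
        self.promised = -1    # highest proposal number promised
        self.accepted = None  # (number, value) last accepted, if any

    def prepare(self, n: int):
        """Phase 1b: promise not to accept proposals numbered below n."""
        if n > self.promised:
            self.promised = n
            return ("promise", self.accepted)
        return ("nack", None)

    def accept(self, n: int, value):
        """Phase 2b: accept unless a higher-numbered promise was made."""
        if n >= self.promised:
            self.promised = n
            self.accepted = (n, value)
            return "accepted"
        return "nack"

def propose(acceptors: list, n: int, value):
    """Proposer: run phase 1 (prepare), then phase 2 (accept) on a majority."""
    majority = len(acceptors) // 2 + 1
    promises = [a.prepare(n) for a in acceptors]
    granted = [acc for kind, acc in promises if kind == "promise"]
    if len(granted) < majority:
        return None  # not enough promises; retry with a higher n
    # Safety rule: if any promise reports a prior accepted value,
    # the proposer must adopt the highest-numbered one.
    prior = [acc for acc in granted if acc is not None]
    if prior:
        value = max(prior)[1]
    acks = sum(a.accept(n, value) == "accepted" for a in acceptors)
    return value if acks >= majority else None
```

The safety rule is the key: a later proposer that prepares with a higher number learns of the already-chosen value and re-proposes it, so the agreed value is never changed once a majority has accepted it.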