Week2 - Distributing Content to Open Connect
#5
Closed
kymr
closed
6 years ago
kymr
commented
6 years ago
Title
Distributing Content to Open Connect
Summary
Goal
Maximizing traffic without compromising utilization
Ensuring stable operation even when the cluster changes
Consistent Hashing
There is a ring with all numbers from 0 to N
Each server ID is hashed onto that ring 1000 times, so that content is distributed more evenly.
The space that precedes each hash value is owned by the server that produced it.
Content IDs are hashed onto the same ring, and each one is assigned to the server that owns the space it lands in.
This approach also minimizes churn when a server is added to or removed from the cluster.
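The ring described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the hash function (MD5), the ring size `RING_SIZE`, and the names `ring_hash` and `UniformRing` are hypothetical choices, not something the article specifies.

```python
import bisect
import hashlib

RING_SIZE = 2**64  # assumed value of N; the notes only say "0 to N"


def ring_hash(key: str) -> int:
    """Map a string to a position on the ring (MD5 is an arbitrary choice)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % RING_SIZE


class UniformRing:
    def __init__(self, server_ids, slices_per_server=1000):
        # Hash every server ID many times so each server owns many small slices.
        points = sorted(
            (ring_hash(f"{server_id}#{i}"), server_id)
            for server_id in server_ids
            for i in range(slices_per_server)
        )
        self._positions = [position for position, _ in points]
        self._owners = [server_id for _, server_id in points]

    def owner(self, content_id: str) -> str:
        # The space preceding a server hash belongs to that server, so the owner
        # is the first server hash at or after the content hash (wrapping around).
        idx = bisect.bisect_left(self._positions, ring_hash(content_id))
        return self._owners[idx % len(self._owners)]


before = UniformRing(["a", "b", "c", "d"])
after = UniformRing(["a", "b", "c", "d", "e"])
content_ids = [f"title-{i}" for i in range(2000)]
moved = [c for c in content_ids if before.owner(c) != after.owner(c)]
```

In this sketch, every ID in `moved` ends up on the new server `e`, and the rest of the content stays where it was. That is the low-churn property the notes refer to.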
Heterogeneity is a challenge
Servers fall into two general categories: Storage and Flash.
Storage size is different across servers in the cluster
Throughput is also different.
These differences can cause content holes and lower throughput utilization.
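A small calculation shows why uniform hashing produces content holes when storage sizes differ. The numbers are illustrative only: the two capacities echo the "200 TB" and "18 TB" figures quoted in the Words comment, while the catalog size and the helper name `uniform_shortfall` are made up for this sketch.

```python
def uniform_shortfall(capacities_tb, catalog_tb):
    """Return, per server, how many TB of its equal share it cannot store."""
    # Uniform consistent hashing gives every server roughly the same share.
    share = catalog_tb / len(capacities_tb)
    return {
        server_id: max(0.0, share - capacity)
        for server_id, capacity in capacities_tb.items()
    }


# Hypothetical two-server cluster and a hypothetical 100 TB catalog.
holes = uniform_shortfall({"storage-1": 200, "flash-1": 18}, catalog_tb=100)
```

Here each server is assigned about 50 TB, so `flash-1` is left with a 32 TB hole while `storage-1` has capacity to spare.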
Heterogeneous Allocation Algorithm
Criteria
Distribute content in proportion to the storage capacity of each server without causing content holes
Distribute popular and less popular content so that traffic attracted to a server is proportional to its throughput capacity
Content is allocated in two stages, each with its own weighted consistent hash ring
Each server is given a weight in each stage
The catalog depth (cutoff) marks where allocation switches from stage 1 to stage 2
The cutoff is chosen where the probability of churn is smallest
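The two-stage idea can be sketched as follows. This is a rough illustration, not the article's algorithm: it assumes a weight is realized by scaling the number of slices a server gets on a ring, and the weights, cutoff, server names, and function names (`build_ring`, `lookup`, `allocate`) are all hypothetical. How the weights and the cutoff are actually computed is not covered here.

```python
import bisect
import hashlib


def ring_hash(key: str) -> int:
    """Map a string to a ring position (MD5 is an arbitrary choice)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


def build_ring(weights, base_slices=1000):
    """weights: server_id -> weight in [0, 1]; slice count scales with weight."""
    points = sorted(
        (ring_hash(f"{server_id}#{i}"), server_id)
        for server_id, weight in weights.items()
        for i in range(round(base_slices * weight))
    )
    return [p for p, _ in points], [s for _, s in points]


def lookup(ring, content_id):
    positions, owners = ring
    idx = bisect.bisect_left(positions, ring_hash(content_id))
    return owners[idx % len(owners)]


def allocate(ranked_content_ids, stage1_weights, stage2_weights, cutoff):
    """ranked_content_ids is ordered from most to least popular."""
    ring1 = build_ring(stage1_weights)
    ring2 = build_ring(stage2_weights)
    placement = {}
    for rank, content_id in enumerate(ranked_content_ids):
        # Content above the cutoff uses the stage 1 ring, the rest stage 2.
        ring = ring1 if rank < cutoff else ring2
        placement[content_id] = lookup(ring, content_id)
    return placement


ranked = [f"title-{i}" for i in range(1000)]
placement = allocate(
    ranked,
    stage1_weights={"flash-1": 1.0, "storage-1": 0.5},  # made-up weights
    stage2_weights={"flash-1": 0.0, "storage-1": 1.0},  # made-up weights
    cutoff=500,
)
```

With these made-up weights, the Flash server takes most of the popular titles in stage 1 and none of the long tail in stage 2. A title whose rank crosses the cutoff from one day to the next switches rings, which is why the notes say the cutoff should sit where churn is least likely.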
Reference
https://medium.com/netflix-techblog/distributing-content-to-open-connect-3e3e391d4dc9
kymr
commented
6 years ago
Words
we’ve talked about
we will dig deeper into
also referred to as Open Connect Appliances or OCAs in other literature
heterogeneous
This work is a result of
Content Placement Goals
refers to the decisions we make on a daily basis
which content gets deployed to
in a given cluster.
It also makes sense to
day-over-day
distribute content across multiple servers
are hashed over this ring
precedes
to generate a reasonably equal distribution of content
to facilitate fair re-hashing when the cluster changes.
Using the Uniform Consistent Hashing approach,
churn is minimized
1000 new slices are distributed over the ring
where the new server takes over content roughly equally from the other servers
it passes on the content ownership roughly equally to the rest of the servers
can be sub-optimal
Our servers fall into one of two general categories
have very different characteristics
can hold upwards of 200 TB
generate ~40 Gbps of throughput
can hold only up to 18 TB
For small to medium ISP co-locations
with ever-increasing capability
to serve alongside older ones
without compromising on
and this leads to disk space differences even among servers with the same hardware type.
have different levels of
therefore would create a gap in stored popular content
because the traffic attracted to a server would generally be proportional to storage size
The solution to these issues is
to make better use of hardware resources.
by altering the allocation protocol
use a model to come up with
We have two criteria that need to be satisfied:
could satisfy one or the other constraint, but not both
either yields a set of allocation weights satisfying both criteria above
determines that cutoff D is infeasible (no configuration satisfies the constraints).
While it is possible that
we find in practice
induces the least amount of churn
it would be allocated in different rings on consecutive days
where the probability of its shuffling is smallest.
To mitigate this,
has shown a clear benefit