bign8 / cdn

CDN Cluster implementation
https://hub.docker.com/r/bign8/cdn/
8 stars 8 forks source link

Distributed Hash Tables #2

Open bign8 opened 7 years ago

bign8 commented 7 years ago

After https://github.com/bign8/cdn/pull/1 our servers can communicate with each other which means its time to work on getting efficient storage and lookups. As of #1 our servers ask all others if they have stored particular pieces of data; This effort should allow our servers to request a single other server that is more likely to have a particular piece of data. These are noted in the paper below in sections 3 and 4.

DHTs

DHTs are great for minimizing the number of server-to-server requests. There are many ways to implement them so there is quite a bit of flexibility here. I would define a simple hash function that we can configure later based on how large of a ring we will need. Then when we get new server updates in monitorNeighbors we can hash the servers and place them on the ring. Then in checkNeighbors we hash the path and find the neighbor responsible for processing those requests and only ask that neighbor (if it's not us).

// Hash consistently converts a string to a numeric value [0, capacity)
type Hash func(value string, capacity int) int

The simplest hashing function to implement would be and ASCII-based modular addition hashing method. This would work by summing the ASCII values of each character in the value and modding the result by capacity. I'm pretty sure there are other functions in the go standard library, so if you can find them, go ahead and use them.

We will have to do some experimentation to figure out the exact size of our ring that makes our caches as balanced as possible around the ring. I'm guessing 255 would be fine for our case, but will need to test to verify.

Sources:

bign8 commented 7 years ago

Sample implementation: https://github.com/sent-hil/consistenthash