beomyeol closed this issue 8 years ago
We must ensure that, per key, the parameter server applies each update atomically. We also want to allow thread-level parallelism, relatively load-balanced across the threads. How about the following design?
Partition the key-space into `p` partitions. Each partition has a blocking queue (holding updates and reads) and a dedicated thread (so there are `p` queue-and-thread pairs). The partition is decided by `hash(key) % p`. On receiving a message, the message handler computes the partition and enqueues the message onto that partition's queue. Each thread runs a loop over its own queue.
Using a single thread per partition ensures that operations on each key are applied atomically (in fact, operations on each key are linearizable, since one thread applies them in queue order). The multi-node parameter server can build on this partitioning scheme.
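A minimal sketch of this design in Java. All class and method names below are hypothetical illustrations, not actual dolphin-ps code; the real server would route messages from REEF's Network Connection Service handlers instead of a `push` method:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical sketch of the partitioned design: p queues, p threads,
// each thread exclusively owning one shard of the key space.
public class PartitionedStore {
  private final int p;                           // number of partitions
  private final BlockingQueue<int[]>[] queues;   // one queue per partition; msg = {key, delta}
  private final Map<Integer, Integer>[] shards;  // per-partition key-value store

  @SuppressWarnings("unchecked")
  PartitionedStore(final int p) {
    this.p = p;
    queues = new BlockingQueue[p];
    shards = new Map[p];
    for (int i = 0; i < p; i++) {
      final int part = i;
      queues[i] = new ArrayBlockingQueue<>(1024);
      shards[i] = new HashMap<>();
      final Thread worker = new Thread(() -> {
        try {
          while (true) {
            final int[] msg = queues[part].take();             // block for next message
            shards[part].merge(msg[0], msg[1], Integer::sum);  // apply update; no other
          }                                                    // thread touches this shard
        } catch (final InterruptedException e) {
          // shut down
        }
      });
      worker.setDaemon(true);
      worker.start();
    }
  }

  // Message handler: compute the partition and enqueue the update.
  void push(final int key, final int delta) throws InterruptedException {
    queues[Math.floorMod(key, p)].put(new int[]{key, delta});
  }

  // For illustration only; in the full design, reads would also be
  // enqueued on the partition's queue rather than accessed directly.
  Integer get(final int key) {
    return shards[Math.floorMod(key, p)].get(key);
  }
}
```

Because exactly one thread ever touches a given shard, no locking is needed on the per-key read-modify-write, and updates to the same key are serialized in queue order.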
@dafrista I like the idea. We should do it.
Closed via #177.
`dolphin-ps` uses REEF's Network Connection Service and handles messages received from parameter workers in a Wake event handler. In the current implementation, `dolphin-ps` may fail to apply updates for received values to its key-value store when it is used with a `ParameterUpdater`.

For example, suppose there are two parameter workers (A and B) and a parameter server whose parameter updater adds each received value to the value in its key-value store, and suppose a value `v` associated with a key `k` is already stored in the key-value store. Workers A and B push values `a` and `b` for `k` to the parameter server, respectively. The message handler thread for message `a` retrieves `v` and tries to set `v + a` for `k`. However, the message handler thread for message `b` simultaneously also retrieves `v` and sets `v + b` for `k`. In this case, one of the two updates is lost. We should handle this case so that the parameter server applies the updates for both received values.
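The lost update occurs because retrieve-then-set is not atomic across handler threads. As one hedged illustration of the required behavior (hypothetical names, not actual dolphin-ps code), Java's `ConcurrentHashMap.merge` performs the per-key read-modify-write atomically, so the interleaving described above cannot drop an update:

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch; `store` stands in for the server's key-value store.
public class AtomicUpdateExample {
  static final ConcurrentHashMap<String, Integer> store = new ConcurrentHashMap<>();

  // Called concurrently by handler threads; merge() applies the
  // additive updater atomically per key, so no interleaving is lost.
  static void applyUpdate(final String key, final int delta) {
    store.merge(key, delta, Integer::sum);
  }

  public static void main(final String[] args) throws InterruptedException {
    store.put("k", 10);                                     // v = 10
    final Thread a = new Thread(() -> applyUpdate("k", 1)); // worker A pushes a = 1
    final Thread b = new Thread(() -> applyUpdate("k", 2)); // worker B pushes b = 2
    a.start(); b.start();
    a.join(); b.join();
    System.out.println(store.get("k"));                     // both updates applied: 13
  }
}
```

With a naive `get` followed by `put`, the same two threads could both read 10 and one of the writes would overwrite the other.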