Open nichhk opened 7 years ago
@kyler-m: choose a random subset of valid DataNodes.
Implemented find_datanode_for_block to use free_bytes and xmits in: 54f8df373fd31c92a97ca3dfc954cad4537e864c
Live DNs that do not already have a replica of the requested block and have enough free space to hold it are pushed onto a priority queue ordered by number of transmits (xmits), fewest first. Then the requested number of DataNodes (replication_factor of them) is popped from the priority queue and returned to the caller.
Some cleanup work remains.
You can see in the image that we find two DNs that are:
The first one we see has 5 transmits, the next one has 3 transmits. Since we minimize transmits, the one with 3 transmits is selected and returned.
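For anyone following along, here is a minimal Python sketch of the selection described above. It is not the code from the commit: the `DataNode` record, its field names, and the `pick_target_datanodes` signature are all assumptions made for illustration. It only shows the filter-then-pop idea using a min-heap keyed on transmits.

```python
import heapq
from dataclasses import dataclass, field


@dataclass
class DataNode:
    # Hypothetical record; field names are illustrative, not taken from the commit.
    name: str
    free_bytes: int
    xmits: int
    alive: bool = True
    blocks: set = field(default_factory=set)


def pick_target_datanodes(datanodes, block_id, block_size, replication_factor):
    """Return up to replication_factor eligible DataNodes, fewest transmits first."""
    heap = []
    for index, dn in enumerate(datanodes):
        if not dn.alive:
            continue  # skip dead DataNodes
        if block_id in dn.blocks:
            continue  # already holds a replica of this block
        if dn.free_bytes < block_size:
            continue  # not enough free space for the block
        # The index breaks ties so DataNode objects are never compared directly.
        heapq.heappush(heap, (dn.xmits, index, dn))

    targets = []
    while heap and len(targets) < replication_factor:
        _, _, dn = heapq.heappop(heap)
        targets.append(dn)
    return targets
```

With two eligible DNs at 5 and 3 transmits and a replication factor of 1, this returns the one with 3 transmits, which matches the behavior described above. If fewer eligible DNs exist than requested, the sketch simply returns the ones it found; how the real implementation handles that case is not covered here.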
The NameNode needs to choose target DataNodes for a block using 3 criteria: