linkedin / dynamometer

A tool for scale and performance testing of HDFS with a specific focus on the NameNode.
BSD 2-Clause "Simplified" License

Hadoop 3.0 NN drops replicas silently #64

Open jojochuang opened 5 years ago

jojochuang commented 5 years ago

In Hadoop 3.0/CDH5.7 and above, HDFS-9260 (Improve the performance and GC friendliness of NameNode startup and full block reports) changed the internal representation of block replicas, as well as the block report processing logic in the NameNode.

After HDFS-9260, the NN expects block replicas to be reported in ascending order of block ID. If a block ID arrives out of order, the NN discards it silently. Because the simulated DataNode in Dynamometer uses a hash map to store block replicas, the replicas are not reported in order. The Dynamometer cluster then sees the number of missing blocks gradually increase several minutes after the NN starts.
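To make the failure mode concrete, here is a toy model of an order-sensitive report consumer. It is only an illustration of the behavior described above, not the actual NameNode code: the class `OrderedReportSketch` and the method `acceptInOrder` are hypothetical names, and the real logic introduced by HDFS-9260 is more involved.

```java
import java.util.ArrayList;
import java.util.List;

public class OrderedReportSketch {

    /**
     * Toy model (an assumption, not NameNode code): a reported block ID is
     * accepted only if it is greater than the last accepted ID. Anything
     * else is skipped without an error or a log message.
     */
    static List<Long> acceptInOrder(long[] reportedIds) {
        List<Long> accepted = new ArrayList<>();
        Long last = null;
        for (long id : reportedIds) {
            if (last != null && id <= last) {
                continue; // out of order: silently dropped
            }
            accepted.add(id);
            last = id;
        }
        return accepted;
    }

    public static void main(String[] args) {
        // Sorted report: every replica is accepted.
        System.out.println(acceptInOrder(new long[] {1001, 1002, 1003, 1004}));
        // Unsorted report: 1001 and 1002 arrive after larger IDs and are lost.
        System.out.println(acceptInOrder(new long[] {1003, 1001, 1004, 1002}));
    }
}
```

In this model the unsorted report loses two of its four replicas with no visible error, which matches the symptom of missing blocks appearing without any failure being reported.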

I suggest changing SimulatedBPStorage.blockMap to a TreeMap sorted by block ID. I will supply a patch for the proposed change.
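A minimal sketch of why a TreeMap helps, assuming block IDs are used as `Long` keys. The class `SortedBlockMapSketch`, the method `reportOrder`, and the `String` values standing in for replica objects are all hypothetical; the real `SimulatedBPStorage` types may differ.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class SortedBlockMapSketch {

    /** Returns block IDs in the order the map iterates them, i.e. the order a report would list them. */
    static List<Long> reportOrder(Map<Long, String> blockMap) {
        return new ArrayList<>(blockMap.keySet());
    }

    public static void main(String[] args) {
        // A TreeMap iterates keys in ascending order regardless of insertion order.
        // A HashMap gives no ordering guarantee at all.
        Map<Long, String> blockMap = new TreeMap<>();
        blockMap.put(1073741830L, "replica-c");
        blockMap.put(1073741825L, "replica-a");
        blockMap.put(1073741827L, "replica-b");

        System.out.println(reportOrder(blockMap));
        // prints [1073741825, 1073741827, 1073741830]
    }
}
```

The trade-off is that TreeMap operations are O(log n) rather than the expected O(1) of a HashMap, in exchange for iteration that is always sorted by key.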

Credit: @fangyurao for identifying the issue and helping to verify the fix.