linkedin / dynamometer

A tool for scale and performance testing of HDFS with a specific focus on the NameNode.
BSD 2-Clause "Simplified" License

Dynamometer does not support negative block id's #63

Open fangyurao opened 6 years ago

fangyurao commented 6 years ago

Dynamometer does NOT support negative block IDs, so blocks with negative IDs are never reported by any simulated DataNode.

We changed XMLParser.java in our branch of Dynamometer so that negative block IDs are handled as well.

Because of that change, SimulatedMultiStorageFSDataset.java has to change as well.

In Dynamometer, each simulated DataNode manages more than one SimulatedStorage, and each SimulatedStorage maintains the following Map. Moreover, a SimulatedStorage can be involved in multiple blockpools.

Map<blockpool id, Map<block, block information>>

To access a given block (associated with a blockpool id) on a simulated DataNode, we have to (i) determine which SimulatedStorage the block belongs to according to its block id, and then (ii) use the associated blockpool id to retrieve the Map<block, block information> that contains the block to be accessed.
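The two-step lookup above can be sketched roughly as follows. This is only an illustration, not the code in SimulatedMultiStorageFSDataset.java: the class names `StorageSketch` and `DatasetSketch`, the `String` blockpool id, and the `String` block information are all hypothetical stand-ins, and `Math.floorMod` is just one possible way to pick a storage index that stays valid for negative block IDs.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for one SimulatedStorage.
class StorageSketch {
    // Map<blockpool id, Map<block id, block information>>
    final Map<String, Map<Long, String>> blockMaps = new HashMap<>();
}

// Hypothetical stand-in for the per-DataNode dataset holding several storages.
class DatasetSketch {
    final List<StorageSketch> storages = new ArrayList<>();

    DatasetSketch(int numStorages) {
        for (int i = 0; i < numStorages; i++) {
            storages.add(new StorageSketch());
        }
    }

    // Step (i): pick the storage from the block id.
    // Math.floorMod keeps the index in [0, size) even for negative ids.
    StorageSketch storageFor(long blockId) {
        int index = (int) Math.floorMod(blockId, (long) storages.size());
        return storages.get(index);
    }

    void addBlock(String blockPoolId, long blockId, String info) {
        storageFor(blockId).blockMaps
            .computeIfAbsent(blockPoolId, k -> new HashMap<>())
            .put(blockId, info);
    }

    // Step (ii): use the blockpool id to get that storage's block map,
    // then look the block up in it. Returns null if the block is unknown.
    String getBlockInfo(String blockPoolId, long blockId) {
        Map<Long, String> blocks = storageFor(blockId).blockMaps.get(blockPoolId);
        return blocks == null ? null : blocks.get(blockId);
    }
}
```

Since the same index rule is used when a block is stored and when it is looked up, a block always lands in, and is found in, the same storage.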

The SimulatedStorages managed by a DataNode are kept in an ArrayList, and each one is accessed by a non-negative integer index strictly less than the size of that ArrayList. To determine the SimulatedStorage a given block belongs to, the original Dynamometer simply uses (block id % number of simulated storages) as the index into that ArrayList. In Java, the result of % takes the sign of the dividend, so a negative block id can produce a negative index. Hence, once we have a negative block id, an ArrayIndexOutOfBoundsException will be triggered. Some changes have been made in SimulatedMultiStorageFSDataset.java so that a negative block id is handled properly.
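To make the failure concrete, here is a small standalone sketch. It is not the actual change in our branch: the class `StorageIndexSketch` and both method names are hypothetical, and `Math.floorMod` is shown only as one way to get a non-negative index.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class StorageIndexSketch {
    // Index rule as described for the original code:
    // the result is negative when blockId is negative.
    static int remainderIndex(long blockId, int numStorages) {
        return (int) (blockId % numStorages);
    }

    // One possible fix (an assumption, not necessarily the actual patch):
    // floorMod takes the sign of the divisor, so the result is in [0, numStorages).
    static int floorModIndex(long blockId, int numStorages) {
        return (int) Math.floorMod(blockId, (long) numStorages);
    }

    public static void main(String[] args) {
        List<String> storages = new ArrayList<>(Arrays.asList("s0", "s1", "s2"));
        long blockId = -7L;

        System.out.println(remainderIndex(blockId, storages.size())); // -1
        System.out.println(floorModIndex(blockId, storages.size()));  // 2

        try {
            storages.get(remainderIndex(blockId, storages.size()));
        } catch (IndexOutOfBoundsException e) {
            // ArrayIndexOutOfBoundsException is a subclass of IndexOutOfBoundsException,
            // so this catch covers both.
            System.out.println("negative index rejected");
        }
    }
}
```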

jojochuang commented 5 years ago

Will submit a PR on behalf of @fangyurao