linkedin / dynamometer

A tool for scale and performance testing of HDFS with a specific focus on the NameNode.
BSD 2-Clause "Simplified" License

Continue requesting block reports until all DataNodes have reported #53

Closed xkrogen closed 6 years ago

xkrogen commented 6 years ago

Currently, within `waitForNameNodeReadiness`, while waiting for the NameNode to receive enough block reports to be ready for use, the AppMaster polls the NameNode to discover which DataNodes haven't sent block reports yet and triggers reports on those DataNodes. This helps when a DataNode sent its initial block report before all of its blocks were injected, in which case a complete report wouldn't otherwise be sent until the block report interval expired (which can be very long). Right now the polling stops as soon as the block thresholds are met. It would be better to keep polling and triggering even after the thresholds are met, until all DataNodes have actually reported.
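
To make the proposed behavior concrete, here is a minimal sketch of a loop whose exit condition is "no DataNode is left unreported" rather than "block thresholds met". It is not Dynamometer's actual code: the class `BlockReportPoller`, the method `pollUntilAllReported`, and both callbacks are hypothetical stand-ins for whatever the AppMaster uses to query the NameNode and to trigger a report on a DataNode.

```java
import java.util.Set;
import java.util.function.Consumer;
import java.util.function.Supplier;

public class BlockReportPoller {

  /**
   * Hypothetical sketch: keep triggering block reports until no DataNode is
   * left unreported, or until maxPolls is exhausted.
   *
   * @param unreportedDataNodes assumed callback: asks the NameNode which
   *                            DataNodes have not sent a block report yet
   * @param triggerBlockReport  assumed callback: asks one DataNode to send
   *                            a block report now
   * @return the number of times the NameNode was polled
   */
  public static int pollUntilAllReported(
      Supplier<Set<String>> unreportedDataNodes,
      Consumer<String> triggerBlockReport,
      int maxPolls,
      long sleepMillis) {
    int polls = 0;
    while (polls < maxPolls) {
      Set<String> pending = unreportedDataNodes.get();
      polls++;
      // Exit only when every DataNode has reported. Meeting the block
      // thresholds alone is deliberately not an exit condition here.
      if (pending.isEmpty()) {
        break;
      }
      for (String dataNode : pending) {
        triggerBlockReport.accept(dataNode);
      }
      try {
        Thread.sleep(sleepMillis);
      } catch (InterruptedException e) {
        // Preserve the interrupt for the caller and stop polling.
        Thread.currentThread().interrupt();
        break;
      }
    }
    return polls;
  }
}
```

In a real AppMaster, a loop like this would probably run separately from the readiness wait, so that the workload can start once the thresholds are met while the remaining DataNodes are still being prompted. That split is a design assumption, not something stated in this issue.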