cerndb / hdfs-metadata

Tool for gathering block and replica metadata from HDFS. It also builds a heat map showing how replicas are distributed across disks and nodes.
GNU General Public License v3.0

Some nodes' data gets bucketed into "Unknown" disk #3

Open Tagar opened 6 years ago

Tagar commented 6 years ago
 === Distribution across nodes and disks ===

DiskId                   0 0 0 0 0 0 0 0 0 0 1 1
                         0 1 2 3 4 5 6 7 8 9 0 1 Unknown   Count     Average
Host
pc1udatahad15.abacus-u...= = - + = - = = = =     0         4990      499
pc1udatahad12.abacus-u...= = = = = = = = = =     0         6000      600
pc1udatahad07.abacus-u...= = + = = - + - = =     0         3597      359
pc1udatahad16.abacus-u...0 0 0 0 0 0 0 0 0 0     19489     19489     0
pc1udatahad11.abacus-u...= = = = = = = = = =     0         6919      691
pc1udatahad09.abacus-u...0 0 0 0 0 0 0 0 0 0 0 0 9131      9131      0
pc1udatahad17.abacus-u...0 0 0 0 0 0 0 0 0 0     6529      6529      0
pc1udatahad10.abacus-u...= = = + = = = - - = = = 0         8337      694
pc1udatahad13.abacus-u...= = = = = = = = = =     0         5947      594
pc1udatahad14.abacus-u...= = = = = = = = = =     0         6424      642
pc1udatahad08.abacus-u...+ - + - - - = + = +     0         2637      263

Notice that nodes 16, 09, and 17 show zeros for every disk, and all of their data is counted in the "Unknown" column instead. Does anyone know why that is?
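
To pick out the affected hosts from a report like the one above, here is a minimal sketch. It assumes the last three whitespace-separated fields on each host row are the Unknown, Count, and Average columns, and that host names are truncated with "..." as shown. `hosts_with_unknown` is a hypothetical helper, not part of hdfs-metadata.

```python
def hosts_with_unknown(report: str) -> dict:
    """Return {host: unknown_count} for host rows whose Unknown column is nonzero.

    Assumption: each host row ends with three integers (Unknown, Count, Average).
    """
    affected = {}
    for line in report.splitlines():
        fields = line.split()
        # Skip blank lines, short lines, and the DiskId header row.
        if len(fields) < 4 or fields[0] == "DiskId":
            continue
        try:
            unknown, _count, _average = (int(f) for f in fields[-3:])
        except ValueError:
            continue  # column-title line or any other non-data line
        if unknown > 0:
            # The truncated host name is fused with the first disk symbol,
            # so keep only the part before the "..." marker.
            host = fields[0].partition("...")[0]
            affected[host] = unknown
    return affected


if __name__ == "__main__":
    sample = """\
pc1udatahad15.abacus-u...= = - + = - = = = =     0         4990      499
pc1udatahad16.abacus-u...0 0 0 0 0 0 0 0 0 0     19489     19489     0
"""
    for host, unknown in hosts_with_unknown(sample).items():
        print(host, unknown)
```

This only reads the printed report; it does not query HDFS, so it says which hosts are affected but not why.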

dlanza1 commented 6 years ago

Have you tried restarting them? Are they running the same version of HDFS?

Tagar commented 6 years ago

Yes, same version. We run the Cloudera distribution of Hadoop, deployed through parcels, so they're 100% the same version. This is part of a production cluster and we know the nodes work correctly, with no issues. We can't restart them easily; a restart has to be scheduled. Thanks.