This feature runs a job on the cluster over the specified tags to find the unreplicated size of their data on the filesystem, meaning each piece of data is counted once rather than once per replica. The result may not be 100% accurate if a replica is missing, but the job tries to avoid downloading data from another node, since it would then be unable to find the size locally.
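To make "unreplicated size" concrete, here is a minimal Python sketch of the arithmetic only. It is not how the job itself is implemented: the `blobs` structure and the `unreplicated_size` name are hypothetical, chosen purely for illustration. Each blob is counted once no matter how many replicas it has, and a blob with no sizable replica is reported separately, which is why the total can come out low.

```python
def unreplicated_size(blobs):
    """Sum one size per blob and count blobs whose size is unknown.

    `blobs` is a hypothetical list in which each item is the list of sizes
    reported for that blob's replicas (None where a replica is missing).
    """
    total, unknown = 0, 0
    for replica_sizes in blobs:
        known = [s for s in replica_sizes if s is not None]
        if known:
            total += known[0]  # count the blob once, not once per replica
        else:
            unknown += 1       # no replica could be sized; total is a lower bound
    return total, unknown


# Three blobs: fully replicated, one replica missing, all replicas missing.
blobs = [[100, 100, 100], [250, None], [None, None]]
print(unreplicated_size(blobs))
```

In this sketch the third blob contributes nothing to the total, so the reported size understates the real one.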
Usage: ddfs du [-H/-P/-n] tag
For larger tags, increase the number of partitions and the number of cores available to the job (-P and -n respectively).
For human-readable output (or if you just hate doing math), use -H; the output will look similar to the following:
$ ddfs du chekov -H
chekov: 7.82 MB
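For reference, the kind of conversion -H saves you from doing by hand can be sketched in Python as below. This is an illustration under stated assumptions, not the tool's actual code: the `human_size` helper is hypothetical, and the choice of 1024 as the unit base and two decimal places is an assumption made to match the look of the sample output.

```python
def human_size(num_bytes, base=1024):
    """Format a byte count with a unit suffix (hypothetical helper).

    Assumes 1024-based units and two decimal places; the real tool may differ.
    """
    units = ["B", "KB", "MB", "GB", "TB", "PB"]
    size = float(num_bytes)
    for unit in units[:-1]:
        if size < base:
            return "%.2f %s" % (size, unit)
        size /= base
    # Anything larger than the last step is reported in the largest unit.
    return "%.2f %s" % (size, units[-1])


# 8200000 is an arbitrary byte count picked for the example.
print("chekov: " + human_size(8200000))
```

Under these assumptions, a raw count of 8200000 bytes is printed as 7.82 MB.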
Running this job puts extra load on the cluster, and large tags may take a while to return a result.