yatharthranjan closed 5 years ago
Do these additional directories actually count as replicas? For the name node, data is duplicated across the listed directories by simply writing all data twice. This increases the redundancy of the system, but I cannot see in the documentation that it increases the number of replicas. That would mean that using this technique, the amount of data that needs to be stored is doubled. In other projects we have just increased the number of data nodes to spread the data over more volumes.
These will not mean more replicas. This only increases the storage capacity of the existing data nodes (replicas). We just did this in production: our disks were full, so we added new disks and therefore had to add an additional storage path to the data nodes.
"That would mean that using this technique, the amount of data that needs to be stored is doubled."
I'm not sure I understand this, but this is just for adding a new volume as storage for the data nodes.
I think the data nodes and name nodes treat volume addition differently.
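To make the difference concrete, here is a minimal toy model in Python, not Hadoop code. It assumes the behavior described in the HDFS configuration docs: a comma-separated `dfs.namenode.name.dir` mirrors the name table into every listed directory, while a comma-separated `dfs.datanode.data.dir` spreads block replicas across the listed directories. The function names and the plain round-robin choice are illustrative assumptions.

```python
from itertools import cycle


def namenode_write(name_dirs, record):
    """Mirror: every configured name directory receives a full copy."""
    for directory in name_dirs:
        directory.append(record)


def datanode_write(data_dirs, blocks):
    """Spread: each block replica is stored in exactly one data directory.

    Round-robin selection is an assumption made for this sketch.
    """
    chooser = cycle(range(len(data_dirs)))
    for block in blocks:
        data_dirs[next(chooser)].append(block)


# Two directories configured for each role (hypothetical setup).
name_dirs = [[], []]
data_dirs = [[], []]

namenode_write(name_dirs, "edit-1")
datanode_write(data_dirs, ["b1", "b2", "b3", "b4"])

# Name node: one record, stored twice (redundancy, doubled storage).
name_copies = sum(len(d) for d in name_dirs)
# Data node: four blocks, stored four times in total (capacity, no duplication).
data_copies = sum(len(d) for d in data_dirs)
```

In this model, adding a second data directory leaves the total number of stored blocks unchanged and only changes where they land, which matches the point that the extra paths add capacity rather than replicas.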
Allows for extending the capacity of Hadoop.