Enable multiple path for hdfs storage and instructions

RADAR-base / RADAR-Docker

Integrated Docker Stack for the RADAR mHealth Streaming Platform Components

https://hub.docker.com/u/radarbase/dashboard/

Apache License 2.0


Enable multiple path for hdfs storage and instructions #189

Closed: yatharthranjan closed this 5 years ago

yatharthranjan commented 5 years ago

Allows extending the storage capacity of Hadoop.

blootsvoets commented 5 years ago

Do these additional directories actually count as replicas? For the name node, data is duplicated across the listed directories by simply writing all data twice. This increases the redundancy of the system, but I cannot see in the documentation that it increases the number of replicas. That would mean that using this technique, the amount of data that needs to be stored is doubled. In other projects we have just increased the number of data nodes to spread the data over more volumes.

yatharthranjan commented 5 years ago

These do not add replicas. This only increases the storage capacity of the existing data nodes (replicas). We just did this in production: our disks were full, so we added new disks and therefore had to add an additional storage path to the data nodes.

"That would mean that using this technique, the amount of data that needs to be stored is doubled"

I'm not sure I understand this, but this is just for adding a new volume as storage for the data nodes.

afolarin commented 5 years ago

I think the data nodes and name nodes treat volume addition differently.
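
To make the distinction discussed above concrete, here is a rough Python sketch of the two behaviours, not the actual Hadoop implementation. It assumes the settings in question are the standard HDFS properties `dfs.datanode.data.dir` (each block goes to one of the listed directories) and `dfs.namenode.name.dir` (metadata is mirrored into every listed directory); the thread does not name them explicitly. The paths and helper functions are hypothetical, and round-robin placement is used because it is the default volume choosing policy.

```python
from itertools import cycle


def datanode_write(blocks, volumes):
    """Model a data node: each block lands on exactly ONE volume (round-robin)."""
    placement = {volume: [] for volume in volumes}
    for block, volume in zip(blocks, cycle(volumes)):
        placement[volume].append(block)
    return placement


def namenode_write(metadata, directories):
    """Model a name node: the full metadata is mirrored into EVERY directory."""
    return {directory: list(metadata) for directory in directories}


def stored_units(layout):
    """Count how many items are physically stored across all directories."""
    return sum(len(items) for items in layout.values())


# Hypothetical mount points, for illustration only.
blocks = [f"blk_{i}" for i in range(6)]
dn = datanode_write(blocks, ["/data/disk1", "/data/disk2"])
nn = namenode_write(["fsimage", "edits"], ["/name/disk1", "/name/disk2"])

print(stored_units(dn))  # 6: same amount of data, spread over two disks
print(stored_units(nn))  # 4: two metadata files, each written twice
```

In this model, adding a data node directory spreads the same blocks over more disks without storing anything twice, while adding a name node directory duplicates the metadata. The block replication factor across data nodes is a separate setting and is not touched by either.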