Atahualkpa opened 6 years ago
@Atahualkpa Hi!
Which docker-compose are you using? What is your setup? Do you persist the data from your docker containers to the local drive, e.g. by having a volumes key?
services:
  namenode:
    volumes:
      - /path/to/the/folder:/hadoop/dfs/name
  datanode:
    volumes:
      - /path/to/the/folder:/hadoop/dfs/data
Hi @earthquakesan, thanks for your answer. I have this setup for docker-compose:
version: '3'
services:
  namenode:
    image: bde2020/hadoop-namenode:2.0.0-hadoop2.7.4-java8
    networks:
      - workbench
    volumes:
      - namenode:/hadoop/dfs/name
      - /data0/reference/hg19-ucsc/:/reference/hg19-ucsc/
      - /data0/output/:/output/
      - /data/ngs/:/ngs/
    environment:
      - CLUSTER_NAME=test
    env_file:
      - ./hadoop.env
    deploy:
      mode: replicated
      replicas: 1
      restart_policy:
        condition: on-failure
      labels:
        traefik.docker.network: workbench
        traefik.port: "50070"
    ports:
      - 8334:50070
  datanode:
    image: bde2020/hadoop-datanode:2.0.0-hadoop2.7.4-java8
    networks:
      - workbench
    volumes:
      - datanode:/hadoop/dfs/data
    environment:
      SERVICE_PRECONDITION: "namenode:50070"
    env_file:
      - ./hadoop.env
    deploy:
      mode: global
      restart_policy:
        condition: on-failure
      labels:
        traefik.docker.network: workbench
        traefik.port: "50075"
volumes:
  datanode:
  namenode:
networks:
  workbench:
    external: true
But I notice I have not set a host path for HDFS. I tried to set a local path, but the problem is still present. I checked the path and found a file called VERSION inside a directory named current. This is written in the file:
storageID=DS-6e863e5f-34a1-4d09-bcf2-58f6badc7dba
clusterID=CID-4a2c4782-785b-4b8c-be8f-e0d7cef85b24
cTime=0
datanodeUuid=48dc924c-fea1-40d8-9da2-7faeb2ee28b9
storageType=DATA_NODE
layoutVersion=-56
Also, checking the directory, I found the folder BP-1651631011-10.0.0.12-1527073017748/current; in this folder there is another file called VERSION, but this one contains:
namespaceID=1025220048
cTime=0
blockpoolID=BP-1651631011-10.0.0.12-1527073017748
layoutVersion=-56
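The clusterID in that file has to match the one the namenode recorded when it was formatted. A quick way to compare the two (a sketch; looking the containers up by service name is an assumption, and the paths come from the compose file above):

# print the clusterID recorded by each role; the two values must agree
docker exec $(docker ps -qf name=namenode) grep clusterID /hadoop/dfs/name/current/VERSION
docker exec $(docker ps -qf name=datanode) grep clusterID /hadoop/dfs/data/current/VERSION

If they differ, the datanode refuses to register with the namenode, which is exactly the exception below.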
This is the exception generated:
namenode clusterID = CID-37f14517-46c8-430a-803d-5fe2b0d047fc; datanode clusterID = CID-4a2c4782-785b-4b8c-be8f-e0d7cef85b24
    at org.apache.hadoop.hdfs.server.datanode.DataStorage.doTransition(DataStorage.java:777)
    at org.apache.hadoop.hdfs.server.datanode.DataStorage.loadStorageDirectory(DataStorage.java:300)
    at org.apache.hadoop.hdfs.server.datanode.DataStorage.loadDataStorage(DataStorage.java:416)
    at org.apache.hadoop.hdfs.server.datanode.DataStorage.addStorageLocations(DataStorage.java:395)
    at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:573)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1386)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1351)
    at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:313)
    at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:216)
    at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:637)
    at java.lang.Thread.run(Thread.java:748)
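A commonly suggested remedy for this mismatch, instead of wiping the datanode volume, is to make the datanode adopt the namenode's clusterID and then restart it (a sketch; the container lookup and paths are assumptions, and the CID value is the namenode clusterID from the exception above):

# overwrite the stale clusterID in the datanode's VERSION file in place
docker exec $(docker ps -qf name=datanode) sed -i \
  's/clusterID=.*/clusterID=CID-37f14517-46c8-430a-803d-5fe2b0d047fc/' \
  /hadoop/dfs/data/current/VERSION

After that the datanode should register again without losing the blocks stored in its volume.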
Thanks for your support.
@Atahualkpa How many nodes do you have in your swarm cluster? Are the containers always allocated on the same nodes?
Now I have three nodes in the swarm. On the leader, 6 containers are running; they are:
and on the others are running:
Are the containers always allocated on the same nodes? Yes, but I must start the leader first, because it holds the files I put into HDFS; if I join the other nodes first, the Spark master and the namenode are selected at random.
Moreover, any time I deploy the swarm, this hadoop_volume is present on every node of the swarm.
Thanks.
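Since the named volumes here are local to each node, a namenode that gets rescheduled onto a different node formats a fresh, empty namespace with a new clusterID, while the datanode volumes still carry the old one. One way to make the placement deterministic is a swarm placement constraint (a sketch; the hostname value is an assumption and should be the node that owns the namenode volume):

services:
  namenode:
    deploy:
      placement:
        constraints:
          # pin the namenode to the node whose local volume holds dfs/name
          - node.hostname == swarm-leader

With the datanode running in global mode, each node keeps its own datanode volume, so only the namenode needs pinning.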
incompatible clusterID Hadoop
Hi, any time I reboot the swarm I have this problem.
I solved this problem by deleting this docker volume:
[ { "CreatedAt": "2018-05-10T19:35:31Z", "Driver": "local", "Labels": { "com.docker.stack.namespace": "hadoop" }, "Mountpoint": "/data0/docker_var/volumes/hadoop_datanode/_data", "Name": "hadoop_datanode", "Options": {}, "Scope": "local" } ]
But this volume contains the files I put into HDFS, so this way I have to put the files into HDFS again every time I deploy the swarm. I'm not sure this is the right way to solve the problem. Googling, I found one solution, but I don't know how to apply it before the swarm reboots. This is the solution:

The problem is with the property name dfs.datanode.data.dir; it is misspelt as dfs.dataode.data.dir. This invalidates the property from being recognised and, as a result, the default location of ${hadoop.tmp.dir}/hadoop-${USER}/dfs/data is used as the data directory. hadoop.tmp.dir is /tmp by default, so on every reboot the contents of this directory are deleted, which forces the datanode to recreate the folder on startup, and thus: incompatible clusterIDs. Edit this property name in hdfs-site.xml before formatting the namenode and starting the services.

Thanks.
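For the bde2020 images, hdfs-site.xml is generated at container start from hadoop.env, so the equivalent of that fix is checking the property spelling there. A minimal sketch, assuming the entrypoint's usual HDFS_CONF_* naming convention (underscores in the variable name map to dots in the property name):

# hadoop.env — a misspelt variable here is silently ignored, and HDFS falls
# back to the /tmp-based default data directory described above
HDFS_CONF_dfs_namenode_name_dir=file:///hadoop/dfs/name
HDFS_CONF_dfs_datanode_data_dir=file:///hadoop/dfs/data

As long as dfs.datanode.data.dir points at the mounted volume (/hadoop/dfs/data in the compose file above), a reboot no longer wipes the blocks, so the clusterID mismatch should not recur for that reason.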