helm / charts

⚠️(OBSOLETE) Curated applications for Kubernetes
Apache License 2.0

[stable/hadoop] datanode can't scale to larger than 2 #24525

Closed ankit302 closed 2 years ago

ankit302 commented 3 years ago

Describe the bug: when the datanode is scaled to 2 or more replicas, all replicas use the same PVC, so only one pod can lock the storage directory and the others crash.

Version of Helm and Kubernetes:

Which chart: hadoop

What happened: the second datanode pod fails at startup because the storage directory is already locked:

```
2021-09-20 10:50:50,601 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
2021-09-20 10:50:50,602 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 50020: starting
2021-09-20 10:50:50,879 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Acknowledging ACTIVE Namenode during handshakeBlock pool (Datanode Uuid unassigned) service to hadoop-cluster-hadoop-hdfs-nn/10.28.2.131:9000
2021-09-20 10:50:50,882 INFO org.apache.hadoop.hdfs.server.common.Storage: Using 1 threads to upgrade data directories (dfs.datanode.parallel.volumes.load.threads.num=1, dataDirs=1)
2021-09-20 10:50:50,887 ERROR org.apache.hadoop.hdfs.server.common.Storage: Unable to acquire file lock on path /root/hdfs/datanode/in_use.lock
2021-09-20 10:50:50,889 ERROR org.apache.hadoop.hdfs.server.common.Storage: It appears that another node 55@hadoop-cluster-hadoop-hdfs-dn-0.hadoop-cluster-hadoop-hdfs-dn.dataservice.svc.cluster.local has already locked the storage directory: /root/hdfs/datanode
java.nio.channels.OverlappingFileLockException
	at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.tryLock(Storage.java:789)
	at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.lock(Storage.java:754)
	at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.analyzeStorage(Storage.java:567)
	at org.apache.hadoop.hdfs.server.datanode.DataStorage.loadStorageDirectory(DataStorage.java:270)
	at org.apache.hadoop.hdfs.server.datanode.DataStorage.loadDataStorage(DataStorage.java:409)
	at org.apache.hadoop.hdfs.server.datanode.DataStorage.addStorageLocations(DataStorage.java:388)
	at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:556)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1649)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1610)
	at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:374)
	at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:280)
	at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:816)
	at java.lang.Thread.run(Thread.java:745)
2021-09-20 10:50:50,891 INFO org.apache.hadoop.hdfs.server.common.Storage: Cannot lock storage /root/hdfs/datanode. The directory is already locked
2021-09-20 10:50:50,891 WARN org.apache.hadoop.hdfs.server.common.Storage: Failed to add storage directory [DISK]file:/root/hdfs/datanode/
java.io.IOException: Cannot lock storage /root/hdfs/datanode. The directory is already locked
	at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.lock(Storage.java:759)
	at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.analyzeStorage(Storage.java:567)
	at org.apache.hadoop.hdfs.server.datanode.DataStorage.loadStorageDirectory(DataStorage.java:270)
	at org.apache.hadoop.hdfs.server.datanode.DataStorage.loadDataStorage(DataStorage.java:409)
	at org.apache.hadoop.hdfs.server.datanode.DataStorage.addStorageLocations(DataStorage.java:388)
	at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:556)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1649)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1610)
	at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:374)
	at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:280)
	at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:816)
	at java.lang.Thread.run(Thread.java:745)
2021-09-20 10:50:50,893 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for Block pool (Datanode Uuid unassigned) service to hadoop-cluster-hadoop-hdfs-nn/10.28.2.131:9000. Exiting.
java.io.IOException: All specified directories have failed to load.
	at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:557)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1649)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1610)
	at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:374)
	at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:280)
	at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:816)
	at java.lang.Thread.run(Thread.java:745)
2021-09-20 10:50:50,894 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Ending block pool service for: Block pool (Datanode Uuid unassigned) service to hadoop-cluster-hadoop-hdfs-nn/10.28.2.131:9000
2021-09-20 10:50:50,995 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Removed Block pool (Datanode Uuid unassigned)
2021-09-20 10:50:52,996 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Exiting Datanode
2021-09-20 10:50:52,999 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
```

What you expected to happen: the datanode should scale to 2 or more replicas, with every pod starting successfully.

How to reproduce it (as minimally and precisely as possible): edit `values.yaml` and set `dataNode.replicas: 2`, then install the chart.
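The lock error in the log suggests every datanode replica is mounting the same PersistentVolumeClaim, so the second pod cannot acquire `/root/hdfs/datanode/in_use.lock`. One common workaround (a sketch only — the field names and labels below are illustrative and not copied from the stable/hadoop chart's templates) is to have the StatefulSet provision one PVC per replica via `volumeClaimTemplates`:

```yaml
# Sketch: per-replica storage for an HDFS datanode StatefulSet.
# With volumeClaimTemplates, each ordinal gets its own PVC
# (e.g. dfs-hadoop-hdfs-dn-0, dfs-hadoop-hdfs-dn-1), so the pods
# no longer contend for the same in_use.lock file.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: hadoop-hdfs-dn
spec:
  serviceName: hadoop-hdfs-dn
  replicas: 2
  selector:
    matchLabels:
      app: hdfs-dn            # illustrative label
  template:
    metadata:
      labels:
        app: hdfs-dn
    spec:
      containers:
        - name: datanode
          image: hadoop:2.9.0  # placeholder image tag
          volumeMounts:
            - name: dfs
              mountPath: /root/hdfs/datanode
  volumeClaimTemplates:
    - metadata:
        name: dfs
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi     # size it for your data, illustrative value
```

By contrast, pointing all replicas at a single pre-created PVC (or the same hostPath) will always trigger the `OverlappingFileLockException` shown above, since HDFS deliberately locks its data directory against concurrent use.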

stale[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Any further update will cause the issue/pull request to no longer be considered stale. Thank you for your contributions.

bridgetkromhout commented 2 years ago

Hi @ankit302 - you're asking this question in a deprecated repo for helm charts, and we cannot provide support. You may find more info at https://stackoverflow.com/questions/tagged/hdfs.