Intel-bigdata / SSM

Smart Storage Management for Big Data, a comprehensive solution optimized for hot/cold data
Apache License 2.0

Fix append action failure #1801

Open · littlezhou opened this issue 6 years ago

littlezhou commented 6 years ago

append -length 100 -file /src/t2

Log

Action starts at Wed Jun 06 10:40:54 CST 2018 : Read /src/t2
Append to /src/t2
java.io.IOException: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try. (Nodes: current=[DatanodeInfoWithStorage[10.239.12.140:50010,DS-d2efbea0-5fb7-46fa-8b16-6c54364fe67c,DISK]], original=[DatanodeInfoWithStorage[10.239.12.140:50010,DS-d2efbea0-5fb7-46fa-8b16-6c54364fe67c,DISK]]). The current failed datanode replacement policy is DEFAULT, and a client may configure this via 'dfs.client.block.write.replace-datanode-on-failure.policy' in its configuration.
	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:925)
	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:988)
	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1156)
	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:454)
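The exception says the append pipeline lost a datanode and the client could not find a replacement, and it names the knob that controls this behavior. A minimal client-side sketch of the commonly used workaround for small clusters, assuming there is genuinely no spare datanode to swap in (the path /src/t2 and the 100-byte write mirror the action above; the class name and setup are illustrative, not SSM code):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class AppendWithRelaxedPolicy {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // With too few live datanodes, pipeline recovery cannot find a
        // replacement node; NEVER tells the client to keep writing on the
        // remaining nodes instead of failing the append.
        conf.set("dfs.client.block.write.replace-datanode-on-failure.policy",
                 "NEVER");

        try (FileSystem fs = FileSystem.get(conf);
             FSDataOutputStream out = fs.append(new Path("/src/t2"))) {
            out.write(new byte[100]); // append 100 bytes, as in the action
        }
    }
}
```

Note that NEVER trades durability for availability; on clusters with enough datanodes the DEFAULT policy should stay in place.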

qiyuangong commented 5 years ago

Will check details in a local env. I guess this issue is caused by the replication factor being greater than the number of live datanodes, e.g., replication=3 while only 2 datanodes are alive.
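If that is the cause, it is easy to confirm from a client before appending. A hedged sketch (the class name and the check are illustrative, not part of SSM) that compares the file's replication factor with the number of live datanodes reported by the namenode:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.hdfs.protocol.DatanodeInfo;
import org.apache.hadoop.hdfs.protocol.HdfsConstants;

public class ReplicationVsLiveDatanodes {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Path file = new Path("/src/t2");
        try (FileSystem fs = FileSystem.get(conf)) {
            short replication = fs.getFileStatus(file).getReplication();
            DatanodeInfo[] live = ((DistributedFileSystem) fs)
                    .getDataNodeStats(HdfsConstants.DatanodeReportType.LIVE);
            System.out.printf("replication=%d, live datanodes=%d%n",
                    replication, live.length);
            // If replication exceeds the live node count, pipeline recovery
            // can never find a replacement datanode and the append fails.
        }
    }
}
```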