Closed: yeyhuan closed this issue 2 months ago
Check which BE/CN reports the failure, and make sure the Kerberos ticket cache auth works on all the BE/CN nodes.
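One way to verify this on each node is sketched below. The keytab path and principal are placeholders, not values from this issue; substitute your cluster's own.

```shell
# Run on every BE/CN node to verify ticket-cache auth.
# KEYTAB and PRINCIPAL are placeholders; use your cluster's values.
KEYTAB=${KEYTAB:-/etc/krb5.keytab}
PRINCIPAL=${PRINCIPAL:-starrocks@EXAMPLE.COM}

klist                                  # show tickets in the current cache
kinit -kt "$KEYTAB" "$PRINCIPAL"       # re-obtain a TGT if the cache is stale
hdfs dfs -ls /user/starrocks           # confirm the ticket actually grants HDFS access
```

If `klist` shows an expired or missing ticket on any node, the CREATE task scheduled to that node will fail even though the others succeed.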
The logs indicate that the issue is occurring on nodes 15\37\42. Here are the exception logs and the results of running HDFS commands on nodes 15\37\42: [Insert the log details and command results here]
Error log:
2024-08-28 11:01:53.700+08:00 WARN (thrift-server-pool-20|237) [LeaderImpl.finishTask():194] finish task reports bad. request: TFinishTaskRequest(backend:TBackend(host:10.235.15.37, be_port:9060, http_port:8040), task_type:CREATE, signature:15449, task_status:TStatus(status_code:RUNTIME_ERROR, error_msgs:[Internal error: starlet err Create hdfs root dir '/user/starrocks/908979b6-5632-4763-a40c-e9fa58bc9122/db10197/15447/15446' error: 权限不足: Permission denied]), report_version:17240412940000)
2024-08-28 11:01:53.701+08:00 WARN (thrift-server-pool-20|237) [LeaderImpl.finishTask():242] task type: CREATE, status_code: RUNTIME_ERROR, Internal error: starlet err Create hdfs root dir '/user/starrocks/908979b6-5632-4763-a40c-e9fa58bc9122/db10197/15447/15446' error: 权限不足: Permission denied, backendId: 10005, signature: 15449
2024-08-28 11:01:53.701+08:00 WARN (starrocks-mysql-nio-pool-27|1895429) [LocalMetastore.waitForFinished():2131] fail to create tablet: 10005: [Internal error: starlet err Create hdfs root dir '/user/starrocks/908979b6-5632-4763-a40c-e9fa58bc9122/db10197/15447/15446' error: 权限不足: Permission denied]
2024-08-28 11:01:53.701+08:00 WARN (thrift-server-pool-5|218) [LeaderImpl.finishTask():194] finish task reports bad. request: TFinishTaskRequest(backend:TBackend(host:10.235.15.42, be_port:9060, http_port:8040), task_type:CREATE, signature:15453, task_status:TStatus(status_code:RUNTIME_ERROR, error_msgs:[Internal error: starlet err Create hdfs root dir '/user/starrocks/908979b6-5632-4763-a40c-e9fa58bc9122/db10197/15447/15446' error: 权限不足: Permission denied]), report_version:17240412930000)
2024-08-28 11:01:53.701+08:00 WARN (thrift-server-pool-10|227) [LeaderImpl.finishTask():194] finish task reports bad. request: TFinishTaskRequest(backend:TBackend(host:10.235.15.15, be_port:9060, http_port:8040), task_type:CREATE, signature:15451, task_status:TStatus(status_code:RUNTIME_ERROR, error_msgs:[Internal error: starlet err Create hdfs root dir '/user/starrocks/908979b6-5632-4763-a40c-e9fa58bc9122/db10197/15447/15446' error: 权限不足: Permission denied]), report_version:17240412930000)
2024-08-28 11:01:53.701+08:00 WARN (thrift-server-pool-5|218) [LeaderImpl.finishTask():228] cannot find task. type: CREATE, backendId: 10008, signature: 15453
2024-08-28 11:01:53.701+08:00 WARN (thrift-server-pool-10|227) [LeaderImpl.finishTask():228] cannot find task. type: CREATE, backendId: 10006, signature: 15451
2024-08-28 11:01:53.701+08:00 WARN (starrocks-mysql-nio-pool-27|1895429) [StmtExecutor.handleDdlStmt():1682] DDL statement (CREATE TABLE IF NOT EXISTS hdfs_test (
etl_flag TINYINT NOT NULL,
douyin_no varchar(100) NULL DEFAULT '',
start_time bigint NOT NULL DEFAULT '0'
)
PROPERTIES (
"storage_volume" = "hdfs_storage_volume",
"datacache.enable" = "true",
"datacache.partition_duration" = "1 MONTH",
"enable_async_write_back" = "false"
)) process failed.
com.starrocks.common.DdlException: fail to create tablet: 10005: [Internal error: starlet err Create hdfs root dir '/user/starrocks/908979b6-5632-4763-a40c-e9fa58bc9122/db10197/15447/15446' error: 权限不足: Permission denied]
at com.starrocks.server.LocalMetastore.waitForFinished(LocalMetastore.java:2132)
at com.starrocks.server.LocalMetastore.sendCreateReplicaTasksAndWaitForFinished(LocalMetastore.java:2103)
at com.starrocks.server.LocalMetastore.buildPartitionsSequentially(LocalMetastore.java:1934)
at com.starrocks.server.LocalMetastore.buildPartitions(LocalMetastore.java:1902)
at com.starrocks.server.OlapTableFactory.createTable(OlapTableFactory.java:605)
at com.starrocks.server.LocalMetastore.createTable(LocalMetastore.java:843)
at com.starrocks.server.MetadataMgr.createTable(MetadataMgr.java:271)
at com.starrocks.qe.DDLStmtExecutor$StmtExecutorVisitor.lambda$visitCreateTableStatement$4(DDLStmtExecutor.java:250)
at com.starrocks.common.ErrorReport.wrapWithRuntimeException(ErrorReport.java:108)
at com.starrocks.qe.DDLStmtExecutor$StmtExecutorVisitor.visitCreateTableStatement(DDLStmtExecutor.java:249)
at com.starrocks.qe.DDLStmtExecutor$StmtExecutorVisitor.visitCreateTableStatement(DDLStmtExecutor.java:159)
at com.starrocks.sql.ast.CreateTableStmt.accept(CreateTableStmt.java:308)
at com.starrocks.qe.DDLStmtExecutor.execute(DDLStmtExecutor.java:145)
at com.starrocks.qe.StmtExecutor.handleDdlStmt(StmtExecutor.java:1656)
at com.starrocks.qe.StmtExecutor.execute(StmtExecutor.java:680)
at com.starrocks.qe.ConnectProcessor.handleQuery(ConnectProcessor.java:345)
at com.starrocks.qe.ConnectProcessor.dispatch(ConnectProcessor.java:539)
at com.starrocks.qe.ConnectProcessor.processOnce(ConnectProcessor.java:846)
at com.starrocks.mysql.nio.ReadListener.lambda$handleEvent$0(ReadListener.java:69)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
I have verified that HDFS commands can be executed successfully on all BE/CN nodes, and confirmed that Kerberos ticket cache authentication is working as expected on all of them. Let me know if you need any further information or assistance.
What does "node 15\37\42" refer to? FE nodes?
They are the IP addresses host:10.235.15.37, host:10.235.15.42, and host:10.235.15.15. These are the CN nodes.
Try checking cn.out (or jni.log) under the cn/log/ directory on these nodes for detailed info related to this permission error; most likely there is a Java call stack there.
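A quick way to scan those logs is sketched below. The `CN_LOG_DIR` default is an assumption based on a typical deployment layout; point it at your actual cn/log directory.

```shell
# Search the CN logs for permission/Kerberos-related errors.
# CN_LOG_DIR is an assumed default; override it for your deployment.
CN_LOG_DIR=${CN_LOG_DIR:-/opt/starrocks/cn/log}

# -i: case-insensitive, -n: line numbers, -E: extended regex.
# "|| true" keeps the exit code clean when nothing matches.
grep -inE "permission denied|AccessControlException|GSSException" \
    "$CN_LOG_DIR"/cn.out "$CN_LOG_DIR"/jni.log 2>/dev/null || true
```

`AccessControlException` and `GSSException` are the usual Hadoop-side signatures of HDFS permission and Kerberos failures, respectively.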
How many CN nodes are there? Do all CN nodes fail, or just these 3?
All CN nodes fail.
cn.out has the following exception information:
hdfsOpenFile(/emr-6y2ejh20/908979b6-5632-4763-a40c-e9fa58bc9122/db10042/10057/10056/meta/000000000000274D_0000000000000B2B.meta): FileSystem#open((Lorg/apache/hadoop/fs/Path;I)Lorg/apache/hadoop/fs/FSDataInputStream;) error:
FileNotFoundException: No such file or directory '/emr-6y2ejh20/908979b6-5632-4763-a40c-e9fa58bc9122/db10042/10057/10056/meta/000000000000274D_0000000000000B2B.meta'java.io.FileNotFoundException: No such file or directory '/emr-6y2ejh20/908979b6-5632-4763-a40c-e9fa58bc9122/db10042/10057/10056/meta/000000000000274D_0000000000000B2B.meta'
at org.apache.hadoop.fs.CosNFileSystem.getFileStatus(CosNFileSystem.java:617)
at org.apache.hadoop.fs.CosNFileSystem.open(CosNFileSystem.java:838)
at org.apache.hadoop.fs.CosFileSystem.open(CosFileSystem.java:268)
at com.qcloud.emr.fs.TemrfsHadoopFileSystemAdapter.open(TemrfsHadoopFileSystemAdapter.java:251)
hdfsOpenFile(/emr-6y2ejh20/908979b6-5632-4763-a40c-e9fa58bc9122/db10042/10057/10056/meta/0000000000002752_0000000000000B2B.meta): FileSystem#open((Lorg/apache/hadoop/fs/Path;I)Lorg/apache/hadoop/fs/FSDataInputStream;) error:
FileNotFoundException: No such file or directory '/emr-6y2ejh20/908979b6-5632-4763-a40c-e9fa58bc9122/db10042/10057/10056/meta/0000000000002752_0000000000000B2B.meta'java.io.FileNotFoundException: No such file or directory '/emr-6y2ejh20/908979b6-5632-4763-a40c-e9fa58bc9122/db10042/10057/10056/meta/0000000000002752_0000000000000B2B.meta'
at org.apache.hadoop.fs.CosNFileSystem.getFileStatus(CosNFileSystem.java:617)
at org.apache.hadoop.fs.CosNFileSystem.open(CosNFileSystem.java:838)
at org.apache.hadoop.fs.CosFileSystem.open(CosFileSystem.java:268)
at com.qcloud.emr.fs.TemrfsHadoopFileSystemAdapter.open(TemrfsHadoopFileSystemAdapter.java:251)
hdfsOpenFile(/emr-6y2ejh20/908979b6-5632-4763-a40c-e9fa58bc9122/db10009/10012/12845/meta/000000000000322F_0000000000000ADC.meta): FileSystem#open((Lorg/apache/hadoop/fs/Path;I)Lorg/apache/hadoop/fs/FSDataInputStream;) error:
FileNotFoundException: No such file or directory '/emr-6y2ejh20/908979b6-5632-4763-a40c-e9fa58bc9122/db10009/10012/12845/meta/000000000000322F_0000000000000ADC.meta'java.io.FileNotFoundException: No such file or directory '/emr-6y2ejh20/908979b6-5632-4763-a40c-e9fa58bc9122/db10009/10012/12845/meta/000000000000322F_0000000000000ADC.meta'
at org.apache.hadoop.fs.CosNFileSystem.getFileStatus(CosNFileSystem.java:617)
at org.apache.hadoop.fs.CosNFileSystem.open(CosNFileSystem.java:838)
at org.apache.hadoop.fs.CosFileSystem.open(CosFileSystem.java:268)
at com.qcloud.emr.fs.TemrfsHadoopFileSystemAdapter.open(TemrfsHadoopFileSystemAdapter.java:251)
These are not related. Check if there is any permission-denied-related error.
No logs related to permissions were found in the cn.out file.
cn.INFO has some.
Do you have additional Hadoop core-site.xml/hdfs-site.xml configuration under cn/conf?
Already solved. The cause was the missing Hadoop core-site.xml/hdfs-site.xml under cn/conf.
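For reference, the sketch below shows the kind of Kerberos-related settings that must be visible to the CN process via core-site.xml. The values are placeholders; in practice, copy the actual core-site.xml/hdfs-site.xml from the Hadoop cluster into cn/conf rather than hand-writing them.

```xml
<!-- Minimal core-site.xml sketch; the NameNode address is a placeholder. -->
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://namenode.example.com:8020</value>
  </property>
  <property>
    <name>hadoop.security.authentication</name>
    <value>kerberos</value>
  </property>
  <property>
    <name>hadoop.security.authorization</name>
    <value>true</value>
  </property>
</configuration>
```

Without these files, the Hadoop client in the CN process falls back to simple authentication, which the Kerberized NameNode rejects as a permission error even though the ticket cache itself is valid.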
Steps to reproduce the behavior (Required)
Expected behavior (Required)
CREATE TABLE should pass Kerberos authentication.
Real behavior (Required)
See the error log above.
StarRocks version (Required)
Linux Kerberos authentication is possible