dianping / cat

CAT 作为服务端项目基础组件,提供了 Java, C/C++, Node.js, Python, Go 等多语言客户端,已经在美团点评的基础架构中间件框架(MVC框架,RPC框架,数据库框架,缓存框架等,消息队列,配置系统等)深度集成,为美团点评各业务线提供系统丰富的性能指标、健康状况、实时告警等。
Apache License 2.0
18.69k stars 5.43k forks source link

CAT上传数据至集成Kerberos的HDFS运行24小时后报错 #2011

Open changjiasheng opened 4 years ago

changjiasheng commented 4 years ago

Describe the bug CAT 3.0上传数据至集成Kerberos的HDFS运行24小时后报错 重启 CAT 服务后上传成功 24小时后又报错,应该是证书失效后未刷新,求解决办法

1、JVM参数添加了-Djavax.security.auth.useSubjectCredsOnly=false 2、JDK是jdk1.8.0_131 替换了了jdk1.8.0_131/jre/lib/security下的local_policy.jar和US_export_policy.jar

` [06-29 09:10:55.073] [ERROR] [HdfsUploader$Uploader] Uploading file(/data/appdatas/cat/bucket/dump/20200629/07/cat-10.2.2X.X.dat) to HDFS(20200629/07/cat-10.2.2X.X.dat) failed! java.io.IOException: Failed on local exception: java.io.IOException: Couldn't setup connection for hdfs-riskcontrol@X.COM to /10.2.2X.X:8020; Host Details : local host is: "tk226-nw-e09-dx1-7/10.2.2X.X"; destination host is: "risklog-nn-01.cars.com":8020; at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:764) at org.apache.hadoop.ipc.Client.call(Client.java:1414) at org.apache.hadoop.ipc.Client.call(Client.java:1363) at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206) at com.sun.proxy.$Proxy25.create(Unknown Source) at sun.reflect.GeneratedMethodAccessor82.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:190) at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:103) at com.sun.proxy.$Proxy25.create(Unknown Source) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:258) at org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1600) at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1465) at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1390) at org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:394) at org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:390) at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:390) at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:334) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:906) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:887) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:784)

at org.unidal.cat.message.storage.clean.HdfsUploader.makeHdfsOutputStream(HdfsUploader.java:96) at org.unidal.cat.message.storage.clean.HdfsUploader.upload(HdfsUploader.java:117) at org.unidal.cat.message.storage.clean.HdfsUploader$Uploader.run(HdfsUploader.java:191) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:748) Caused by: java.io.IOException: Couldn't setup connection for hdfs-XXX@XXX.COM to /XXXX:8020 at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:669) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556) at org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:640) at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:724) at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:367) at org.apache.hadoop.ipc.Client.getConnection(Client.java:1462) at org.apache.hadoop.ipc.Client.call(Client.java:1381) ... 29 more Caused by: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Attempt to obtain new INITIATE credentials failed! (null))] at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211) at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:411) at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:550) at org.apache.hadoop.ipc.Client$Connection.access$1800(Client.java:367) at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:716) at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:712) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556) at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:711) ... 32 more Caused by: GSSException: No valid credentials provided (Mechanism level: Attempt to obtain new INITIATE credentials failed! (null)) at sun.security.jgss.krb5.Krb5InitCredential.getTgt(Krb5InitCredential.java:343) at sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:145) at sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:122)

at sun.security.jgss.krb5.Krb5MechFactory.getMechanismContext(Krb5MechFactory.java:187) at sun.security.jgss.GSSManagerImpl.getMechanismContext(GSSManagerImpl.java:224) at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:212) at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179) at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:192) ... 41 more Caused by: javax.security.auth.login.LoginException: Cannot read from System.in at com.sun.security.auth.module.Krb5LoginModule.promptForName(Krb5LoginModule.java:865) at com.sun.security.auth.module.Krb5LoginModule.attemptAuthentication(Krb5LoginModule.java:704) at com.sun.security.auth.module.Krb5LoginModule.login(Krb5LoginModule.java:617) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at javax.security.auth.login.LoginContext.invoke(LoginContext.java:755) at javax.security.auth.login.LoginContext.access$000(LoginContext.java:195) at javax.security.auth.login.LoginContext$4.run(LoginContext.java:682) at javax.security.auth.login.LoginContext$4.run(LoginContext.java:680) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.login.LoginContext.invokePriv(LoginContext.java:680) at javax.security.auth.login.LoginContext.login(LoginContext.java:587) at sun.security.jgss.GSSUtil.login(GSSUtil.java:258) at sun.security.jgss.krb5.Krb5Util.getTicket(Krb5Util.java:158) at sun.security.jgss.krb5.Krb5InitCredential$1.run(Krb5InitCredential.java:335) at sun.security.jgss.krb5.Krb5InitCredential$1.run(Krb5InitCredential.java:331) at java.security.AccessController.doPrivileged(Native Method) at sun.security.jgss.krb5.Krb5InitCredential.getTgt(Krb5InitCredential.java:330) ... 48 more `

bjhl-kyle commented 4 years ago

我也遇到这个问题,按照官网上添加配置和替换jdk文件的方式修改后,依然报错