apache / linkis

Apache Linkis builds a computation middleware layer to facilitate connection, governance and orchestration between the upper applications and the underlying data engines.
https://linkis.apache.org/
Apache License 2.0

[Question] linkis-cg-linkismanager fails to start during the 1.5.0 deployment #5068

Closed. tigerHM closed this issue 5 months ago

tigerHM commented 5 months ago

Before asking

Your environment

Describe your questions

LINKIS-CG-ENTRANCE, LINKIS-MG-EUREKA, LINKIS-MG-GATEWAY, and LINKIS-PS-PUBLICSERVICE all start normally, but linkis-cg-linkismanager fails to start.

The linkis-cg-linkismanager.log shows the following:

2024-01-12 11:06:02.109 [INFO ] [main ] o.a.l.DataWorkCloudApplication (61) [logStarted] [JobId-] - Started DataWorkCloudApplication in 17.221 seconds (JVM running for 21.334)
2024-01-12 11:06:04.654 [INFO ] [Linkis-Default-Scheduler-Thread-1 ] o.a.l.e.s.s.DefaultEngineConnResourceService (126) [run] [JobId-] - Try to initialize sparkEngineConn-3.2.1.
2024-01-12 11:06:04.675 [INFO ] [Linkis-Default-Scheduler-Thread-1 ] o.a.l.e.s.s.DefaultEngineConnResourceService (225) [refresh] [JobId-] - Ready to upload a new bmlResource for sparkEngineConn-3.2.1. path: conf.zip
2024-01-12 11:06:08.811 [INFO ] [Linkis-Default-Scheduler-Thread-1 ] o.a.l.h.d.DWSHttpClient (150) [addAttempt$1] [JobId-] - invoke http://xxx:9001/api/rest_j/v1/bml/upload get status 400 taken: 4.0 s.
2024-01-12 11:06:09.083 [ERROR] [Linkis-Default-Scheduler-Thread-1 ] o.a.l.e.s.s.DefaultEngineConnResourceService (135) [run] [JobId-] - Failed to upload engine conn to bml, now exit!
org.apache.linkis.httpclient.exception.HttpClientResultException: errCode: 10905 ,desc: URL /api/rest_j/v1/bml/upload request failed! ResponseBody is {"method":null,"status":1,"message":"error code(错误码): 60050, error message(错误信息): The first upload of the resource failed(首次上传资源失败).","data":{"errorMsg":{"serviceKind":"linkis-ps-publicservice","port":9105,"level":2,"errCode":50073,"ip":"xxxx","desc":"The commit upload resource task failed(提交上传资源任务失败):errCode: 60050 ,desc: The first upload of the resource failed(首次上传资源失败) ,ip: xxx ,port: 9105 ,serviceKind: linkis-ps-publicservice"}}}. errCode: 10905 ,desc: URL /api/rest_j/v1/bml/upload request failed! ResponseBody is {"method":null,"status":1,"message":"error code(错误码): 60050, error message(错误信息): The first upload of the resource failed(首次上传资源失败).","data":{"errorMsg":{"serviceKind":"linkis-ps-publicservice","port":9105,"level":2,"errCode":50073,"ip":"xxx","desc":"The commit upload resource task failed(提交上传资源任务失败):errCode: 60050 ,desc: The first upload of the resource failed(首次上传资源失败) ,ip: xxx ,port: 9105 ,serviceKind: linkis-ps-publicservice"}}}. ,ip: xxx ,port: 9101 ,serviceKind: linkis-cg-linkismanager ,ip: xxx ,port: 9101 ,serviceKind: linkis-cg-linkismanager
    at org.apache.linkis.httpclient.dws.response.DWSResult.$anonfun$set$2(DWSResult.scala:86) ~[linkis-gateway-httpclient-support-1.5.0.jar:1.5.0]
    at org.apache.linkis.httpclient.dws.response.DWSResult.$anonfun$set$2$adapted(DWSResult.scala:84) ~[linkis-gateway-httpclient-support-1.5.0.jar:1.5.0]
    at org.apache.linkis.common.utils.Utils$.tryCatch(Utils.scala:69) ~[linkis-common-1.5.0.jar:1.5.0]
    at org.apache.linkis.httpclient.dws.response.DWSResult.set(DWSResult.scala:84) ~[linkis-gateway-httpclient-support-1.5.0.jar:1.5.0]
    at org.apache.linkis.httpclient.dws.response.DWSResult.set$(DWSResult.scala:57) ~[linkis-gateway-httpclient-support-1.5.0.jar:1.5.0]
    at org.apache.linkis.bml.response.BmlResult.set(BmlResult.scala:26) ~[linkis-pes-client-1.5.0.jar:1.5.0]
    at org.apache.linkis.httpclient.dws.DWSHttpClient.$anonfun$httpResponseToResult$1(DWSHttpClient.scala:83) ~[linkis-gateway-httpclient-support-1.5.0.jar:1.5.0]
    at scala.Option.map(Option.scala:230) ~[scala-library-2.12.17.jar:?]
    at org.apache.linkis.httpclient.dws.DWSHttpClient.httpResponseToResult(DWSHttpClient.scala:79) ~[linkis-gateway-httpclient-support-1.5.0.jar:1.5.0]
    at org.apache.linkis.httpclient.AbstractHttpClient.$anonfun$responseToResult$1(AbstractHttpClient.scala:546) ~[linkis-httpclient-1.5.0.jar:1.5.0]
    at org.apache.linkis.common.utils.Utils$.tryFinally(Utils.scala:77) ~[linkis-common-1.5.0.jar:1.5.0]
    at org.apache.linkis.httpclient.AbstractHttpClient.responseToResult(AbstractHttpClient.scala:559) ~[linkis-httpclient-1.5.0.jar:1.5.0]
    at org.apache.linkis.httpclient.AbstractHttpClient.execute(AbstractHttpClient.scala:183) ~[linkis-httpclient-1.5.0.jar:1.5.0]
    at org.apache.linkis.httpclient.AbstractHttpClient.execute(AbstractHttpClient.scala:128) ~[linkis-httpclient-1.5.0.jar:1.5.0]
    at org.apache.linkis.bml.client.impl.HttpBmlClient.uploadResource(HttpBmlClient.scala:412) ~[linkis-pes-client-1.5.0.jar:1.5.0]
    at org.apache.linkis.engineplugin.server.service.DefaultEngineConnResourceService.uploadToBml(DefaultEngineConnResourceService.java:82) ~[linkis-application-manager-1.5.0.jar:1.5.0]
    at org.apache.linkis.engineplugin.server.service.DefaultEngineConnResourceService.refresh(DefaultEngineConnResourceService.java:230) ~[linkis-application-manager-1.5.0.jar:1.5.0]
    at org.apache.linkis.engineplugin.server.service.DefaultEngineConnResourceService.access$300(DefaultEngineConnResourceService.java:60) ~[linkis-application-manager-1.5.0.jar:1.5.0]
    at org.apache.linkis.engineplugin.server.service.DefaultEngineConnResourceService$1.run(DefaultEngineConnResourceService.java:128) ~[linkis-application-manager-1.5.0.jar:1.5.0]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[?:1.8.0_321]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_321]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) ~[?:1.8.0_321]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) ~[?:1.8.0_321]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_321]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_321]
    at java.lang.Thread.run(Thread.java:750) ~[?:1.8.0_321]

2024-01-12 11:06:09.106 [INFO ] [SpringContextShutdownHook ] o.s.c.n.e.s.EurekaServiceRegistry (65) [deregister] [JobId-] - Unregistering application LINKIS-CG-LINKISMANAGER with eureka with status DOWN
2024-01-12 11:06:09.107 [INFO ] [SpringContextShutdownHook ] c.n.d.DiscoveryClient (1352) [notify] [JobId-] - Saw local status change event StatusChangeEvent [timestamp=1705028769107, current=DOWN, previous=UP]
2024-01-12 11:06:09.109 [INFO ] [DiscoveryClient-InstanceInfoReplicator-0] c.n.d.DiscoveryClient (873) [register] [JobId-] - DiscoveryClient_LINKIS-CG-LINKISMANAGER/xxx:linkis-cg-linkismanager:9101: registering service...
2024-01-12 11:06:09.126 [INFO ] [DiscoveryClient-InstanceInfoReplicator-0] c.n.d.DiscoveryClient (882) [register] [JobId-] - DiscoveryClient_LINKIS-CG-LINKISMANAGER/xxx:linkis-cg-linkismanager:9101 - registration status: 204
2024-01-12 11:06:09.138 [INFO ] [SpringContextShutdownHook ] o.e.j.s.AbstractConnector (383) [doStop] [JobId-] - Stopped ServerConnector@6516181f{HTTP/1.1, (http/1.1)}{0.0.0.0:9101}
2024-01-12 11:06:09.138 [INFO ] [SpringContextShutdownHook ] o.e.j.s.session (149) [stopScavenging] [JobId-] - node0 Stopped scavenging
2024-01-12 11:06:09.143 [INFO ] [SpringContextShutdownHook ] o.e.j.s.h.C.application (2368) [log] [JobId-] - Destroying Spring FrameworkServlet 'dispatcherServlet'
2024-01-12 11:06:09.144 [INFO ] [SpringContextShutdownHook ] o.e.j.s.h.C.application (2368) [log] [JobId-] - Destroying Spring FrameworkServlet 'springrestful'
2024-01-12 11:06:09.149 [INFO ] [SpringContextShutdownHook ] o.e.j.s.h.ContextHandler (1159) [doStop] [JobId-] - Stopped o.s.b.w.e.j.JettyEmbeddedWebAppContext@3f6a9ba0{application,/,[file:///tmp/jetty-docbase.9101.9187708320195614229/, jar:file:/opt/bigdata/apache-linkis-1.5.0/lib/linkis-commons/public-module/knife4j-spring-ui-2.0.9.jar!/META-INF/resources],STOPPED}
2024-01-12 11:06:09.165 [WARN ] [SpringContextShutdownHook ] o.e.j.u.t.QueuedThreadPool (299) [doStop] [JobId-] - Stopped without executing or closing null
2024-01-12 11:06:09.181 [INFO ] [SpringContextShutdownHook ] o.s.s.c.ThreadPoolTaskExecutor (218) [shutdown] [JobId-] - Shutting down ExecutorService 'applicationTaskExecutor'
2024-01-12 11:06:09.233 [INFO ] [SpringContextShutdownHook ] c.a.d.p.DruidDataSource (2138) [close] [JobId-] - {dataSource-1} closing ...
2024-01-12 11:06:09.243 [INFO ] [SpringContextShutdownHook ] c.a.d.p.DruidDataSource (2211) [close] [JobId-] - {dataSource-1} closed
2024-01-12 11:06:09.255 [INFO ] [SpringContextShutdownHook ] c.n.d.DiscoveryClient (935) [shutdown] [JobId-] - Shutting down DiscoveryClient ...
2024-01-12 11:06:12.262 [INFO ] [SpringContextShutdownHook ] c.n.d.DiscoveryClient (971) [unregister] [JobId-] - Unregistering ...
2024-01-12 11:06:12.285 [INFO ] [SpringContextShutdownHook ] c.n.d.DiscoveryClient (973) [unregister] [JobId-] - DiscoveryClient_LINKIS-CG-LINKISMANAGER/xxx:linkis-cg-linkismanager:9101 - deregister status: 200
2024-01-12 11:06:12.305 [INFO ] [SpringContextShutdownHook ] c.n.d.DiscoveryClient (960) [shutdown] [JobId-] - Completed shut down of DiscoveryClient
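The trace above shows linkis-cg-linkismanager failing while calling the BML upload endpoint (/api/rest_j/v1/bml/upload) that is served by linkis-ps-publicservice, so the underlying cause is usually recorded in that service's log rather than in linkismanager's own. A hedged way to look for it, assuming the default log directory of a standard install:

```bash
# Search the publicservice log for the BML error codes seen above (60050/50073)
# and any "Caused by" root cause; the log path assumes a default Linkis layout.
grep -B 2 -A 20 -E "60050|50073|Caused by" $LINKIS_HOME/logs/linkis-ps-publicservice.log
```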

github-actions[bot] commented 5 months ago

:blush: Welcome to the Apache Linkis community!!

We are glad that you are contributing by opening this issue.

Please make sure to include all the relevant context. We will be here shortly.

If you are interested in contributing to our website project, please let us know! You can check out our contributing guide on :point_right: How to Participate in Project Contribution.

tigerHM commented 5 months ago

The following was found in the LINKIS-PS-PUBLICSERVICE log:

Caused by: org.apache.hadoop.security.KerberosAuthException: failure to login: for principal: hadoop from keytab /etc/security/keytabs/xxx.keytab/hadoop.keytab
javax.security.auth.login.LoginException: Unable to obtain password from user

Based on this log: HADOOP_KEYTAB_PATH is configured as /etc/security/keytabs/xxx.keytab, but at runtime Linkis resolves the keytab as /etc/security/keytabs/xxx.keytab/{username}.keytab. So HADOOP_KEYTAB_PATH only needs to be set to the directory that contains the keytab files, not to a keytab file itself.
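In other words, Linkis appends {username}.keytab to whatever HADOOP_KEYTAB_PATH points at. A minimal sketch of the relevant deploy-config/linkis-env.sh entries under that assumption (the paths are illustrative, not taken from this deployment):

```bash
# Kerberos settings in deploy-config/linkis-env.sh (illustrative values).
# HADOOP_KEYTAB_PATH must be the keytab directory; the actual file is
# resolved as ${HADOOP_KEYTAB_PATH}/${username}.keytab at login time.
HADOOP_KERBEROS_ENABLE=true
HADOOP_KEYTAB_PATH=/etc/security/keytabs   # directory containing hadoop.keytab, not the .keytab file itself
```

After correcting the value, restarting the affected services (for example `sh $LINKIS_HOME/sbin/linkis-daemon.sh restart ps-publicservice` followed by `sh $LINKIS_HOME/sbin/linkis-daemon.sh restart cg-linkismanager`; script and service names follow the standard distribution, adjust if yours differ) should let the engine plugin upload succeed.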