apache / dolphinscheduler

Apache DolphinScheduler is a modern data orchestration platform, agile to create high-performance workflows with low code.
https://dolphinscheduler.apache.org/
Apache License 2.0

yarn applications: application_1634958933716_0113 , query status failed #6605

Closed. KingSpring closed this issue 2 years ago.

KingSpring commented 2 years ago

Search before asking

What happened

Operation

When I run a shell task to test MapReduce in DS (see the screenshot below), the DS web log shows that the YARN status query failed. Shell content: hadoop jar /opt/app/hadoop-2.9.2/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.9.2.jar pi 10 10 [screenshot: 360截图17520712172535]

ds web log

[INFO] 2021-10-26 10:34:28.745 - [taskAppId=TASK-1-6-89]:[115] - create dir success /exec/process/1/1/6/89 [INFO] 2021-10-26 10:34:28.754 - [taskAppId=TASK-1-6-89]:[88] - shell task params {"rawScript":"hadoop jar /opt/app/hadoop-2.9.2/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.9.2.jar pi 10 10","localParams":[],"resourceList":[]} [INFO] 2021-10-26 10:34:28.758 - [taskAppId=TASK-1-6-89]:[154] - raw script : hadoop jar /opt/app/hadoop-2.9.2/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.9.2.jar pi 10 10 [INFO] 2021-10-26 10:34:28.759 - [taskAppId=TASK-1-6-89]:[155] - task execute path : /exec/process/1/1/6/89 [INFO] 2021-10-26 10:34:28.760 - [taskAppId=TASK-1-6-89]:[87] - tenantCode user:root, task dir:1_6_89 [INFO] 2021-10-26 10:34:28.760 - [taskAppId=TASK-1-6-89]:[92] - create command file:/exec/process/1/1/6/89/1_6_89.command [INFO] 2021-10-26 10:34:28.760 - [taskAppId=TASK-1-6-89]:[111] - command : #!/bin/sh BASEDIR=$(cd dirname $0; pwd) cd $BASEDIR source /opt/app/dolphinscheduler/conf/env/dolphinscheduler_env.sh /exec/process/1/1/6/89/1_6_89_node.sh [INFO] 2021-10-26 10:34:28.764 - [taskAppId=TASK-1-6-89]:[330] - task run command: sudo -u root sh /exec/process/1/1/6/89/1_6_89.command [INFO] 2021-10-26 10:34:28.773 - [taskAppId=TASK-1-6-89]:[211] - process start, process id is: 19627 [INFO] 2021-10-26 10:34:29.774 - [taskAppId=TASK-1-6-89]:[138] - -> SLF4J: Class path contains multiple SLF4J bindings. SLF4J: Found binding in [jar:file:/opt/app/hadoop-2.9.2/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: Found binding in [jar:file:/opt/app/tez/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation. SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory] Number of Maps = 10 Samples per Map = 10 [INFO] 2021-10-26 10:34:31.775 - [taskAppId=TASK-1-6-89]:[138] - -> Wrote input for Map #0 Wrote input for Map #1 Wrote input for Map #2 Wrote input for Map #3 Wrote input for Map #4 Wrote input for Map #5 Wrote input for Map #6 Wrote input for Map #7 Wrote input for Map #8 Wrote input for Map #9 Starting Job 21/10/26 10:34:31 INFO client.RMProxy: Connecting to ResourceManager at hadoop47/192.168.80.47:8032 [INFO] 2021-10-26 10:34:32.776 - [taskAppId=TASK-1-6-89]:[138] - -> 21/10/26 10:34:32 INFO input.FileInputFormat: Total input files to process : 10 21/10/26 10:34:32 INFO mapreduce.JobSubmitter: number of splits:10 21/10/26 10:34:32 INFO Configuration.deprecation: yarn.resourcemanager.system-metrics-publisher.enabled is deprecated. 
Instead, use yarn.system-metrics-publisher.enabled 21/10/26 10:34:32 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1634958933716_0113 21/10/26 10:34:32 INFO impl.YarnClientImpl: Submitted application application_1634958933716_0113 21/10/26 10:34:32 INFO mapreduce.Job: The url to track the job: http://hadoop47:8088/proxy/application_1634958933716_0113/ 21/10/26 10:34:32 INFO mapreduce.Job: Running job: job_1634958933716_0113 [INFO] 2021-10-26 10:34:40.785 - [taskAppId=TASK-1-6-89]:[138] - -> 21/10/26 10:34:39 INFO mapreduce.Job: Job job_1634958933716_0113 running in uber mode : false 21/10/26 10:34:39 INFO mapreduce.Job: map 0% reduce 0% [INFO] 2021-10-26 10:34:56.789 - [taskAppId=TASK-1-6-89]:[138] - -> 21/10/26 10:34:56 INFO mapreduce.Job: map 30% reduce 0% [INFO] 2021-10-26 10:34:57.790 - [taskAppId=TASK-1-6-89]:[138] - -> 21/10/26 10:34:57 INFO mapreduce.Job: map 100% reduce 0% [INFO] 2021-10-26 10:35:02.715 - [taskAppId=TASK-1-6-89]:[445] - find app id: application_1634958933716_0113 [INFO] 2021-10-26 10:35:02.715 - [taskAppId=TASK-1-6-89]:[402] - check yarn application status, appId:application_1634958933716_0113 [ERROR] 2021-10-26 10:35:02.720 - [taskAppId=TASK-1-6-89]:[418] - yarn applications: application_1634958933716_0113 , query status failed, exception:{} java.lang.NullPointerException: null at org.apache.dolphinscheduler.common.utils.HadoopUtils.getApplicationStatus(HadoopUtils.java:423) at org.apache.dolphinscheduler.server.worker.task.AbstractCommandExecutor.isSuccessOfYarnState(AbstractCommandExecutor.java:404) at org.apache.dolphinscheduler.server.worker.task.AbstractCommandExecutor.run(AbstractCommandExecutor.java:230) at org.apache.dolphinscheduler.server.worker.task.shell.ShellTask.handle(ShellTask.java:101) at org.apache.dolphinscheduler.server.worker.runner.TaskExecuteThread.run(TaskExecuteThread.java:139) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) [INFO] 2021-10-26 10:35:02.720 - [taskAppId=TASK-1-6-89]:[238] - process has exited, execute path:/exec/process/1/1/6/89, processId:19627 ,exitStatusCode:-1 ,processWaitForStatus:true ,processExitValue:0 [INFO] 2021-10-26 10:35:02.791 - [taskAppId=TASK-1-6-89]:[138] - -> 21/10/26 10:35:02 INFO mapreduce.Job: map 100% reduce 100% 21/10/26 10:35:02 INFO mapreduce.Job: Job job_1634958933716_0113 completed successfully 21/10/26 10:35:02 INFO mapreduce.Job: Counters: 49 File System Counters FILE: Number of bytes read=226 FILE: Number of bytes written=2205654 FILE: Number of read operations=0 FILE: Number of large read operations=0 FILE: Number of write operations=0 HDFS: Number of bytes read=2630 HDFS: Number of bytes written=215 HDFS: Number of read operations=43 HDFS: Number of large read operations=0 HDFS: Number of write operations=3 Job Counters Launched map tasks=10 Launched reduce tasks=1 Data-local map tasks=10 Total time spent by all maps in occupied slots (ms)=149819 Total time spent by all reduces in occupied slots (ms)=3113 Total time spent by all map tasks (ms)=149819 Total time spent by all reduce tasks (ms)=3113 Total vcore-milliseconds taken by all map tasks=149819 Total vcore-milliseconds taken by all reduce tasks=3113 Total megabyte-milliseconds taken by all map tasks=153414656 Total 
megabyte-milliseconds taken by all reduce tasks=3187712 Map-Reduce Framework Map input records=10 Map output records=20 Map output bytes=180 Map output materialized bytes=280 Input split bytes=1450 Combine input records=0 Combine output records=0 Reduce input groups=2 Reduce shuffle bytes=280 Reduce input records=20 Reduce output records=0 Spilled Records=40 Shuffled Maps =10 Failed Shuffles=0 Merged Map outputs=10 GC time elapsed (ms)=6825 CPU time spent (ms)=4980 Physical memory (bytes) snapshot=3529900032 Virtual memory (bytes) snapshot=22377988096 Total committed heap usage (bytes)=2413297664 Shuffle Errors BAD_ID=0 CONNECTION=0 IO_ERROR=0 WRONG_LENGTH=0 WRONG_MAP=0 WRONG_REDUCE=0 File Input Format Counters Bytes Read=1180 File Output Format Counters Bytes Written=97 Job Finished in 30.695 seconds Estimated value of Pi is 3.20000000000000000000

worker debug log

[DEBUG] 2021-10-26 10:34:56.708 org.apache.zookeeper.ClientCnxn:[846] - Reading reply sessionid:0x20015bfe8a400c9, packet:: clientPath:/dolphinscheduler/nodes/worker/default/192.168.80.49:1234 serverPath:/dolphinscheduler/nodes/worker/default/192.168.80.49:1234 finished:false header:: 2933,4 replyHeader:: 2933,17180717039,0 request:: '/dolphinscheduler/nodes/worker/default/192.168.80.49:1234,T response:: #302e332c302e39312c302e35392c312e33372c382e302c302e332c323032312d31302d32362030393a32373a30362c323032312d31302d32362031303a33343a35362c302c34303937,s{17180707701,17180717039,1635211626683,1635215696700,407,0,0,144139102061854920,73,0,17180707701} [DEBUG] 2021-10-26 10:34:56.708 org.apache.dolphinscheduler.service.zk.ZookeeperCachedOperator:[62] - zookeeperListener:org.apache.dolphinscheduler.server.master.registry.ServerNodeManager$WorkerGroupNodeListener triggered [DEBUG] 2021-10-26 10:34:56.709 org.apache.curator.framework.recipes.cache.TreeCache:[396] - processResult: CuratorEventImpl{type=GET_DATA, resultCode=0, path='/dolphinscheduler/nodes/worker/default/192.168.80.49:1234', name='null', children=null, context=null, stat=17180707701,17180717039,1635211626683,1635215696700,407,0,0,144139102061854920,73,0,17180707701 , data=[48, 46, 51, 44, 48, 46, 57, 49, 44, 48, 46, 53, 57, 44, 49, 46, 51, 55, 44, 56, 46, 48, 44, 48, 46, 51, 44, 50, 48, 50, 49, 45, 49, 48, 45, 50, 54, 32, 48, 57, 58, 50, 55, 58, 48, 54, 44, 50, 48, 50, 49, 45, 49, 48, 45, 50, 54, 32, 49, 48, 58, 51, 52, 58, 53, 54, 44, 48, 44, 52, 48, 57, 55], watchedEvent=null, aclList=null, opResults=null} [DEBUG] 2021-10-26 10:34:56.709 org.apache.curator.framework.recipes.cache.TreeCache:[857] - publishEvent: TreeCacheEvent{type=NODE_UPDATED, data=ChildData{path='/dolphinscheduler/nodes/worker/default/192.168.80.49:1234', stat=17180707701,17180717039,1635211626683,1635215696700,407,0,0,144139102061854920,73,0,17180707701 , data=[48, 46, 51, 44, 48, 46, 57, 49, 44, 48, 46, 53, 57, 44, 49, 46, 51, 55, 44, 56, 46, 48, 44, 48, 46, 51, 44, 50, 48, 50, 49, 45, 49, 48, 45, 50, 54, 32, 48, 57, 58, 50, 55, 58, 48, 54, 44, 50, 48, 50, 49, 45, 49, 48, 45, 50, 54, 32, 49, 48, 58, 51, 52, 58, 53, 54, 44, 48, 44, 52, 48, 57, 55]}} [INFO] 2021-10-26 10:34:56.789 - [taskAppId=TASK-1-6-89]:[138] - -> 21/10/26 10:34:56 INFO mapreduce.Job: map 30% reduce 0% [INFO] 2021-10-26 10:34:57.790 - [taskAppId=TASK-1-6-89]:[138] - -> 21/10/26 10:34:57 INFO mapreduce.Job: map 100% reduce 0% [DEBUG] 2021-10-26 10:34:58.313 org.apache.zookeeper.ClientCnxn:[745] - Got ping response for sessionid: 0x30015c0a38d009d after 0ms [INFO] 2021-10-26 10:35:02.715 - [taskAppId=TASK-1-6-89]:[445] - find app id: application_1634958933716_0113 [INFO] 2021-10-26 10:35:02.715 - [taskAppId=TASK-1-6-89]:[402] - check yarn application status, appId:application_1634958933716_0113 [DEBUG] 2021-10-26 10:35:02.715 org.apache.dolphinscheduler.common.utils.HadoopUtils:[211] - yarn application url:http://hadoop47:%s/ws/v1/cluster/apps/%s, applicationId:application_1634958933716_0113 [ERROR] 2021-10-26 10:35:02.720 org.apache.dolphinscheduler.common.utils.HttpUtils:[73] - Connect to hadoop47:80 [hadoop47/192.168.80.47] failed: Connection refused (Connection refused) org.apache.http.conn.HttpHostConnectException: Connect to hadoop47:80 [hadoop47/192.168.80.47] failed: Connection refused (Connection refused) at org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:151) at 
org.apache.http.impl.conn.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:353) at org.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:380) at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:236) at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:184) at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:88) at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110) at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:184) at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82) at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:107) at org.apache.dolphinscheduler.common.utils.HttpUtils.get(HttpUtils.java:60) at org.apache.dolphinscheduler.common.utils.HadoopUtils.getApplicationStatus(HadoopUtils.java:420) at org.apache.dolphinscheduler.server.worker.task.AbstractCommandExecutor.isSuccessOfYarnState(AbstractCommandExecutor.java:404) at org.apache.dolphinscheduler.server.worker.task.AbstractCommandExecutor.run(AbstractCommandExecutor.java:230) at org.apache.dolphinscheduler.server.worker.task.shell.ShellTask.handle(ShellTask.java:101) at org.apache.dolphinscheduler.server.worker.runner.TaskExecuteThread.run(TaskExecuteThread.java:139) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Caused by: java.net.ConnectException: Connection refused (Connection refused) at java.net.PlainSocketImpl.socketConnect(Native Method) at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:476) at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:218) at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:200) at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:394) at java.net.Socket.connect(Socket.java:606) at org.apache.http.conn.socket.PlainConnectionSocketFactory.connectSocket(PlainConnectionSocketFactory.java:74) at org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:134) ... 
20 common frames omitted [ERROR] 2021-10-26 10:35:02.720 - [taskAppId=TASK-1-6-89]:[418] - yarn applications: application_1634958933716_0113 , query status failed, exception:{} java.lang.NullPointerException: null at org.apache.dolphinscheduler.common.utils.HadoopUtils.getApplicationStatus(HadoopUtils.java:423) at org.apache.dolphinscheduler.server.worker.task.AbstractCommandExecutor.isSuccessOfYarnState(AbstractCommandExecutor.java:404) at org.apache.dolphinscheduler.server.worker.task.AbstractCommandExecutor.run(AbstractCommandExecutor.java:230) at org.apache.dolphinscheduler.server.worker.task.shell.ShellTask.handle(ShellTask.java:101) at org.apache.dolphinscheduler.server.worker.runner.TaskExecuteThread.run(TaskExecuteThread.java:139) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) [INFO] 2021-10-26 10:35:02.720 - [taskAppId=TASK-1-6-89]:[238] - process has exited, execute path:/exec/process/1/1/6/89, processId:19627 ,exitStatusCode:-1 ,processWaitForStatus:true ,processExitValue:0 [INFO] 2021-10-26 10:35:02.720 org.apache.dolphinscheduler.server.worker.runner.TaskExecuteThread:[147] - task instance id : 89,task final status : FAILURE [INFO] 2021-10-26 10:35:02.721 org.apache.dolphinscheduler.server.worker.runner.TaskExecuteThread:[185] - develop mode is: false [INFO] 2021-10-26 10:35:02.721 org.apache.dolphinscheduler.server.worker.runner.TaskExecuteThread:[203] - exec local path: /exec/process/1/1/6/89 cleared. [INFO] 2021-10-26 10:35:02.791 - [taskAppId=TASK-1-6-89]:[138] - -> 21/10/26 10:35:02 INFO mapreduce.Job: map 100% reduce 100% 21/10/26 10:35:02 INFO mapreduce.Job: Job job_1634958933716_0113 completed successfully 21/10/26 10:35:02 INFO mapreduce.Job: Counters: 49 File System Counters FILE: Number of bytes read=226 FILE: Number of bytes written=2205654 FILE: Number of read operations=0 FILE: Number of large read operations=0 FILE: Number of write operations=0 HDFS: Number of bytes read=2630 HDFS: Number of bytes written=215 HDFS: Number of read operations=43 HDFS: Number of large read operations=0 HDFS: Number of write operations=3 Job Counters Launched map tasks=10 Launched reduce tasks=1 Data-local map tasks=10 Total time spent by all maps in occupied slots (ms)=149819 Total time spent by all reduces in occupied slots (ms)=3113 Total time spent by all map tasks (ms)=149819 Total time spent by all reduce tasks (ms)=3113 Total vcore-milliseconds taken by all map tasks=149819 Total vcore-milliseconds taken by all reduce tasks=3113 Total megabyte-milliseconds taken by all map tasks=153414656 Total megabyte-milliseconds taken by all reduce tasks=3187712 Map-Reduce Framework Map input records=10 Map output records=20 Map output bytes=180 Map output materialized bytes=280 Input split bytes=1450 Combine input records=0 Combine output records=0 Reduce input groups=2 Reduce shuffle bytes=280 Reduce input records=20 Reduce output records=0 Spilled Records=40 Shuffled Maps =10 Failed Shuffles=0 Merged Map outputs=10 GC time elapsed (ms)=6825 CPU time spent (ms)=4980 Physical memory (bytes) snapshot=3529900032 Virtual memory (bytes) snapshot=22377988096 Total committed heap usage (bytes)=2413297664 Shuffle Errors BAD_ID=0 CONNECTION=0 IO_ERROR=0 WRONG_LENGTH=0 WRONG_MAP=0 WRONG_REDUCE=0 File Input 
Format Counters Bytes Read=1180 File Output Format Counters Bytes Written=97 Job Finished in 30.695 seconds Estimated value of Pi is 3.20000000000000000000

What you expected to happen

The status of YARN application application_1634958933716_0113 can always be queried. [screenshot: 360截图16380508205068]

How to reproduce

Server: KunPeng; OS: CentOS 7; DS release: 1.3.9; Hadoop version: 2.9.2; YARN HA: false. conf/common.properties:

# resourcemanager port, the default value is 8088 if not specified
resource.manager.httpaddress.port=

# if resourcemanager HA is enabled, please set the HA IPs; if resourcemanager is single, keep this value empty
yarn.resourcemanager.ha.rm.ids=

# if resourcemanager HA is enabled or not use resourcemanager, please keep the default value; if resourcemanager is single, you only need to replace ds1 to actual resourcemanager hostname
yarn.application.status.address=http://hadoop47:%s/ws/v1/cluster/apps/%s
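For context, the two %s placeholders in yarn.application.status.address are filled in with the ResourceManager HTTP port and the application id (the String.format call used for this is shown later in the thread). Below is a minimal, self-contained sketch of that expansion, assuming the hostname from the configuration above and the documented default port of 8088; it is illustrative only, not DolphinScheduler source code.

// Sketch only: shows how the status URL template expands once a port is configured.
public class YarnStatusUrlDemo {
    public static void main(String[] args) {
        String template = "http://hadoop47:%s/ws/v1/cluster/apps/%s"; // yarn.application.status.address
        String port = "8088";                                         // resource.manager.httpaddress.port
        String applicationId = "application_1634958933716_0113";
        String statusUrl = String.format(template, port, applicationId);
        // Prints: http://hadoop47:8088/ws/v1/cluster/apps/application_1634958933716_0113
        // which is the ResourceManager REST endpoint the worker polls for the application status.
        System.out.println(statusUrl);
    }
}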

Anything else

It sometimes fails; there is a high probability of hitting this error.

Are you willing to submit PR?

Code of Conduct

github-actions[bot] commented 2 years ago

Hi:

KingSpring commented 2 years ago

I added a log statement in org.apache.dolphinscheduler.common.utils.HadoopUtils#getApplicationUrl: logger.info("yarn application url:{}", String.format(appUrl, activeResourceManagerPort, applicationId)); The exception in the log is:

----------------log begin---------------------------- [INFO] 2021-10-26 16:46:26.257 - [taskAppId=TASK-1-6-91]:[402] - check yarn application status, appId:application_1634958933716_0116 [INFO] 2021-10-26 16:46:26.300 org.apache.dolphinscheduler.common.utils.PropertyUtils:[140] - For input string: "" java.lang.NumberFormatException: For input string: "" at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65) at java.lang.Integer.parseInt(Integer.java:592) at java.lang.Integer.parseInt(Integer.java:615) at org.apache.dolphinscheduler.common.utils.PropertyUtils.getInt(PropertyUtils.java:138) at org.apache.dolphinscheduler.common.utils.HadoopUtils.(HadoopUtils.java:79) at org.apache.dolphinscheduler.server.worker.task.AbstractCommandExecutor.isSuccessOfYarnState(AbstractCommandExecutor.java:404) at org.apache.dolphinscheduler.server.worker.task.AbstractCommandExecutor.run(AbstractCommandExecutor.java:230) at org.apache.dolphinscheduler.server.worker.task.shell.ShellTask.handle(ShellTask.java:101) at org.apache.dolphinscheduler.server.worker.runner.TaskExecuteThread.run(TaskExecuteThread.java:139) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) [INFO] 2021-10-26 16:46:26.412 - [taskAppId=TASK-1-6-91]:[132] - FINALIZE_SESSION [DEBUG] 2021-10-26 16:46:26.474 org.apache.hadoop.security.authentication.util.KerberosName:[88] - Kerberos krb5 configuration not found, setting default realm to empty [DEBUG] 2021-10-26 16:46:26.739 org.apache.hadoop.util.PerformanceAdvisory:[41] - Falling back to shell based [DEBUG] 2021-10-26 16:46:27.005 org.apache.zookeeper.ClientCnxn:[846] - Reading reply sessionid:0x20015bfe8a400d1, packet:: clientPath:null serverPath:null finished:false header:: 87,12 replyHeader:: 87,17180767532,0 request:: '/dolphinscheduler/nodes/worker,F response:: v{'default},s{17179869680,17180767372,1634811753491,1635237921707,110,1,0,0,0,1,17179869699} [DEBUG] 2021-10-26 16:46:27.006 org.apache.zookeeper.ClientCnxn:[846] - Reading reply sessionid:0x20015bfe8a400d1, packet:: clientPath:null serverPath:null finished:false header:: 88,12 replyHeader:: 88,17180767532,0 request:: '/dolphinscheduler/nodes/worker/default,F response:: v{'192.168.80.49:1234,'192.168.80.48:1234},s{17179869699,17179869699,1634811757772,1634811757772,0,90,0,0,0,2,17180767373} [DEBUG] 2021-10-26 16:46:27.007 org.apache.zookeeper.ClientCnxn:[846] - Reading reply sessionid:0x20015bfe8a400d1, packet:: clientPath:null serverPath:null finished:false header:: 89,4 replyHeader:: 89,17180767532,0 request:: '/dolphinscheduler/nodes/worker/default/192.168.80.49:1234,F response:: #302e342c302e38392c302e37392c312e36322c382e302c302e332c323032312d31302d32362031363a34353a32312c323032312d31302d32362031363a34363a32322c302c3137373636,s{17180767373,17180767520,1635237922183,1635237982207,6,0,0,72064562608931011,74,0,17180767373} [DEBUG] 2021-10-26 16:46:27.008 org.apache.zookeeper.ClientCnxn:[846] - Reading reply sessionid:0x20015bfe8a400d1, packet:: clientPath:null serverPath:null finished:false header:: 90,4 replyHeader:: 90,17180767532,0 request:: '/dolphinscheduler/nodes/worker/default/192.168.80.48:1234,F response:: 
#302e382c302e37392c342e38352c332e31372c382e302c302e332c323032312d31302d32362031363a34343a34332c323032312d31302d32362031363a34363a32342c302c3235333330,s{17180767272,17180767526,1635237883984,1635237984013,10,0,0,216196699235614882,74,0,17180767272} [DEBUG] 2021-10-26 16:46:27.103 org.apache.hadoop.util.PerformanceAdvisory:[110] - Both short-circuit local reads and UNIX domain socket are disabled. [DEBUG] 2021-10-26 16:46:27.108 org.apache.hadoop.hdfs.protocol.datatransfer.sasl.DataTransferSaslUtil:[183] - DataTransferProtocol not using SaslPropertiesResolver, no QOP found in configuration for dfs.data.transfer.protection [DEBUG] 2021-10-26 16:46:27.196 org.apache.dolphinscheduler.common.utils.HadoopUtils:[211] - yarn application url:http://hadoop47:%s/ws/v1/cluster/apps/%s, applicationId:application_1634958933716_0116 ----------------log end ----------------------------

Then I set this in conf/common.properties:

# resourcemanager port, the default value is 8088 if not specified
resource.manager.httpaddress.port=8088

After that, the DS workflow runs OK, but I don't know why. Is my configuration wrong, or is this a bug in the DS code?
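For what it is worth, the logs above are consistent with the following chain when resource.manager.httpaddress.port is left blank: parsing the blank value throws the NumberFormatException seen in the worker log, the status request is then made without an explicit port (so it goes to hadoop47:80, the HTTP default, where nothing is listening and the connection is refused), the response comes back null, and reading that null response produces the NullPointerException in getApplicationStatus. The sketch below only illustrates this assumed chain; it is not the actual DolphinScheduler code, and the class and variable names are made up for the example.

// Illustration of the assumed failure chain; not DolphinScheduler source code.
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

public class BlankPortFailureDemo {
    public static void main(String[] args) {
        String rawPort = ""; // blank resource.manager.httpaddress.port

        // 1. Parsing a blank property value fails, as seen in the worker log.
        try {
            Integer.parseInt(rawPort);
        } catch (NumberFormatException e) {
            System.out.println("NumberFormatException: For input string: \"\"");
        }

        // 2. Without a port the request goes to hadoop47:80 (the HTTP default),
        //    where no ResourceManager web UI is listening, so it is refused.
        String responseBody = null;
        try {
            URL url = new URL("http://hadoop47/ws/v1/cluster/apps/application_1634958933716_0113");
            HttpURLConnection connection = (HttpURLConnection) url.openConnection();
            connection.connect();
        } catch (IOException e) {
            System.out.println("status request failed: " + e.getMessage());
        }

        // 3. Treating the missing response as valid content then triggers the
        //    NullPointerException reported by HadoopUtils.getApplicationStatus.
        try {
            responseBody.contains("FINISHED");
        } catch (NullPointerException e) {
            System.out.println("NullPointerException while reading the yarn status response");
        }
    }
}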

KingSpring commented 2 years ago

The attached screenshots show that when resource.manager.httpaddress.port in common.properties is empty, DolphinScheduler cannot build the correct connection to YARN; it connects to the RM IP with no port in the YARN URL. I think this is a bug; at the least, the documentation should be updated to avoid this problem.
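As an illustration of the kind of fallback being suggested here (hypothetical example code, not the project's actual patch), a blank or unparsable port property could simply fall back to the documented default of 8088:

// Hypothetical helper: fall back to the documented default when the
// resource.manager.httpaddress.port value is missing, blank, or invalid.
public final class PortPropertyDemo {
    static int getPortOrDefault(String rawValue, int defaultPort) {
        if (rawValue == null || rawValue.trim().isEmpty()) {
            return defaultPort;
        }
        try {
            return Integer.parseInt(rawValue.trim());
        } catch (NumberFormatException e) {
            return defaultPort;
        }
    }

    public static void main(String[] args) {
        System.out.println(getPortOrDefault("", 8088));     // blank value -> 8088
        System.out.println(getPortOrDefault("8088", 8088)); // explicit value -> 8088
        System.out.println(getPortOrDefault(null, 8088));   // missing value -> 8088
    }
}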

zhongjiajie commented 2 years ago

@lenboo What do you think? Is this the same issue we discussed yesterday on WeChat?

lishiyucn commented 2 years ago

Don't you test this feature yourselves? These feel like very basic errors that always reproduce, especially when upgrading from an older version: the old version did not have this parameter, and the new version is supposed to work without it by default.

KingSpring commented 2 years ago

(This is an automatic vacation reply from QQ Mail.) Hello, I am currently on vacation and cannot reply to your email in person. I will get back to you as soon as possible after the vacation ends.