apache / dolphinscheduler

Apache DolphinScheduler is the modern data orchestration platform. Agile to create high performance workflow with low-code
https://dolphinscheduler.apache.org/
Apache License 2.0
12.68k stars 4.58k forks

[Bug] [remoteshell] Remoteshell cannot execute commands normally; the error message is as follows #16581

Open zhangconan opened 1 week ago

zhangconan commented 1 week ago

Search before asking

What happened

I wrote the following command in the remoteshell script:

touch zkn.txt;

The error message is as follows:

[INFO] 2024-09-04 14:21:11.228 +0800 -


[INFO] 2024-09-04 14:21:11.232 +0800 - * Initialize task context ***
[INFO] 2024-09-04 14:21:11.233 +0800 - *****
[INFO] 2024-09-04 14:21:11.233 +0800 - Begin to initialize task
[INFO] 2024-09-04 14:21:11.233 +0800 - Set task startTime: 1725430871233
[INFO] 2024-09-04 14:21:11.233 +0800 - Set task appId: 59_73
[INFO] 2024-09-04 14:21:11.234 +0800 - End initialize task {
  "taskInstanceId" : 73,
  "taskName" : "远程shell",
  "firstSubmitTime" : 1725430871208,
  "startTime" : 1725430871233,
  "taskType" : "REMOTESHELL",
  "workflowInstanceHost" : "172.18.0.1:5678",
  "host" : "172.18.0.1:1234",
  "logPath" : "/home/dolphinscheduler/tmp/dolphinscheduler/worker-server/logs/20240904/117989902358848/4/59/73.log",
  "processId" : 0,
  "processDefineCode" : 117989902358848,
  "processDefineVersion" : 4,
  "processInstanceId" : 59,
  "scheduleTime" : 0,
  "executorId" : 1,
  "cmdTypeIfComplement" : 0,
  "tenantCode" : "default",
  "processDefineId" : 0,
  "projectId" : 0,
  "projectCode" : 117107289483392,
  "taskParams" : "{\"localParams\":[],\"rawScript\":\"touch zkn.txt\",\"resourceList\":[],\"type\":\"SSH\",\"datasource\":4}",
  "prepareParamsMap" : {
    "system.task.definition.name" : { "prop" : "system.task.definition.name", "direct" : "IN", "type" : "VARCHAR", "value" : "远程shell" },
    "system.project.name" : { "prop" : "system.project.name", "direct" : "IN", "type" : "VARCHAR", "value" : null },
    "system.project.code" : { "prop" : "system.project.code", "direct" : "IN", "type" : "VARCHAR", "value" : "117107289483392" },
    "system.workflow.instance.id" : { "prop" : "system.workflow.instance.id", "direct" : "IN", "type" : "VARCHAR", "value" : "59" },
    "system.biz.curdate" : { "prop" : "system.biz.curdate", "direct" : "IN", "type" : "VARCHAR", "value" : "20240904" },
    "system.biz.date" : { "prop" : "system.biz.date", "direct" : "IN", "type" : "VARCHAR", "value" : "20240903" },
    "system.task.instance.id" : { "prop" : "system.task.instance.id", "direct" : "IN", "type" : "VARCHAR", "value" : "73" },
    "system.workflow.definition.name" : { "prop" : "system.workflow.definition.name", "direct" : "IN", "type" : "VARCHAR", "value" : "张可南测试" },
    "system.task.definition.code" : { "prop" : "system.task.definition.code", "direct" : "IN", "type" : "VARCHAR", "value" : "118774317966656" },
    "system.workflow.definition.code" : { "prop" : "system.workflow.definition.code", "direct" : "IN", "type" : "VARCHAR", "value" : "117989902358848" },
    "system.datetime" : { "prop" : "system.datetime", "direct" : "IN", "type" : "VARCHAR", "value" : "20240904142111" }
  },
  "taskAppId" : "59_73",
  "taskTimeout" : 2147483647,
  "workerGroup" : "default",
  "delayTime" : 0,
  "currentExecutionStatus" : "SUBMITTED_SUCCESS",
  "resourceParametersHelper" : {
    "resourceMap" : {
      "DATASOURCE" : {
        "4" : {
          "resourceType" : "DATASOURCE",
          "type" : "SSH",
          "connectionParams" : "{\"user\":\"root\",\"password\":\"*\",\"host\":\"192.168.200.127\",\"port\":22}",
          "DATASOURCE" : null
        }
      }
    }
  },
  "endTime" : 0,
  "dryRun" : 0,
  "paramsMap" : { },
  "cpuQuota" : -1,
  "memoryMax" : -1,
  "testFlag" : 0,
  "logBufferEnable" : false,
  "dispatchFailTimes" : 0
}
[INFO] 2024-09-04 14:21:11.236 +0800 -


[INFO] 2024-09-04 14:21:11.237 +0800 - Load task instance plugin
[INFO] 2024-09-04 14:21:11.237 +0800 - ***
[INFO] 2024-09-04 14:21:11.240 +0800 - Send task status RUNNING_EXECUTION master: 172.18.0.1:1234
[INFO] 2024-09-04 14:21:11.241 +0800 - TenantCode: default check successfully
[INFO] 2024-09-04 14:21:11.244 +0800 - WorkflowInstanceExecDir: /tmp/dolphinscheduler/exec/process/default/117107289483392/117989902358848_4/59/73 check successfully
[INFO] 2024-09-04 14:21:11.244 +0800 - Create TaskChannel: org.apache.dolphinscheduler.plugin.task.remoteshell.RemoteShellTaskChannel successfully
[INFO] 2024-09-04 14:21:11.244 +0800 - Download resources successfully: ResourceContext(resourceItemMap={})
[INFO] 2024-09-04 14:21:11.245 +0800 - Download upstream files: [] successfully
[INFO] 2024-09-04 14:21:11.245 +0800 - Task plugin instance: REMOTESHELL create successfully
[INFO] 2024-09-04 14:21:11.245 +0800 - shell task params {"localParams":[],"rawScript":"touch zkn.txt","resourceList":[],"type":"SSH","datasource":4}
[INFO] 2024-09-04 14:21:11.251 +0800 - Success initialized task plugin instance successfully
[INFO] 2024-09-04 14:21:11.252 +0800 - Set taskVarPool: null successfully
[INFO] 2024-09-04 14:21:11.253 +0800 -


[INFO] 2024-09-04 14:21:11.253 +0800 - * Execute task instance *****
[INFO] 2024-09-04 14:21:11.253 +0800 - ***
[INFO] 2024-09-04 14:21:11.255 +0800 - raw script : #!/bin/bash
touch zkn.txt
echo DOLPHINSCHEDULER-REMOTE-SHELL-TASK-STATUS-$?
[INFO] 2024-09-04 14:21:11.716 +0800 - upload script from local:/tmp/dolphinscheduler/exec/process/default/117107289483392/117989902358848_4/59/73/59_73_node.sh to remote: /tmp/dolphinscheduler-remote-shell-root/dolphinscheduler-remoteshell-73.sh
[INFO] 2024-09-04 14:21:12.122 +0800 - The final script is:
#!/bin/bash
touch zkn.txt
echo DOLPHINSCHEDULER-REMOTE-SHELL-TASK-STATUS-$?
[INFO] 2024-09-04 14:21:12.197 +0800 - Remote shell task log:
[INFO] 2024-09-04 14:21:12.391 +0800 - DOLPHINSCHEDULER-REMOTE-SHELL-TASK-STATUS-0

[INFO] 2024-09-04 14:21:12.465 +0800 - Remote shell task run status: DOLPHINSCHEDULER-REMOTE-SHELL-TASK-STATUS-0

[ERROR] 2024-09-04 14:21:12.465 +0800 - Remote shell task failed
[ERROR] 2024-09-04 14:21:12.468 +0800 - shell task error
org.apache.dolphinscheduler.plugin.task.api.TaskException: Remote shell task error
    at org.apache.dolphinscheduler.plugin.task.remoteshell.RemoteExecutor.run(RemoteExecutor.java:100)
    at org.apache.dolphinscheduler.plugin.task.remoteshell.RemoteShellTask.handle(RemoteShellTask.java:104)
    at org.apache.dolphinscheduler.server.worker.runner.DefaultWorkerTaskExecutor.executeTask(DefaultWorkerTaskExecutor.java:51)
    at org.apache.dolphinscheduler.server.worker.runner.WorkerTaskExecutor.run(WorkerTaskExecutor.java:172)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:750)
Caused by: java.lang.NumberFormatException: For input string: "0 "
    at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
    at java.lang.Integer.parseInt(Integer.java:580)
    at java.lang.Integer.parseInt(Integer.java:615)
    at org.apache.dolphinscheduler.plugin.task.remoteshell.RemoteExecutor.getTaskExitCode(RemoteExecutor.java:140)
    at org.apache.dolphinscheduler.plugin.task.remoteshell.RemoteExecutor.run(RemoteExecutor.java:98)
    ... 6 common frames omitted
[ERROR] 2024-09-04 14:21:12.469 +0800 - Task execute failed, due to meet an exception
org.apache.dolphinscheduler.plugin.task.api.TaskException: Execute shell task error
    at org.apache.dolphinscheduler.plugin.task.remoteshell.RemoteShellTask.handle(RemoteShellTask.java:110)
    at org.apache.dolphinscheduler.server.worker.runner.DefaultWorkerTaskExecutor.executeTask(DefaultWorkerTaskExecutor.java:51)
    at org.apache.dolphinscheduler.server.worker.runner.WorkerTaskExecutor.run(WorkerTaskExecutor.java:172)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:750)
Caused by: org.apache.dolphinscheduler.plugin.task.api.TaskException: Remote shell task error
    at org.apache.dolphinscheduler.plugin.task.remoteshell.RemoteExecutor.run(RemoteExecutor.java:100)
    at org.apache.dolphinscheduler.plugin.task.remoteshell.RemoteShellTask.handle(RemoteShellTask.java:104)
    ... 5 common frames omitted
Caused by: java.lang.NumberFormatException: For input string: "0 "
    at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
    at java.lang.Integer.parseInt(Integer.java:580)
    at java.lang.Integer.parseInt(Integer.java:615)
    at org.apache.dolphinscheduler.plugin.task.remoteshell.RemoteExecutor.getTaskExitCode(RemoteExecutor.java:140)
    at org.apache.dolphinscheduler.plugin.task.remoteshell.RemoteExecutor.run(RemoteExecutor.java:98)
    ... 6 common frames omitted
[INFO] 2024-09-04 14:21:12.469 +0800 - kill remote task dolphinscheduler-remoteshell-73
[ERROR] 2024-09-04 14:21:12.470 +0800 - Cancel task failed, this will not affect the taskInstance status, but you need to check manual
org.apache.dolphinscheduler.plugin.task.api.TaskException: cancel application error
    at org.apache.dolphinscheduler.plugin.task.remoteshell.RemoteShellTask.cancel(RemoteShellTask.java:121)
    at org.apache.dolphinscheduler.server.worker.runner.WorkerTaskExecutor.cancelTask(WorkerTaskExecutor.java:133)
    at org.apache.dolphinscheduler.server.worker.runner.WorkerTaskExecutor.afterThrowing(WorkerTaskExecutor.java:114)
    at org.apache.dolphinscheduler.server.worker.runner.DefaultWorkerTaskExecutor.afterThrowing(DefaultWorkerTaskExecutor.java:61)
    at org.apache.dolphinscheduler.server.worker.runner.WorkerTaskExecutor.run(WorkerTaskExecutor.java:179)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:750)
Caused by: org.apache.dolphinscheduler.plugin.task.api.TaskException: SSH connection failed
    at org.apache.dolphinscheduler.plugin.task.remoteshell.RemoteExecutor.getSession(RemoteExecutor.java:82)
    at org.apache.dolphinscheduler.plugin.task.remoteshell.RemoteExecutor.runRemote(RemoteExecutor.java:224)
    at org.apache.dolphinscheduler.plugin.task.remoteshell.RemoteExecutor.getTaskPid(RemoteExecutor.java:200)
    at org.apache.dolphinscheduler.plugin.task.remoteshell.RemoteExecutor.kill(RemoteExecutor.java:158)
    at org.apache.dolphinscheduler.plugin.task.remoteshell.RemoteShellTask.cancel(RemoteShellTask.java:119)
    ... 7 common frames omitted
Caused by: java.lang.IllegalStateException: SshClient not started. Please call start() method before connecting to a server
    at org.apache.sshd.client.SshClient.doConnect(SshClient.java:627)
    at org.apache.sshd.client.SshClient.doConnect(SshClient.java:616)
    at org.apache.sshd.client.SshClient.connect(SshClient.java:547)
    at org.apache.sshd.client.SshClient.connect(SshClient.java:539)
    at org.apache.sshd.client.session.ClientSessionCreator.connect(ClientSessionCreator.java:74)
    at org.apache.sshd.client.session.ClientSessionCreator.connect(ClientSessionCreator.java:57)
    at org.apache.dolphinscheduler.plugin.datasource.ssh.SSHUtils.getSession(SSHUtils.java:41)
    at org.apache.dolphinscheduler.plugin.task.remoteshell.RemoteExecutor.getSession(RemoteExecutor.java:77)
    ... 11 common frames omitted
[INFO] 2024-09-04 14:21:12.472 +0800 - Get a exception when execute the task, will send the task status: FAILURE to master: 172.18.0.1:1234
[INFO] 2024-09-04 14:21:12.472 +0800 - FINALIZE_SESSION

What you expected to happen

I expect the command to execute normally and the task status to be reported as success.

How to reproduce

You only need to find a Linux server and configure it in the remoteshell task node to reproduce it.

Anything else

nothing

Version

3.2.x

Are you willing to submit PR?

Code of Conduct

zhangconan commented 1 week ago

This is more of a usage question. When remoteshell runs the script, it invokes it by its absolute path. If the script performs directory or file operations, the target directory must be specified explicitly; otherwise those operations run in the home directory of the user on the connected server.
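As an illustration of specifying the directory explicitly, here is a minimal Java sketch that prepends a `cd` to a script before it is sent to the remote host. The class `RemoteScriptBuilder`, the method `withWorkingDir`, and the path `/data/work` are hypothetical names for this example only; they are not part of DolphinScheduler.

```java
public final class RemoteScriptBuilder {

    /**
     * Returns the script with an explicit "cd" in front, so relative paths
     * resolve under workDir instead of the remote user's home directory.
     * Hypothetical helper for illustration; not DolphinScheduler code.
     */
    public static String withWorkingDir(String workDir, String script) {
        // Single-quote the directory and escape embedded single quotes for the shell.
        String quoted = "'" + workDir.replace("'", "'\\''") + "'";
        // Stop early if the directory does not exist rather than running in the wrong place.
        return "cd " + quoted + " || exit 1\n" + script;
    }

    public static void main(String[] args) {
        // "/data/work" is an assumed example directory.
        System.out.println(withWorkingDir("/data/work", "touch zkn.txt"));
    }
}
```

The same effect can be had by writing the `cd` line, or an absolute path such as `touch /data/work/zkn.txt`, directly in the task's script field.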

zhangconan commented 1 week ago

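The real reason the remoteshell task failed is a conversion error while processing taskExitCode: the stack trace shows Integer.parseInt, called from RemoteExecutor.getTaskExitCode, throwing a NumberFormatException for the input "0 " (the exit code followed by trailing whitespace).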
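To make the failure mode concrete, here is a minimal Java sketch of parsing the exit code in a whitespace-tolerant way. It is not the actual RemoteExecutor code and not necessarily the fix applied upstream; the class `ExitCodeParser` and the method `parseExitCode` are hypothetical. Only the status marker string is taken from the log above. The trailing character appears as a space in the pasted log, but it could equally be a newline or carriage return, so the sketch trims all of them.

```java
public final class ExitCodeParser {

    // Marker printed by the generated script, as seen in the task log.
    public static final String STATUS_PREFIX = "DOLPHINSCHEDULER-REMOTE-SHELL-TASK-STATUS-";

    /**
     * Extracts the exit code that follows the status marker.
     * Returns -1 when the marker is missing or the value is not a number.
     * Hypothetical helper for illustration; not DolphinScheduler code.
     */
    public static int parseExitCode(String statusLine) {
        if (statusLine == null) {
            return -1;
        }
        int idx = statusLine.lastIndexOf(STATUS_PREFIX);
        if (idx < 0) {
            return -1;
        }
        // trim() drops trailing spaces, tabs, '\r' and '\n', which
        // Integer.parseInt would otherwise reject ("0 " is not a valid int).
        String raw = statusLine.substring(idx + STATUS_PREFIX.length()).trim();
        try {
            return Integer.parseInt(raw);
        } catch (NumberFormatException e) {
            return -1;
        }
    }

    public static void main(String[] args) {
        System.out.println(parseExitCode(STATUS_PREFIX + "0 "));
        System.out.println(parseExitCode(STATUS_PREFIX + "1\r\n"));
    }
}
```

Returning -1 for unparsable input is a design choice of this sketch: it reports the task as failed without throwing, so the cleanup path does not run on top of an unexpected exception.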

SbloodyS commented 1 week ago

Hi @zhangconan, would you like to fix it?

zhangconan commented 6 days ago

> Hi @zhangconan, would you like to fix it?

I see this issue has already been resolved on the dev branch. One small suggestion: I hope the official website documentation can be written in more detail. Also, where can I find the official DolphinScheduler communication groups, for example on DingTalk or WeChat?