Open zhangzhenhuajack opened 1 year ago
Dear @zhangzhenhuajack, thanks for opening the issue. This should be solved by PR #95. Feel free to reopen the issue if you still face problems.
After updating with git pull, I still get the error. @GlassOfWhiskey
2023-03-17 16:04:24.132 ERROR Error transferring file /home/zhenhua.zhang/tmp/79022d8f-319d-4c6c-856f-a5d0151f0fa4/cs in location openmpi-rel-574588bf6b-hrjjw:openmpi to /home/zhenhua.zhang/tmp/410468d9-63dd-4981-ae79-7dac1fd0e2ec/c549d3c3-e74a-41f0-b836-c17675dc79da/cs in location helm-mpi/openmpi/openmpi-rel-574588bf6b-v8mls:openmpi
$ streamflow run streamflow_jk.yml
Resolved 'cwl/main.cwl' to 'file:///data/home/zhenhua.zhang/streamflow/streamflow/examples/mpi/cwl/main.cwl'
2023-03-17 16:04:00.295 INFO Processing workflow 2e8842c3-15ba-4e59-8fee-947de39ca7fb
2023-03-17 16:04:00.295 INFO Building workflow execution plan
2023-03-17 16:04:00.330 INFO COMPLETED Building of workflow execution plan
2023-03-17 16:04:00.330 INFO Running workflow 2e8842c3-15ba-4e59-8fee-947de39ca7fb
2023-03-17 16:04:00.426 INFO DEPLOYING helm-mpi
2023-03-17 16:04:22.051 INFO COMPLETED Deployment of helm-mpi
2023-03-17 16:04:22.298 INFO COPYING /data/home/zhenhua.zhang/streamflow/streamflow/examples/mpi/cwl/data/cs.cxx on local file-system to /home/zhenhua.zhang/tmp/85d69ceb-5d2d-42bb-91bc-47eb492ffec5/6f864633-a789-4433-8b72-206901962211/cs.cxx on location helm-mpi/openmpi/openmpi-rel-574588bf6b-hrjjw:openmpi
2023-03-17 16:04:22.504 INFO EXECUTING step /compile (job /compile/0) on location helm-mpi/openmpi/openmpi-rel-574588bf6b-hrjjw:openmpi into directory /home/zhenhua.zhang/tmp/79022d8f-319d-4c6c-856f-a5d0151f0fa4:
mpicxx \
-O3 \
-o \
cs \
/home/zhenhua.zhang/tmp/85d69ceb-5d2d-42bb-91bc-47eb492ffec5/6f864633-a789-4433-8b72-206901962211/cs.cxx
2023-03-17 16:04:23.656 INFO COMPLETED Step /compile
2023-03-17 16:04:23.944 INFO COPYING /home/zhenhua.zhang/tmp/79022d8f-319d-4c6c-856f-a5d0151f0fa4/cs on location helm-mpi/openmpi/openmpi-rel-574588bf6b-hrjjw:openmpi to /home/zhenhua.zhang/tmp/410468d9-63dd-4981-ae79-7dac1fd0e2ec/c549d3c3-e74a-41f0-b836-c17675dc79da/cs on location helm-mpi/openmpi/openmpi-rel-574588bf6b-v8mls:openmpi
2023-03-17 16:04:24.132 ERROR Error transferring file /home/zhenhua.zhang/tmp/79022d8f-319d-4c6c-856f-a5d0151f0fa4/cs in location openmpi-rel-574588bf6b-hrjjw:openmpi to /home/zhenhua.zhang/tmp/410468d9-63dd-4981-ae79-7dac1fd0e2ec/c549d3c3-e74a-41f0-b836-c17675dc79da/cs in location helm-mpi/openmpi/openmpi-rel-574588bf6b-v8mls:openmpi
Traceback (most recent call last):
File "/home/zhenhua.zhang/miniconda3/envs/cwl/lib/python3.9/site-packages/streamflow/workflow/step.py", line 1458, in run
token=await self.transfer(job, token),
File "/home/zhenhua.zhang/miniconda3/envs/cwl/lib/python3.9/site-packages/streamflow/cwl/step.py", line 551, in transfer
return token.update(await self._transfer_value(job, token.value))
File "/home/zhenhua.zhang/miniconda3/envs/cwl/lib/python3.9/site-packages/streamflow/cwl/step.py", line 360, in _transfer_value
return await self._update_file_token(job, token_value)
File "/home/zhenhua.zhang/miniconda3/envs/cwl/lib/python3.9/site-packages/streamflow/cwl/step.py", line 458, in _update_file_token
raise WorkflowExecutionException(
streamflow.core.exception.WorkflowExecutionException: Error transferring file /home/zhenhua.zhang/tmp/79022d8f-319d-4c6c-856f-a5d0151f0fa4/cs in location openmpi-rel-574588bf6b-hrjjw:openmpi to /home/zhenhua.zhang/tmp/410468d9-63dd-4981-ae79-7dac1fd0e2ec/c549d3c3-e74a-41f0-b836-c17675dc79da/cs in location helm-mpi/openmpi/openmpi-rel-574588bf6b-v8mls:openmpi
2023-03-17 16:04:24.138 INFO SKIPPED Step /execute
2023-03-17 16:04:24.139 INFO UNDEPLOYING helm-mpi
WARNING: Kubernetes configuration file is group-readable. This is insecure. Location: /home/zhenhua.zhang/.kube/config-streamflow
WARNING: Kubernetes configuration file is world-readable. This is insecure. Location: /home/zhenhua.zhang/.kube/config-streamflow
release "openmpi-rel" uninstalled
2023-03-17 16:04:24.428 INFO COMPLETED Undeployment of helm-mpi
2023-03-17 16:04:24.517 ERROR FAILED Workflow execution
Traceback (most recent call last):
File "/home/zhenhua.zhang/miniconda3/envs/cwl/lib/python3.9/site-packages/streamflow/main.py", line 258, in main
asyncio.run(_async_run(args))
File "/home/zhenhua.zhang/miniconda3/envs/cwl/lib/python3.9/asyncio/runners.py", line 44, in run
return loop.run_until_complete(main)
File "/home/zhenhua.zhang/miniconda3/envs/cwl/lib/python3.9/asyncio/base_events.py", line 647, in run_until_complete
return future.result()
File "/home/zhenhua.zhang/miniconda3/envs/cwl/lib/python3.9/site-packages/streamflow/main.py", line 166, in _async_run
await asyncio.gather(*workflow_tasks)
File "/home/zhenhua.zhang/miniconda3/envs/cwl/lib/python3.9/site-packages/streamflow/cwl/main.py", line 74, in main
output_tokens = await executor.run()
File "/home/zhenhua.zhang/miniconda3/envs/cwl/lib/python3.9/site-packages/streamflow/workflow/executor.py", line 135, in run
raise WorkflowExecutionException("FAILED Workflow execution")
streamflow.core.exception.WorkflowExecutionException: FAILED Workflow execution
$ cat streamflow_jk.yml
version: v1.0
workflows:
  master:
    type: cwl
    config:
      file: cwl/main.cwl
      settings: cwl/config.yml
    bindings:
      - step: /compile
        target:
          deployment: helm-mpi
          service: openmpi
      - step: /execute
        target:
          deployment: helm-mpi
          locations: 2
          service: openmpi
deployments:
  dc-mpi:
    type: docker-compose
    config:
      files:
        - environment/docker-compose/docker-compose.yml
      compatibility: true
      projectName: openmpi
  helm-mpi:
    type: helm
    config:
      chart: environment/helm/openmpi
      kubeconfig: /home/zhenhua.zhang/.kube/config-streamflow
      releaseName: openmpi-rel
      transferBufferSize: 10240
      namespace: streamflow
    workdir: /home/zhenhua.zhang/tmp
Hi @zhangzhenhuajack, I reopened the issue, but I am not able to reproduce the error anymore. Could you please run streamflow in debug mode?
streamflow run --debug streamflow_jk.yml
In this way, I can get a bit more information to reproduce it. Thank you.
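A debug log like this gets long quickly, so it can help to pull out each ERROR record together with the remote commands logged just before it. Below is a minimal sketch, assuming the log line layout seen in this thread (timestamp, level, message). The function name `commands_before_errors` and the shortened sample paths are made up for illustration and are not part of StreamFlow.

```python
import re

# Matches records such as "2023-03-20 09:39:19.649 ERROR Error transferring file ..."
LOG_LINE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) (DEBUG|INFO|WARNING|ERROR) (.*)$"
)


def commands_before_errors(lines, context=2):
    """Return (timestamp, message, previous_commands) for every ERROR record."""
    recent = []  # rolling window of the latest "EXECUTING command" messages
    report = []
    for line in lines:
        match = LOG_LINE.match(line.rstrip("\n"))
        if match is None:
            continue  # traceback lines carry no timestamp, skip them
        timestamp, level, message = match.groups()
        if level == "DEBUG" and message.startswith("EXECUTING command"):
            recent = (recent + [message])[-context:]
        elif level == "ERROR":
            report.append((timestamp, message, list(recent)))
    return report


# Illustrative sample with shortened, hypothetical paths and pod names.
sample = [
    "2023-03-20 09:39:19.480 DEBUG EXECUTING command ln -snf /src/cs /dst/cs 2>&1 on pod-a",
    "2023-03-20 09:39:19.521 INFO COPYING /src/cs on location pod-a to /dst/cs on location pod-b",
    '2023-03-20 09:39:19.567 DEBUG EXECUTING command test -f "/dst/cs" && sha1sum "/dst/cs" 2>&1 on pod-b',
    "2023-03-20 09:39:19.649 ERROR Error transferring file /src/cs in location pod-a to /dst/cs in location pod-b",
    "Traceback (most recent call last):",
]

report = commands_before_errors(sample)
for timestamp, message, commands in report:
    print(timestamp, message)
    for command in commands:
        print("    preceded by:", command)
```

To use it on a real run, redirect the output of the debug command to a file and pass the opened file object as `lines`.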
$ streamflow run --debug streamflow_jk.yml
Resolved 'cwl/main.cwl' to 'file:///data/home/zhenhua.zhang/streamflow/streamflow/examples/mpi/cwl/main.cwl'
2023-03-20 09:38:05.790 INFO Processing workflow 6bd446a7-3b12-4280-9dbe-d222c1d581b8
2023-03-20 09:38:05.790 INFO Building workflow execution plan
2023-03-20 09:38:05.790 DEBUG Translating Workflow /
2023-03-20 09:38:05.791 DEBUG Translating WorkflowStep /compile
2023-03-20 09:38:05.791 DEBUG Translating CommandLineTool /compile
2023-03-20 09:38:05.791 DEBUG Translating WorkflowStep /execute
2023-03-20 09:38:05.792 DEBUG Translating CommandLineTool /execute
2023-03-20 09:38:05.805 INFO COMPLETED Building of workflow execution plan
2023-03-20 09:38:05.805 INFO Running workflow 6bd446a7-3b12-4280-9dbe-d222c1d581b8
2023-03-20 09:38:05.806 DEBUG Step /num_processes-injector received inputs ['0']
2023-03-20 09:38:05.807 DEBUG Step /source_file-injector received inputs ['0']
2023-03-20 09:38:05.807 DEBUG Retrieving available locations for job /num_processes-injector/0 on __LOCAL__.
2023-03-20 09:38:05.807 DEBUG Available locations for job /num_processes-injector/0 on __LOCAL__ are ['__LOCAL__'].
2023-03-20 09:38:05.808 DEBUG Job /num_processes-injector/0 allocated locally
2023-03-20 09:38:05.808 DEBUG Retrieving available locations for job /source_file-injector/0 on __LOCAL__.
2023-03-20 09:38:05.808 DEBUG Available locations for job /source_file-injector/0 on __LOCAL__ are ['__LOCAL__'].
2023-03-20 09:38:05.808 DEBUG Job /source_file-injector/0 allocated locally
2023-03-20 09:38:05.808 DEBUG COMPLETED Step __deploy__/helm-mpi
2023-03-20 09:38:05.808 DEBUG COMPLETED Step __deploy__/__LOCAL__
2023-03-20 09:38:05.809 DEBUG Job /num_processes-injector/0 changed status to RUNNING
2023-03-20 09:38:05.809 DEBUG Job /source_file-injector/0 changed status to RUNNING
2023-03-20 09:38:05.885 DEBUG COMPLETED Step /num_processes-injector/__schedule__
2023-03-20 09:38:05.885 DEBUG COMPLETED Step /source_file-injector/__schedule__
2023-03-20 09:38:05.889 DEBUG Job /num_processes-injector/0 changed status to COMPLETED
2023-03-20 09:38:05.889 DEBUG Step /num_processes-token-transformer received inputs ['0']
2023-03-20 09:38:05.889 DEBUG Step /num_processes-injector received inputs ['0']
2023-03-20 09:38:05.890 DEBUG COMPLETED Step /num_processes-injector
2023-03-20 09:38:05.890 DEBUG Step /num_processes-token-transformer received inputs ['0']
2023-03-20 09:38:05.890 DEBUG COMPLETED Step /num_processes-token-transformer
2023-03-20 09:38:05.890 DEBUG Job /source_file-injector/0 changed status to COMPLETED
2023-03-20 09:38:05.890 DEBUG Step /source_file-token-transformer received inputs ['0']
2023-03-20 09:38:05.891 DEBUG Step /source_file-injector received inputs ['0']
2023-03-20 09:38:05.891 DEBUG COMPLETED Step /source_file-injector
2023-03-20 09:38:05.892 DEBUG Step /compile/source_file-token-transformer received inputs ['0']
2023-03-20 09:38:05.892 DEBUG Step /source_file-token-transformer received inputs ['0']
2023-03-20 09:38:05.892 DEBUG COMPLETED Step /source_file-token-transformer
2023-03-20 09:38:05.893 DEBUG Step /compile/__transfer__/source_file received inputs ['0']
2023-03-20 09:38:05.893 DEBUG Step /compile/__schedule__ received inputs ['0']
2023-03-20 09:38:05.893 DEBUG Step /compile/source_file-token-transformer received inputs ['0']
2023-03-20 09:38:05.893 DEBUG Retrieving available locations for job /compile/0 on helm-mpi/openmpi.
2023-03-20 09:38:05.894 INFO DEPLOYING helm-mpi
2023-03-20 09:38:05.913 DEBUG COMPLETED Step /compile/source_file-token-transformer
2023-03-20 09:38:05.940 DEBUG EXECUTING helm --kubeconfig "/home/zhenhua.zhang/.kube/config-streamflow" --namespace "streamflow" --registry-config "/home/zhenhua.zhang/.config/helm/registry.json" --repository-cache "/home/zhenhua.zhang/.cache/helm/repository" --repository-config "/home/zhenhua.zhang/.config/helm/repositories.yaml" install --timeout "1000m" --wait openmpi-rel environment/helm/openmpi
2023-03-20 09:39:18.332 INFO COMPLETED Deployment of helm-mpi
2023-03-20 09:39:18.350 DEBUG Available locations for job /compile/0 on helm-mpi/openmpi are ['openmpi-rel-574588bf6b-jh2kl:openmpi', 'openmpi-rel-574588bf6b-lh5g2:openmpi'].
2023-03-20 09:39:18.350 DEBUG Job /compile/0 allocated on location helm-mpi/openmpi/openmpi-rel-574588bf6b-jh2kl:openmpi
2023-03-20 09:39:18.351 DEBUG EXECUTING command mkdir -p /home/zhenhua.zhang/tmp/746ff75c-a2b8-4e69-8c1a-6ba429ea4bf3 /home/zhenhua.zhang/tmp/ab23f1f8-8a07-4d87-a936-2c3f210770d7 /home/zhenhua.zhang/tmp/eac62e7b-63d4-4e55-b970-1a5bb9b2d527 2>&1 on helm-mpi/openmpi/openmpi-rel-574588bf6b-jh2kl:openmpi
2023-03-20 09:39:18.455 DEBUG EXECUTING command mkdir -p /home/zhenhua.zhang/tmp/746ff75c-a2b8-4e69-8c1a-6ba429ea4bf3/04029cea-248f-4a47-8431-ecdc29a8db26 2>&1 on helm-mpi/openmpi/openmpi-rel-574588bf6b-jh2kl:openmpi
2023-03-20 09:39:18.456 DEBUG Step /compile/__schedule__ received inputs ['0']
2023-03-20 09:39:18.456 DEBUG COMPLETED Step /compile/__schedule__
2023-03-20 09:39:18.499 INFO COPYING /data/home/zhenhua.zhang/streamflow/streamflow/examples/mpi/cwl/data/cs.cxx on local file-system to /home/zhenhua.zhang/tmp/746ff75c-a2b8-4e69-8c1a-6ba429ea4bf3/04029cea-248f-4a47-8431-ecdc29a8db26/cs.cxx on location helm-mpi/openmpi/openmpi-rel-574588bf6b-jh2kl:openmpi
2023-03-20 09:39:18.527 DEBUG EXECUTING command test -f "/home/zhenhua.zhang/tmp/746ff75c-a2b8-4e69-8c1a-6ba429ea4bf3/04029cea-248f-4a47-8431-ecdc29a8db26/cs.cxx" && sha1sum "/home/zhenhua.zhang/tmp/746ff75c-a2b8-4e69-8c1a-6ba429ea4bf3/04029cea-248f-4a47-8431-ecdc29a8db26/cs.cxx" | awk '{print $1}' 2>&1 on helm-mpi/openmpi/openmpi-rel-574588bf6b-jh2kl:openmpi
2023-03-20 09:39:18.581 DEBUG Step /compile received inputs ['0']
2023-03-20 09:39:18.581 DEBUG Step /compile/__transfer__/source_file received inputs ['0']
2023-03-20 09:39:18.582 DEBUG Job /compile/0 started
2023-03-20 09:39:18.582 DEBUG Job /compile/0 changed status to RUNNING
2023-03-20 09:39:18.582 DEBUG Job /compile/0 inputs: {
"source_file": {
"basename": "cs.cxx",
"checksum": "sha1$958eebef8f8a9a7ab46e0009caffcd754d0f255a",
"class": "File",
"dirname": "/home/zhenhua.zhang/tmp/746ff75c-a2b8-4e69-8c1a-6ba429ea4bf3/04029cea-248f-4a47-8431-ecdc29a8db26",
"location": "file:///home/zhenhua.zhang/tmp/746ff75c-a2b8-4e69-8c1a-6ba429ea4bf3/04029cea-248f-4a47-8431-ecdc29a8db26/cs.cxx",
"nameext": ".cxx",
"nameroot": "cs",
"path": "/home/zhenhua.zhang/tmp/746ff75c-a2b8-4e69-8c1a-6ba429ea4bf3/04029cea-248f-4a47-8431-ecdc29a8db26/cs.cxx",
"size": 3037
}
}
2023-03-20 09:39:18.583 INFO EXECUTING step /compile (job /compile/0) on location helm-mpi/openmpi/openmpi-rel-574588bf6b-jh2kl:openmpi into directory /home/zhenhua.zhang/tmp/ab23f1f8-8a07-4d87-a936-2c3f210770d7:
mpicxx \
-O3 \
-o \
cs \
/home/zhenhua.zhang/tmp/746ff75c-a2b8-4e69-8c1a-6ba429ea4bf3/04029cea-248f-4a47-8431-ecdc29a8db26/cs.cxx
2023-03-20 09:39:18.583 DEBUG COMPLETED Step /compile/__transfer__/source_file
2023-03-20 09:39:18.583 DEBUG Step /compile received inputs ['0']
2023-03-20 09:39:18.583 DEBUG EXECUTING command cd /home/zhenhua.zhang/tmp/ab23f1f8-8a07-4d87-a936-2c3f210770d7 && export HOME="/home/zhenhua.zhang/tmp/ab23f1f8-8a07-4d87-a936-2c3f210770d7" && export TMPDIR="/home/zhenhua.zhang/tmp/eac62e7b-63d4-4e55-b970-1a5bb9b2d527" && mpicxx -O3 -o cs /home/zhenhua.zhang/tmp/746ff75c-a2b8-4e69-8c1a-6ba429ea4bf3/04029cea-248f-4a47-8431-ecdc29a8db26/cs.cxx 2>&1 on helm-mpi/openmpi/openmpi-rel-574588bf6b-jh2kl:openmpi for job /compile/0
2023-03-20 09:39:18.827 DEBUG EXECUTING command test -e "/home/zhenhua.zhang/tmp/ab23f1f8-8a07-4d87-a936-2c3f210770d7/cwl.output.json" 2>&1 on helm-mpi/openmpi/openmpi-rel-574588bf6b-jh2kl:openmpi
2023-03-20 09:39:18.874 DEBUG Job /compile/0 changed status to COMPLETED
2023-03-20 09:39:18.875 DEBUG EXECUTING command printf "%s\0" /home/zhenhua.zhang/tmp/ab23f1f8-8a07-4d87-a936-2c3f210770d7/cs | xargs -0 -I{} sh -c "if [ -e \"{}\" ]; then echo \"{}\"; fi" | sort 2>&1 on helm-mpi/openmpi/openmpi-rel-574588bf6b-jh2kl:openmpi
2023-03-20 09:39:18.917 DEBUG EXECUTING command test -e "/home/zhenhua.zhang/tmp/ab23f1f8-8a07-4d87-a936-2c3f210770d7/cs" && readlink -f "/home/zhenhua.zhang/tmp/ab23f1f8-8a07-4d87-a936-2c3f210770d7/cs" 2>&1 on helm-mpi/openmpi/openmpi-rel-574588bf6b-jh2kl:openmpi
2023-03-20 09:39:18.988 DEBUG EXECUTING command test -f "/home/zhenhua.zhang/tmp/ab23f1f8-8a07-4d87-a936-2c3f210770d7/cs" 2>&1 on helm-mpi/openmpi/openmpi-rel-574588bf6b-jh2kl:openmpi
2023-03-20 09:39:19.036 DEBUG EXECUTING command test -e "/home/zhenhua.zhang/tmp/ab23f1f8-8a07-4d87-a936-2c3f210770d7/cs" && readlink -f "/home/zhenhua.zhang/tmp/ab23f1f8-8a07-4d87-a936-2c3f210770d7/cs" 2>&1 on helm-mpi/openmpi/openmpi-rel-574588bf6b-jh2kl:openmpi
2023-03-20 09:39:19.080 DEBUG EXECUTING command find -L "/home/zhenhua.zhang/tmp/ab23f1f8-8a07-4d87-a936-2c3f210770d7/cs" -type f -exec ls -ln {} \+ | awk 'BEGIN {sum=0} {sum+=$5} END {print sum}'; 2>&1 on helm-mpi/openmpi/openmpi-rel-574588bf6b-jh2kl:openmpi
2023-03-20 09:39:19.158 DEBUG EXECUTING command test -f "/home/zhenhua.zhang/tmp/ab23f1f8-8a07-4d87-a936-2c3f210770d7/cs" && sha1sum "/home/zhenhua.zhang/tmp/ab23f1f8-8a07-4d87-a936-2c3f210770d7/cs" | awk '{print $1}' 2>&1 on helm-mpi/openmpi/openmpi-rel-574588bf6b-jh2kl:openmpi
2023-03-20 09:39:19.199 DEBUG EXECUTING command test -e "/home/zhenhua.zhang/tmp/ab23f1f8-8a07-4d87-a936-2c3f210770d7/cs" && readlink -f "/home/zhenhua.zhang/tmp/ab23f1f8-8a07-4d87-a936-2c3f210770d7/cs" 2>&1 on helm-mpi/openmpi/openmpi-rel-574588bf6b-jh2kl:openmpi
2023-03-20 09:39:19.241 DEBUG COMPLETED Job /compile/0 terminated
2023-03-20 09:39:19.241 DEBUG Step /execute/executable_file-token-transformer received inputs ['0', '0']
2023-03-20 09:39:19.242 DEBUG Step /execute/num_processes-token-transformer received inputs ['0', '0']
2023-03-20 09:39:19.242 DEBUG Step /execute/__transfer__/num_processes received inputs ['0']
2023-03-20 09:39:19.243 DEBUG Step /execute/num_processes-token-transformer received inputs ['0', '0']
2023-03-20 09:39:19.243 INFO COMPLETED Step /compile
2023-03-20 09:39:19.243 DEBUG COMPLETED Step /execute/num_processes-token-transformer
2023-03-20 09:39:19.244 DEBUG Step /execute/__transfer__/executable_file received inputs ['0']
2023-03-20 09:39:19.244 DEBUG Step /execute/__schedule__ received inputs ['0', '0']
2023-03-20 09:39:19.244 DEBUG Step /execute/executable_file-token-transformer received inputs ['0', '0']
2023-03-20 09:39:19.244 DEBUG Retrieving available locations for job /execute/0 on helm-mpi/openmpi.
2023-03-20 09:39:19.244 DEBUG Available locations for job /execute/0 on helm-mpi/openmpi are ['openmpi-rel-574588bf6b-jh2kl:openmpi', 'openmpi-rel-574588bf6b-lh5g2:openmpi'].
2023-03-20 09:39:19.245 DEBUG helm-mpi/openmpi/openmpi-rel-574588bf6b-jh2kl:openmpiJob /execute/0 allocated on locations , helm-mpi/openmpi/openmpi-rel-574588bf6b-lh5g2:openmpi
2023-03-20 09:39:19.245 DEBUG EXECUTING command mkdir -p /home/zhenhua.zhang/tmp/b1b6c462-4447-43aa-b74d-1282ad4db5e0 /home/zhenhua.zhang/tmp/ed879a2c-3e25-48bd-8bff-4a179f7948b4 /home/zhenhua.zhang/tmp/b77d0947-658a-4379-a905-e429d8dd66e1 2>&1 on helm-mpi/openmpi/openmpi-rel-574588bf6b-jh2kl:openmpi
2023-03-20 09:39:19.246 DEBUG EXECUTING command mkdir -p /home/zhenhua.zhang/tmp/b1b6c462-4447-43aa-b74d-1282ad4db5e0 /home/zhenhua.zhang/tmp/ed879a2c-3e25-48bd-8bff-4a179f7948b4 /home/zhenhua.zhang/tmp/b77d0947-658a-4379-a905-e429d8dd66e1 2>&1 on helm-mpi/openmpi/openmpi-rel-574588bf6b-lh5g2:openmpi
2023-03-20 09:39:19.246 DEBUG COMPLETED Step /execute/executable_file-token-transformer
2023-03-20 09:39:19.399 DEBUG EXECUTING command mkdir -p /home/zhenhua.zhang/tmp/b1b6c462-4447-43aa-b74d-1282ad4db5e0/3e0932b0-1bd3-4cb9-a689-0f3bd3eb981e 2>&1 on helm-mpi/openmpi/openmpi-rel-574588bf6b-jh2kl:openmpi
2023-03-20 09:39:19.401 DEBUG EXECUTING command mkdir -p /home/zhenhua.zhang/tmp/b1b6c462-4447-43aa-b74d-1282ad4db5e0/3e0932b0-1bd3-4cb9-a689-0f3bd3eb981e 2>&1 on helm-mpi/openmpi/openmpi-rel-574588bf6b-lh5g2:openmpi
2023-03-20 09:39:19.401 DEBUG Step /execute/__schedule__ received inputs ['0', '0']
2023-03-20 09:39:19.402 DEBUG COMPLETED Step /execute/__schedule__
2023-03-20 09:39:19.403 DEBUG Step /execute/__transfer__/num_processes received inputs ['0']
2023-03-20 09:39:19.403 DEBUG COMPLETED Step /execute/__transfer__/num_processes
2023-03-20 09:39:19.480 DEBUG EXECUTING command ln -snf /home/zhenhua.zhang/tmp/ab23f1f8-8a07-4d87-a936-2c3f210770d7/cs /home/zhenhua.zhang/tmp/b1b6c462-4447-43aa-b74d-1282ad4db5e0/3e0932b0-1bd3-4cb9-a689-0f3bd3eb981e/cs 2>&1 on helm-mpi/openmpi/openmpi-rel-574588bf6b-jh2kl:openmpi
2023-03-20 09:39:19.521 INFO COPYING /home/zhenhua.zhang/tmp/ab23f1f8-8a07-4d87-a936-2c3f210770d7/cs on location helm-mpi/openmpi/openmpi-rel-574588bf6b-jh2kl:openmpi to /home/zhenhua.zhang/tmp/b1b6c462-4447-43aa-b74d-1282ad4db5e0/3e0932b0-1bd3-4cb9-a689-0f3bd3eb981e/cs on location helm-mpi/openmpi/openmpi-rel-574588bf6b-lh5g2:openmpi
2023-03-20 09:39:19.567 DEBUG EXECUTING command test -f "/home/zhenhua.zhang/tmp/b1b6c462-4447-43aa-b74d-1282ad4db5e0/3e0932b0-1bd3-4cb9-a689-0f3bd3eb981e/cs" && sha1sum "/home/zhenhua.zhang/tmp/b1b6c462-4447-43aa-b74d-1282ad4db5e0/3e0932b0-1bd3-4cb9-a689-0f3bd3eb981e/cs" | awk '{print $1}' 2>&1 on helm-mpi/openmpi/openmpi-rel-574588bf6b-lh5g2:openmpi
2023-03-20 09:39:19.649 ERROR Error transferring file /home/zhenhua.zhang/tmp/ab23f1f8-8a07-4d87-a936-2c3f210770d7/cs in location openmpi-rel-574588bf6b-jh2kl:openmpi to /home/zhenhua.zhang/tmp/b1b6c462-4447-43aa-b74d-1282ad4db5e0/3e0932b0-1bd3-4cb9-a689-0f3bd3eb981e/cs in location helm-mpi/openmpi/openmpi-rel-574588bf6b-lh5g2:openmpi
Traceback (most recent call last):
File "/home/zhenhua.zhang/miniconda3/envs/cwl/lib/python3.9/site-packages/streamflow/workflow/step.py", line 1458, in run
token=await self.transfer(job, token),
File "/home/zhenhua.zhang/miniconda3/envs/cwl/lib/python3.9/site-packages/streamflow/cwl/step.py", line 551, in transfer
return token.update(await self._transfer_value(job, token.value))
File "/home/zhenhua.zhang/miniconda3/envs/cwl/lib/python3.9/site-packages/streamflow/cwl/step.py", line 360, in _transfer_value
return await self._update_file_token(job, token_value)
File "/home/zhenhua.zhang/miniconda3/envs/cwl/lib/python3.9/site-packages/streamflow/cwl/step.py", line 458, in _update_file_token
raise WorkflowExecutionException(
streamflow.core.exception.WorkflowExecutionException: Error transferring file /home/zhenhua.zhang/tmp/ab23f1f8-8a07-4d87-a936-2c3f210770d7/cs in location openmpi-rel-574588bf6b-jh2kl:openmpi to /home/zhenhua.zhang/tmp/b1b6c462-4447-43aa-b74d-1282ad4db5e0/3e0932b0-1bd3-4cb9-a689-0f3bd3eb981e/cs in location helm-mpi/openmpi/openmpi-rel-574588bf6b-lh5g2:openmpi
2023-03-20 09:39:19.650 DEBUG Step /execute received inputs ['0', '0']
2023-03-20 09:39:19.650 DEBUG Step result-collector-transformer received inputs ['0']
2023-03-20 09:39:19.650 DEBUG Step result-collector received inputs ['0']
2023-03-20 09:39:19.650 DEBUG Step result-collector/__schedule__ received inputs ['0']
2023-03-20 09:39:19.651 DEBUG FAILED Step /execute/__transfer__/executable_file
2023-03-20 09:39:19.651 INFO SKIPPED Step /execute
2023-03-20 09:39:19.651 DEBUG SKIPPED Step result-collector-transformer
2023-03-20 09:39:19.651 DEBUG SKIPPED Step result-collector
2023-03-20 09:39:19.651 DEBUG SKIPPED Step result-collector/__schedule__
2023-03-20 09:39:19.652 INFO UNDEPLOYING helm-mpi
2023-03-20 09:39:19.652 DEBUG EXECUTING helm --kubeconfig "/home/zhenhua.zhang/.kube/config-streamflow" --namespace "streamflow" --registry-config "/home/zhenhua.zhang/.config/helm/registry.json" --repository-cache "/home/zhenhua.zhang/.cache/helm/repository" --repository-config "/home/zhenhua.zhang/.config/helm/repositories.yaml" uninstall --timeout "1000m" openmpi-rel
WARNING: Kubernetes configuration file is group-readable. This is insecure. Location: /home/zhenhua.zhang/.kube/config-streamflow
WARNING: Kubernetes configuration file is world-readable. This is insecure. Location: /home/zhenhua.zhang/.kube/config-streamflow
release "openmpi-rel" uninstalled
2023-03-20 09:39:19.928 INFO COMPLETED Undeployment of helm-mpi
2023-03-20 09:39:19.965 ERROR FAILED Workflow execution
Traceback (most recent call last):
File "/home/zhenhua.zhang/miniconda3/envs/cwl/lib/python3.9/site-packages/streamflow/main.py", line 258, in main
asyncio.run(_async_run(args))
File "/home/zhenhua.zhang/miniconda3/envs/cwl/lib/python3.9/asyncio/runners.py", line 44, in run
return loop.run_until_complete(main)
File "/home/zhenhua.zhang/miniconda3/envs/cwl/lib/python3.9/asyncio/base_events.py", line 647, in run_until_complete
return future.result()
File "/home/zhenhua.zhang/miniconda3/envs/cwl/lib/python3.9/site-packages/streamflow/main.py", line 166, in _async_run
await asyncio.gather(*workflow_tasks)
File "/home/zhenhua.zhang/miniconda3/envs/cwl/lib/python3.9/site-packages/streamflow/cwl/main.py", line 74, in main
output_tokens = await executor.run()
File "/home/zhenhua.zhang/miniconda3/envs/cwl/lib/python3.9/site-packages/streamflow/workflow/executor.py", line 135, in run
raise WorkflowExecutionException("FAILED Workflow execution")
streamflow.core.exception.WorkflowExecutionException: FAILED Workflow execution
I think the error is in the file copy: Helm does not support copying files to a Kubernetes pod, so maybe the kubectl cp command is needed to copy a file from a location to a Kubernetes pod, or from a Kubernetes pod to a location.
I have another question: does a PVC need to be configured to solve the file sharing problem between a location and Kubernetes? The pod is running, but it cannot interact with the location's files. @GlassOfWhiskey
If StreamFlow could directly support kubectl commands, that might be more powerful than Helm for my own tasks. My thinking is: Helm focuses on managing YAML, while kubectl focuses on managing Kubernetes resources (such as pods and PVCs).
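To make the kubectl cp idea concrete, here is a minimal sketch that only builds the two command lines and does not run them. As far as I know, kubectl cp needs one side of the copy to be a local path and relies on tar being available in the container, so a pod-to-pod copy like the failing one has to be staged through the local machine. The helper name `kubectl_cp_via_local` and the staging path are hypothetical; the pod names, container, namespace and kubeconfig are taken from the debug log above. This is not how StreamFlow transfers files internally.

```python
import shlex


def kubectl_cp_via_local(src_pod, src_path, dst_pod, dst_path, *,
                         container, namespace, kubeconfig, staging_path):
    """Build the two kubectl cp invocations for a pod-to-pod copy staged locally."""
    base = ["kubectl", "--kubeconfig", kubeconfig, "--namespace", namespace,
            "cp", "--container", container]
    # Step 1: source pod -> local staging file.
    download = base + [f"{src_pod}:{src_path}", staging_path]
    # Step 2: local staging file -> destination pod.
    upload = base + [staging_path, f"{dst_pod}:{dst_path}"]
    return [download, upload]


commands = kubectl_cp_via_local(
    "openmpi-rel-574588bf6b-jh2kl",
    "/home/zhenhua.zhang/tmp/ab23f1f8-8a07-4d87-a936-2c3f210770d7/cs",
    "openmpi-rel-574588bf6b-lh5g2",
    "/home/zhenhua.zhang/tmp/b1b6c462-4447-43aa-b74d-1282ad4db5e0/3e0932b0-1bd3-4cb9-a689-0f3bd3eb981e/cs",
    container="openmpi",
    namespace="streamflow",
    kubeconfig="/home/zhenhua.zhang/.kube/config-streamflow",
    staging_path="/tmp/cs.staged",  # hypothetical local staging location
)

for argv in commands:
    print(shlex.join(argv))
```

Each list can be passed to `subprocess.run(argv, check=True)` to actually perform the copy by hand, which is a quick way to check whether the pods accept file transfers at all.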
Hi @zhangzhenhuajack,
would you please try again with streamflow==0.2.0.dev4, the newly released version?
I found and solved a bug that was causing file transfer issues on Kubernetes sometimes, due to some kind of race condition.
I'd like to know if the fix I implemented also solves the problem for you.
Oh BTW, StreamFlow 0.2.0.dev4 also has a new Kubernetes connector, which doesn't require Helm charts but only a list of Kubernetes .yaml manifests.
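On the race condition mentioned above: the thread does not describe the actual fix, so the following is only a generic sketch of one way to tolerate a destination file that is not fully visible yet, by re-reading its checksum a few times with a growing delay. It is not StreamFlow's implementation; `wait_for_checksum` and its parameters are hypothetical names, and the remote checksum reader is passed in as a plain callable.

```python
import hashlib
import time


def wait_for_checksum(read_remote_checksum, expected, attempts=5, delay=0.2,
                      sleep=time.sleep):
    """Poll a checksum reader until it returns `expected` or attempts run out."""
    for attempt in range(attempts):
        if read_remote_checksum() == expected:
            return True
        if attempt < attempts - 1:
            sleep(delay * (2 ** attempt))  # exponential backoff between polls
    return False


# Simulated destination: the file is missing twice, then fully written.
expected = hashlib.sha1(b"compiled binary").hexdigest()
answers = iter([None, None, expected])
arrived = wait_for_checksum(lambda: next(answers), expected, sleep=lambda _: None)
print("transfer verified:", arrived)
```

In a real check, the callable would run something like the `sha1sum` command seen in the debug log on the destination pod and return its output.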