Closed pagrubel closed 7 months ago
I tested this out. We can use a yml input file that is not in the workflow directory; however, the main CWL file must be the one in the workflow directory. I get the error below when trying to use a different main CWL file. That may not make sense anyway, but we should handle the error better. Notice that I'm passing the clamr-ffmpeg-build directory, which has all the files for the workflow, but selecting the clamr_wf.cwl in the current directory.
ll
total 23
drwxrwxr-x 2 pagrubel pagrubel 4096 Aug 18 17:02 clamr-ffmpeg-build
-rw-rw-r-- 1 pagrubel pagrubel 404 Aug 21 13:56 clamr_job.yml
-rw-rw-r-- 1 pagrubel pagrubel 1882 Aug 21 13:55 clamr_wf.cwl
-rw-r--r-- 1 pagrubel pagrubel 4791 Aug 21 15:03 ffmpeg_stderr.txt
drwxrwxr-x 2 pagrubel pagrubel 4096 Aug 21 15:03 graphics_output
-rw-rw-r-- 1 pagrubel pagrubel 3215 Aug 18 16:49 lorem.txt
-rw-r--r-- 1 pagrubel pagrubel 302 Aug 21 13:52 occur0.txt
-rw-r--r-- 1 pagrubel pagrubel 229 Aug 21 13:52 occur1.txt
-rw-rw-r-- 1 pagrubel pagrubel 10240 Aug 21 13:53 out.tgz
-rw-rw-r-- 1 pagrubel pagrubel 64 Aug 21 15:03 total_execution_time.log
(hpc-beeflow-YDRVf3zF-py3.9) (base) pagrubel@darwin-fe1 beeworkdir2$ beeflow submit clamrb clamr-ffmpeg-build clamr_wf.cwl clamr_job.yml ~/beeworkdir2
Detected directory instead of packaged workflow. Packaging Directory...
Traceback (most recent call last):
File "/vast/home/pagrubel/.cache/pypoetry/virtualenvs/hpc-beeflow-YDRVf3zF-py3.9/lib/python3.9/site-packages/schema_salad/fetcher.py", line 98, in fetch_text
with open(urllib.request.url2pathname(str(path)), encoding="utf-8") as fp:
FileNotFoundError: [Errno 2] No such file or directory: '/vast/home/pagrubel/beeworkdir2/clamr.cwl'
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/vast/home/pagrubel/.cache/pypoetry/virtualenvs/hpc-beeflow-YDRVf3zF-py3.9/bin/beeflow", line 6, in <module>
sys.exit(main())
File "/vast/home/pagrubel/BEE/BEE/beeflow/client/bee_client.py", line 543, in main
app()
File "/vast/home/pagrubel/.cache/pypoetry/virtualenvs/hpc-beeflow-YDRVf3zF-py3.9/lib/python3.9/site-packages/typer/main.py", line 289, in __call__
File "/vast/home/pagrubel/.cache/pypoetry/virtualenvs/hpc-beeflow-YDRVf3zF-py3.9/lib/python3.9/site-packages/typer/main.py", line 280, in __call__
File "/vast/home/pagrubel/.cache/pypoetry/virtualenvs/hpc-beeflow-YDRVf3zF-py3.9/lib/python3.9/site-packages/click/core.py", line 1157, in __call__
File "/vast/home/pagrubel/.cache/pypoetry/virtualenvs/hpc-beeflow-YDRVf3zF-py3.9/lib/python3.9/site-packages/click/core.py", line 1078, in main
File "/vast/home/pagrubel/.cache/pypoetry/virtualenvs/hpc-beeflow-YDRVf3zF-py3.9/lib/python3.9/site-packages/click/core.py", line 1688, in invoke
File "/vast/home/pagrubel/.cache/pypoetry/virtualenvs/hpc-beeflow-YDRVf3zF-py3.9/lib/python3.9/site-packages/click/core.py", line 1434, in invoke
File "/vast/home/pagrubel/.cache/pypoetry/virtualenvs/hpc-beeflow-YDRVf3zF-py3.9/lib/python3.9/site-packages/click/core.py", line 783, in invoke
File "/vast/home/pagrubel/.cache/pypoetry/virtualenvs/hpc-beeflow-YDRVf3zF-py3.9/lib/python3.9/site-packages/typer/main.py", line 607, in wrapper
File "/vast/home/pagrubel/BEE/BEE/beeflow/client/bee_client.py", line 206, in submit
workflow, tasks = parser.parse_workflow(workflow_id, str(main_cwl_path),
File "/vast/home/pagrubel/BEE/BEE/beeflow/common/parser/parser.py", line 120, in parse_workflow
tasks = [self.parse_step(step, workflow_id) for step in self.cwl.steps]
File "/vast/home/pagrubel/BEE/BEE/beeflow/common/parser/parser.py", line 120, in <listcomp>
tasks = [self.parse_step(step, workflow_id) for step in self.cwl.steps]
File "/vast/home/pagrubel/BEE/BEE/beeflow/common/parser/parser.py", line 139, in parse_step
step_cwl = cwl_parser.load_document(step_run)
File "/vast/home/pagrubel/.cache/pypoetry/virtualenvs/hpc-beeflow-YDRVf3zF-py3.9/lib/python3.9/site-packages/cwl_utils/parser/cwl_v1_2.py", line 15494, in load_document
return _document_load(
File "/vast/home/pagrubel/.cache/pypoetry/virtualenvs/hpc-beeflow-YDRVf3zF-py3.9/lib/python3.9/site-packages/cwl_utils/parser/cwl_v1_2.py", line 605, in _document_load
return _document_load_by_url(
File "/vast/home/pagrubel/.cache/pypoetry/virtualenvs/hpc-beeflow-YDRVf3zF-py3.9/lib/python3.9/site-packages/cwl_utils/parser/cwl_v1_2.py", line 637, in _document_load_by_url
text = loadingOptions.fetcher.fetch_text(url)
File "/vast/home/pagrubel/.cache/pypoetry/virtualenvs/hpc-beeflow-YDRVf3zF-py3.9/lib/python3.9/site-packages/schema_salad/fetcher.py", line 103, in fetch_text
raise ValidationException(str(err)) from err
schema_salad.exceptions.ValidationException: [Errno 2] No such file or directory: '/vast/home/pagrubel/beeworkdir2/clamr.cwl'
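One way the error could be handled better is to look through the exception chain for the underlying FileNotFoundError and report only the missing path. This is a minimal sketch, not the current bee_client code: `describe_missing_file` is a hypothetical helper, and it only assumes what the traceback above shows, namely that the final exception was raised `from` a FileNotFoundError.

```python
def describe_missing_file(err, main_cwl):
    """Return a short message if `err` was caused by a missing file, else None.

    Hypothetical helper: walks the __cause__/__context__ chain so it works
    whatever exception type wraps the original FileNotFoundError.
    """
    seen = set()
    cur = err
    while cur is not None and id(cur) not in seen:
        if isinstance(cur, FileNotFoundError):
            return (
                f"Could not load '{cur.filename}' while parsing '{main_cwl}'. "
                "Check that the step CWL files sit next to the main CWL file."
            )
        seen.add(id(cur))
        cur = cur.__cause__ or cur.__context__
    return None
```

The submit command could call this in an `except` block around the parse step, print the message and exit when it returns a string, and re-raise when it returns None.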
Selecting a different yml file does work:
beeflow submit clamrb clamr-ffmpeg-build clamr-ffmpeg-build/clamr_wf.cwl clamr_job.yml ~/beeworkdir2
Detected directory instead of packaged workflow. Packaging Directory...
Package clamr-ffmpeg-build.tgz created successfully
Workflow submitted! Your workflow id is b183d0.
I changed time_steps to 500 in the yml file and got the expected results: fewer files in graphics_output and a smaller movie. I also verified the yml file under ~/.beeflow/workflows.
So the use case should be: if the user wants to use a main cwl or yml different from the one in the workflow dir, it should be copied to the temporary workflow directory and should end up in the archive.
I looked at this a bit more. The problem with just using a different main cwl is the parsing order: the entire CWL specification is parsed before the temporary dir containing the new main cwl is created, so the other files are missing. We need to discuss whether we want to require all the cwl files to be in the dir with the alternate main cwl. If so, this will work and we just need to modify the documentation.
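If we do require the step files to live next to the alternate main cwl, a pre-flight check could report what is missing before the parser runs. The sketch below is only an illustration: `missing_step_files` is a hypothetical helper, and it assumes each step uses a plain `run: some_file.cwl` line (no inline step definitions, no URLs), which it finds with a regular expression rather than a real CWL parser.

```python
import re
from pathlib import Path

# Matches lines such as "    run: clamr.cwl" (optionally quoted).
RUN_LINE = re.compile(
    r"^\s*run:\s*['\"]?([^'\"#\s]+\.cwl)['\"]?\s*$", re.MULTILINE
)


def missing_step_files(main_cwl):
    """List step files named in `main_cwl` that are not beside it."""
    main_cwl = Path(main_cwl)
    text = main_cwl.read_text(encoding="utf-8")
    base = main_cwl.resolve().parent  # relative step paths resolve from here
    return [name for name in RUN_LINE.findall(text)
            if not (base / name).is_file()]
```

For the failing submit above, this would be expected to report clamr.cwl, since that file exists in clamr-ffmpeg-build but not beside the clamr_wf.cwl in the current directory.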
Addressed in #743
We need to clarify how this works. In the documentation it says:
Additionally, if the main_cwl and yaml files are not in the workflow directory, they will be copied into a temporary copy of the workflow directory before packaging. Compare this with the previous example.
When I tried to use a main_cwl file and a yml file that aren't in the directory specified on the submit line, it couldn't find the step cwl files. I am going to change the documentation so the example works, but we should decide what is happening here and show a use case.
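For discussion, here is one ordering that would match what the documentation describes: build the temporary copy of the workflow directory first, copy the external main cwl and yml into it, and only then parse the staged main cwl so the step files are found beside it. This is a sketch under assumptions, not the current implementation; `stage_workflow` and the `bee_wf_` prefix are hypothetical names.

```python
import shutil
import tempfile
from pathlib import Path


def stage_workflow(workflow_dir, main_cwl, job_yml):
    """Copy the workflow dir to a temp location and add external inputs.

    Returns (staged_dir, staged_main_cwl, staged_job_yml). The caller would
    parse staged_main_cwl and package staged_dir.
    """
    workflow_dir = Path(workflow_dir).resolve()
    tmp_root = Path(tempfile.mkdtemp(prefix="bee_wf_"))
    staged_dir = tmp_root / workflow_dir.name
    shutil.copytree(workflow_dir, staged_dir)

    staged = []
    for src in (Path(main_cwl), Path(job_yml)):
        dest = staged_dir / src.name
        # Files already at the top of the workflow dir came along with
        # copytree; anything else is copied in (replacing a same-named file).
        if src.resolve().parent != workflow_dir:
            shutil.copy2(src, dest)
        staged.append(dest)
    return staged_dir, staged[0], staged[1]
```

With this ordering the original workflow directory is left untouched, and the alternate main cwl and yml end up in whatever archive is built from the staged directory.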