common-workflow-language / cwltool

Common Workflow Language reference implementation
https://cwltool.readthedocs.io/
Apache License 2.0

location url not executed correctly with --cachedir CACHE #1842

Open jjkoehorst opened 1 year ago

jjkoehorst commented 1 year ago

Expected Behavior

When running cwltool --cachedir CACHE WORKFLOW.cwl with a YAML input file such as:

cwl:tool: ../../workflows/workflow_ngtax.cwl
for_read_len: 100
forward_primer: '[AG]GGATTAGATACCC'
forward_reads: 
   - class: File
     location: http://download.systemsbiology.nl/unlock/cwl/test_data/amplicon/forward.fastq.gz
memory: 6000
minimum_threshold: 0.1
reference_db: 
   class: File
   location: /unlock/references/databases/Silva/SILVA_138.1_SSURef_tax_silva.fasta.gz
rev_read_len: 100
fragment: V3
reverse_primer: CGAC[AG][AG]CCATGCA[ACGT]CACCT
reverse_reads: 
   class: File
   location: http://download.systemsbiology.nl/unlock/cwl/test_data/amplicon/reverse.fastq.gz
sample: UNLOCK_NGTAX_TEST
primersRemoved: true

the run fails with the following error:

INFO [workflow ] start
INFO [workflow ] starting step prepare_fasta_db
INFO [step prepare_fasta_db] start
INFO [workflow prepare_fasta_db] start
INFO [workflow prepare_fasta_db] starting step prepare_fasta_db_2
INFO [step prepare_fasta_db_2] start
ERROR Unexpected exception
Traceback (most recent call last):
  File "/Volumes/Git/m-unlock/cwl/venv/lib/python3.11/site-packages/cwltool/workflow.py", line 459, in job
    yield from self.embedded_tool.job(
  File "/Volumes/Git/m-unlock/cwl/venv/lib/python3.11/site-packages/cwltool/command_line_tool.py", line 835, in job
    visit_class([cachebuilder.files, cachebuilder.bindings], ("File"), _checksum)
  File "/Volumes/Git/m-unlock/cwl/venv/lib/python3.11/site-packages/cwltool/utils.py", line 216, in visit_class
    visit_class(d, cls, op)
  File "/Volumes/Git/m-unlock/cwl/venv/lib/python3.11/site-packages/cwltool/utils.py", line 216, in visit_class
    visit_class(d, cls, op)
  File "/Volumes/Git/m-unlock/cwl/venv/lib/python3.11/site-packages/cwltool/utils.py", line 211, in visit_class
    op(rec)
  File "/Volumes/Git/m-unlock/cwl/venv/lib/python3.11/site-packages/cwltool/process.py", line 1348, in compute_checksums
    with fs_access.open(location, "rb") as f:
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Volumes/Git/m-unlock/cwl/venv/lib/python3.11/site-packages/cwltool/stdfsaccess.py", line 38, in open
    return open(self._abs(fn), mode)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: 'http://download.systemsbiology.nl/unlock/cwl/test_data/human_small.fa.gz'
ERROR [step prepare_fasta_db_2] Cannot make job: [Errno 2] No such file or directory: 'http://download.systemsbiology.nl/unlock/cwl/test_data/human_small.fa.gz'
INFO [workflow prepare_fasta_db] completed permanentFail
WARNING [step prepare_fasta_db] completed permanentFail
INFO [workflow ] completed permanentFail
{
    "filtered_reads": null,
    "filtlong_log": null,
    "kraken2_folder": null,
    "nanoplot_filtered_folder": null,
    "nanoplot_unfiltered_folder": null,
    "reference_filter_longreads_log": null
}WARNING Final process status is permanentFail
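
Reading the traceback, the failure seems to come from the URL being handed to Python's built-in open() as if it were a local path (stdfsaccess.py, line 38). Below is a minimal sketch of that behaviour outside cwltool, assuming a POSIX system. try_local_open is a hypothetical helper for illustration, not part of cwltool:

```python
def try_local_open(location: str) -> str:
    """Open `location` with the built-in open() and report what happened.

    Hypothetical helper for illustration only; this is not cwltool code.
    """
    try:
        with open(location, "rb"):
            return "opened"
    except OSError as exc:
        # open() does no URL handling: the string is treated as a
        # filesystem path, so the lookup fails.
        return type(exc).__name__


url = "http://download.systemsbiology.nl/unlock/cwl/test_data/human_small.fa.gz"
print(try_local_open(url))
```

On Linux and macOS this reports FileNotFoundError, which matches the [Errno 2] in the log above. Whether cwltool is meant to fetch the file before this point when --cachedir is set is not something the traceback alone can confirm.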

Actual Behavior

The expectation is that cwltool downloads the file from the location URL and executes the workflow; instead, the run fails as shown above.

Workflow Code


https://workflowhub.eu/workflows/45

Full Traceback

Identical to the log shown above.

Your Environment

jjkoehorst commented 1 year ago

This might be a duplicate of https://github.com/common-workflow-language/cwltool/issues/1828

vedran-kasalica commented 3 months ago

I am having the same issue with cwltool 3.1.20240508115724:

[2024-05-17 11:03:12] ERROR Unexpected exception
Traceback (most recent call last):
  File "home/workflomics_benchmarker/.venv/lib/python3.11/site-packages/cwltool/workflow.py", line 461, in job
    yield from self.embedded_tool.job(
  File "home/workflomics_benchmarker/.venv/lib/python3.11/site-packages/cwltool/command_line_tool.py", line 838, in job
    visit_class([cachebuilder.files, cachebuilder.bindings], ("File"), _checksum)
  File "home/workflomics_benchmarker/.venv/lib/python3.11/site-packages/cwltool/utils.py", line 218, in visit_class
    visit_class(d, cls, op)
  File "home/workflomics_benchmarker/.venv/lib/python3.11/site-packages/cwltool/utils.py", line 218, in visit_class
    visit_class(d, cls, op)
  File "home/workflomics_benchmarker/.venv/lib/python3.11/site-packages/cwltool/utils.py", line 213, in visit_class
    op(rec)
  File "home/workflomics_benchmarker/.venv/lib/python3.11/site-packages/cwltool/process.py", line 1352, in compute_checksums
    with fs_access.open(location, "rb") as f:
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "home/workflomics_benchmarker/.venv/lib/python3.11/site-packages/cwltool/stdfsaccess.py", line 38, in open
    return open(self._abs(fn), mode)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: 'https://raw.githubusercontent.com/Workflomics/DemoKit/main/data/inputs/2021-10-8_Ecoli.mzML'
[2024-05-17 11:03:12] ERROR [step Comet_01] Cannot make job: [Errno 2] No such file or directory: 'https://raw.githubusercontent.com/Workflomics/DemoKit/main/data/inputs/2021-10-8_Ecoli.mzML'
[2024-05-17 11:03:12] INFO [workflow ] completed permanentFail
{
    "output_1": null,
    "output_2": null,
    "output_3": null,
    "output_4": null
}[2024-05-17 11:03:12] WARNING Final process status is permanentFail

The URL is specified under location; see the Comet CWL description.
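
Until this is resolved, one possible workaround is to download remote File inputs yourself and point location at the local copies before calling cwltool. Below is a rough sketch, assuming the job file has already been loaded into a dict (for example with a YAML parser) and that the destination directory exists. localize_remote_files and its fetch parameter are hypothetical names, not cwltool API:

```python
import os
import urllib.request
from urllib.parse import urlparse


def localize_remote_files(obj, dest_dir, fetch=urllib.request.urlretrieve):
    """Return a copy of `obj` with http(s) File locations replaced by local paths.

    Hypothetical helper, not part of cwltool. `fetch(url, path)` performs the
    download; it defaults to urllib.request.urlretrieve and can be swapped out.
    `dest_dir` is assumed to exist already.
    """
    if isinstance(obj, list):
        return [localize_remote_files(item, dest_dir, fetch) for item in obj]
    if isinstance(obj, dict):
        out = {k: localize_remote_files(v, dest_dir, fetch) for k, v in obj.items()}
        loc = out.get("location")
        is_remote = isinstance(loc, str) and urlparse(loc).scheme in ("http", "https")
        if out.get("class") == "File" and is_remote:
            # Keep the remote file name; files sharing a basename would collide.
            local = os.path.join(dest_dir, os.path.basename(urlparse(loc).path))
            fetch(loc, local)
            out["location"] = local
        return out
    return obj  # scalars (numbers, strings, booleans) pass through unchanged
```

The rewritten dict can then be written back to a job file and passed to cwltool in place of the original. Local paths such as the reference_db entry in the first report are left untouched.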