Open freeseek opened 2 years ago
Hello,
I think the problem is solved in release 78 of Cromwell. I ran into it when running the mocha workflow on a Cromwell 74 server. After updating to 78, the workflow completed the previously failing tasks.
+--------------------+---------+------------+---------------------+
| TASK | ATTEMPT | ELAPSED | STATUS |
+--------------------+---------+------------+---------------------+
| batch_id_lines | 1 | 5m34.003s | Done |
| batch_sorted_tsv | 1 | 4m45.648s | Done |
| csv2bam (Scatter) | - | 10m51.838s | 1/1 Done | 0 Failed |
| green_idat_lines | 1 | 5m34.003s | Done |
| gtc | 1 | 5m27.897s | Done |
| gtc_reheader | 1 | 5m26.257s | Failed |
| idat | 1 | 5m27.897s | Done |
| idat2gtc (Scatter) | - | 10m58.206s | 0/1 Done | 1 Failed |
| red_idat_lines | 1 | 5m34.002s | Done |
| ref_scatter | 1 | 4m39.394s | Done |
| sample_id_lines | 1 | 5m34.003s | Done |
| sample_sorted_tsv | 1 | 4m42.453s | Done |
+--------------------+---------+------------+---------------------+
❗You have 1 issue:
- Workflow failed
- GCS output file not found: gs://bioinfo-dev-temp/mocha/a224bb3e-fc20-4b0a-8846-ee2b4b603933/call-gtc_reheader/maps
- GCS output file not found: gs://bioinfo-dev-temp/mocha/a224bb3e-fc20-4b0a-8846-ee2b4b603933/call-idat2gtc/shard-0/gtcs
+----------------------------+---------+-----------------+-----------------------+
| TASK | ATTEMPT | ELAPSED | STATUS |
+----------------------------+---------+-----------------+-----------------------+
| batch_id_lines | 1 | 16.37s | Done |
| batch_sorted_tsv | 1 | 15.288s | Done |
| call_rate_lines | 1 | 5m34.525s | Done |
| computed_gender_lines | 1 | 5m34.523s | Done |
| csv2bam (Scatter) | - | 49.958s | 1/1 Done | 0 Failed |
| flatten_sample_id_lines | 1 | 5m29.56s | Done |
| get_max_nrecords (Scatter) | - | 5m32.076s | 1/1 Done | 0 Failed |
| green_idat_lines | 1 | 16.38s | Done |
| green_idat_tsv | 1 | 5m33.602s | Done |
| gtc | 1 | 10.602s | Done |
| gtc2vcf (Scatter) | - | 8m15.392s | 1/1 Done | 0 Failed |
| gtc_reheader | 1 | 4m16.907s | Done |
| gtc_tsv | 1 | 5m30.578s | Done |
| idat | 1 | 7.606s | Done |
| idat2gtc (Scatter) | - | 9m46.928s | 1/1 Done | 0 Failed |
| mocha_calls_tsv | 1 | 5m19.305941005s | Running |
| mocha_stats_tsv | 1 | 5m19.304938136s | Running |
| red_idat_lines | 1 | 16.386s | Done |
| red_idat_tsv | 1 | 5m33.603s | Done |
| ref_scatter | 1 | 17.728s | Done |
| sample_id_lines | 1 | 16.383s | Done |
| sample_id_split_tsv | 1 | 5m31.462s | Done |
| sample_sorted_tsv | 1 | 11.924s | Done |
| sample_tsv | 1 | 5m26.14s | Done |
| vcf_concat (Scatter) | - | 5m32.467s | 1/1 Done | 0 Failed |
| vcf_import (Scatter) | - | 8m16.609s | 1/1 Done | 0 Failed |
| vcf_merge (Scatter) | - | 2h6m53.926s | 23/23 Done | 0 Failed |
| vcf_mocha (Scatter) | - | 8m19.96s | 1/1 Done | 0 Failed |
| vcf_phase (Scatter) | - | 3h7m39.033s | 23/23 Done | 0 Failed |
| vcf_qc (Scatter) | - | 2h8m6.051s | 23/23 Done | 0 Failed |
| vcf_scatter (Scatter) | - | 5m25.444s | 1/1 Done | 0 Failed |
| vcf_split (Scatter) | - | 2h7m37.183s | 23/23 Done | 0 Failed |
| write_tsv | 1 | 5m10.124926865s | Running |
| xcl_vcf_concat | 1 | 5m28.883s | Done |
+----------------------------+---------+-----------------+-----------------------+
Note: some tasks have a duration of only a few seconds because I'm using call caching.
This workflow, when run on Google Cloud using Cromwell 74, will succeed.
When run on Google Cloud using Cromwell 75, the workflow will fail with the message:
However, the directory is correctly delocalized:
The delocalization script is aware that `d` is a directory:
But somehow a new check was included in Cromwell 75 that wants `d` to be a file even if it is delocalized as a directory.
This breaks the only workaround available in Cromwell for delocalizing a list of files that is not determined a priori, before the start of the task. Notice that `glob()` is not an acceptable alternative, as `glob()` does not provide control over the order of the output files.
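To make the ordering concern concrete, here is a minimal Python sketch. It is not Cromwell or WDL code; it only imitates the situation under the assumption that a glob-based output comes back in lexicographic order, as with shell expansion. The file names and the helpers `write_outputs`, `glob_order`, and `demo` are hypothetical and chosen for illustration.

```python
from __future__ import annotations

import tempfile
from pathlib import Path


def write_outputs(directory: Path, names: list[str]) -> list[Path]:
    """Write one file per name into `directory`; return paths in the given order."""
    directory.mkdir(parents=True, exist_ok=True)
    paths = []
    for name in names:
        path = directory / name
        path.write_text(name + "\n")
        paths.append(path)
    return paths


def glob_order(directory: Path) -> list[str]:
    """Return file names as a glob-based output might list them.

    Assumption: results are ordered lexicographically, like shell expansion.
    """
    return sorted(p.name for p in directory.glob("*"))


def demo() -> tuple[list[str], list[str]]:
    # Hypothetical order in which the task wants to report its outputs,
    # e.g. the order of the input samples.
    wanted = ["sample2.txt", "sample10.txt", "sample1.txt"]
    with tempfile.TemporaryDirectory() as tmp:
        d = Path(tmp) / "d"  # stands in for the delocalized directory `d`
        written = [p.name for p in write_outputs(d, wanted)]
        globbed = glob_order(d)
    return written, globbed


if __name__ == "__main__":
    written, globbed = demo()
    print("explicit list:", written)
    print("glob order:   ", globbed)
```

The explicit list keeps the order chosen by the task, while the glob-style listing reorders the files by name, so downstream steps that pair outputs with inputs by position would be mismatched.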