Prepare script before copying input files

Sanaz01 commented 2 months ago

I am using WDL workflows on PROOF. A good portion of errors can be resolved by looking at the execution/script file; however, this file is generated only after all input files have been copied into inputs/. Can the order be reversed? It would save a lot of time in debugging.

sckott commented 2 months ago

Can you say more, @Sanaz01? I don't fully understand what you mean. What are execution and script? And the order of what should be reversed?

Sanaz01 commented 2 months ago

Sure @sckott. As an example, when running PROOF, a shard-0 directory is generated at /cromwell-scratch/workflow_name/workflow_id/call-task_name/shard-0 with two sub-directories: execution and inputs. First, all files needed to run the task are copied to inputs, for each call-task_name and shard value (so the same file is copied to multiple directories). Only then is the execution/script file generated; it contains the entire script required to run the task, with input paths referencing the local inputs dir. To check whether the paths and static variables have been passed correctly in the script, we have to wait for all large input files to be copied to the inputs dir first. This leads to:

- unnecessary memory wastage if the run is not successful
- a time delay before you know that the script has an error

Possible suggestion: first generate the destination paths for the input files, then generate the script file with those paths, and only then copy the files to the new paths.
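
To make the suggested ordering concrete, here is a minimal Python sketch of the three steps. It is not how Cromwell is implemented; the function names (plan_input_paths, write_script, copy_inputs) and the flat inputs/ layout keyed by file name are assumptions made purely for illustration.

```python
import shutil
from pathlib import Path


def plan_input_paths(sources, call_dir):
    """Step 1: decide where each input will live, without copying anything.

    Hypothetical layout: call_dir/inputs/<file name>. Files sharing a name
    would collide here; a real system would need a more careful scheme.
    """
    inputs_dir = Path(call_dir) / "inputs"
    return {Path(src): inputs_dir / Path(src).name for src in sources}


def write_script(command_template, planned, call_dir):
    """Step 2: write execution/script using the planned paths.

    The destination files do not exist yet, but the script can already be
    inspected for wrong paths or wrong static variables.
    """
    script_path = Path(call_dir) / "execution" / "script"
    script_path.parent.mkdir(parents=True, exist_ok=True)
    localized = " ".join(str(dest) for dest in planned.values())
    script_path.write_text(command_template.format(inputs=localized) + "\n")
    return script_path


def copy_inputs(planned):
    """Step 3: the slow part, done only after the script has been written."""
    for src, dest in planned.items():
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dest)
```

With this ordering, a mistake visible in execution/script could be caught after step 2, before any large file is copied in step 3.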

sckott commented 2 months ago

Thanks @Sanaz01. I don't think this is a PROOF thing; it seems more like a Cromwell thing. @sitapriyamoorthi, can you shed some light on this?

sitapriyamoorthi commented 2 months ago

@sckott and @Sanaz01, I believe this is a Cromwell thing. It may be possible to not localize your inputs, based on these docs: https://cromwell.readthedocs.io/en/latest/optimizations/FileLocalization/. Having said that, it might depend on how the tasks are actually run, whether the files are accessed by other tasks, and how Cromwell has been configured on the HPC. @dtenenba or @vortexing might be able to shed some more light on this.

FredHutch / shiny-cromwell

Prepare script before copying input files #126