gatk-workflows / five-dollar-genome-analysis-pipeline

Workflows used for WGS data processing -- replaced by https://github.com/gatk-workflows/gatk4-genome-processing-pipeline
https://gatk.broadinstitute.org/hc/en-us
BSD 3-Clause "New" or "Revised" License

sleep after creating files #26

Closed: digrigor closed this issue 4 years ago

digrigor commented 4 years ago

Hi,

I think it would be a good idea to add sleep commands after the steps where the pipeline creates text, TSV, or similar files and then has to open them immediately.

For example, the GetBwaVersion task creates its text file with: sed 's/Version: //' > bwa_version.txt

Some filesystems are not fast enough to make the file available immediately (in this case it is read right away by read_string("bwa_version.txt")), and this causes a workflow failure.

The error I was getting was: IOException: Could not read from ...

With a sleep added, the filesystem is ready by the time the file is read, and my workflow runs smoothly: sed 's/Version: //' > bwa_version.txt; sleep 5
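
In case it helps anyone applying the same workaround locally, here is a minimal sketch of it as a reusable shell helper. It is not part of the pipeline. The function name write_then_settle, the SETTLE_SECONDS variable, and the placeholder input are all made up for illustration, and the sync call is only an assumption about what might help; I have not confirmed it makes a difference on any particular filesystem.

```shell
#!/bin/sh
# Sketch of the "write, then pause" workaround; names and defaults are illustrative.
set -eu

# Hypothetical knob: seconds to pause after writing (the fix above used 5).
SETTLE_SECONDS="${SETTLE_SECONDS:-5}"

write_then_settle() {
    out="$1"
    # Read stdin and strip the "Version: " prefix, as in the original command.
    sed 's/Version: //' > "$out"
    # Ask the kernel to flush buffered writes. Whether this helps on a given
    # network filesystem is an assumption, not a guarantee.
    sync
    # Fail inside the task if the file is missing or empty.
    [ -s "$out" ] || { echo "empty or missing: $out" >&2; return 1; }
    # Give slow filesystems time before anything else reads the file.
    sleep "$SETTLE_SECONDS"
}

# Placeholder input standing in for the real bwa output.
printf 'Version: x.y.z\n' | write_then_settle bwa_version.txt
```

Making the delay a variable means it can be raised on slow storage, or set to 0 where the problem does not occur, without editing each command.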

I had been facing this issue for a long time. I googled it, and it seems that many people are facing similar issues, so this fix could be the solution for them too.

Best, Dionysis

vdauwera commented 4 years ago

@digrigor, sorry for the lag and thanks for commenting. That's an interesting problem; I don't think I've heard of something like this before. I don't envision us modifying our production pipeline that way based on this, but feel free to modify your local version accordingly.