reanahub / reana-demo-worldpopulation

REANA example - parametrised Jupyter notebooks

investigate worldpopulation-cwl-htc failure #48

Open tiborsimko opened 3 years ago

tiborsimko commented 3 years ago

Seen with 0.8.0-alpha.3:

$ reana-client status -w worldpopulation-cwl-htc.1
NAME                      RUN_NUMBER   CREATED               STARTED               ENDED                 STATUS   PROGRESS
worldpopulation-cwl-htc   1            2021-10-14T16:13:08   2021-10-14T19:21:31   2021-10-14T19:23:37   failed   1/1     

$ reana-client logs -w worldpopulation-cwl-htc.1
...
ERROR [step worldpopulation] Output is missing expected field file:///var/mnt/de7c46db-6f3d-4a72-af37-d889f615c20f/workflow.json#main/worldpopulation/plot
2021-10-14 19:23:36,811 | cwltool | worldpopulation | ERROR | [step worldpopulation] Output is missing expected field file:///var/mnt/de7c46db-6f3d-4a72-af37-d889f615c20f/workflow.json#main/worldpopulation/plot
WARNING [step worldpopulation] completed permanentFail
...
Input Notebook:  /pool/condor/dir_22577/cwl/docker_stagedir/stg69a93fb5-cffa-4a24-9a24-3b973f76cf5d/worldpopulation.ipynb
Output Notebook: /dev/null
Generating grammar tables from /usr/local/lib/python3.6/site-packages/blib2to3/Grammar.txt
Writing grammar tables to /pool/condor/dir_22577/cwl/outdir/_2_nonjw/.cache/black/21.7b0/Grammar3.6.8.final.0.pickle
Writing failed: [Errno 2] No such file or directory: '/pool/condor/dir_22577/cwl/outdir/_2_nonjw/.cache/black/21.7b0/tmpu_0kydtj'
Generating grammar tables from /usr/local/lib/python3.6/site-packages/blib2to3/PatternGrammar.txt
Writing grammar tables to /pool/condor/dir_22577/cwl/outdir/_2_nonjw/.cache/black/21.7b0/PatternGrammar3.6.8.final.0.pickle
Writing failed: [Errno 2] No such file or directory: '/pool/condor/dir_22577/cwl/outdir/_2_nonjw/.cache/black/21.7b0/tmp1otqq618'

tiborsimko commented 3 years ago

The retry worked well:

$ reana-client ls -w worldpopulation-cwl-htc.2 | grep png
outputs/plot.png                12838   2021-10-15T08:37:01

So this may have been a transient HTC issue.

Nothing much to do here, except perhaps to look into whether we could catch this sort of problem and tell the user to retry later.
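A minimal sketch of what such a check might look like, assuming we simply scan the job log text for signatures seen in the failed run above. The function names `looks_transient` and `retry_hint` and the pattern list are hypothetical and not part of REANA. Whether these messages really indicate a transient HTC problem is also an assumption: a missing output field can just as well come from a genuine workflow error, so the result should only be shown as a hint.

```python
import re
from typing import Optional

# Hypothetical signatures, copied from the failed run's log above.
# Treating them as markers of a transient backend problem is an assumption.
TRANSIENT_PATTERNS = [
    re.compile(r"Writing failed: \[Errno 2\] No such file or directory"),
    re.compile(r"Output is missing expected field"),
]


def looks_transient(log_text: str) -> bool:
    """Return True if the log contains any of the known signatures."""
    return any(pattern.search(log_text) for pattern in TRANSIENT_PATTERNS)


def retry_hint(workflow: str, log_text: str) -> Optional[str]:
    """Return a hint for the user, or None if no signature matched."""
    if not looks_transient(log_text):
        return None
    return (
        f"Workflow {workflow} may have failed because of a transient "
        "compute backend problem. Consider retrying later."
    )
```

For example, `retry_hint("worldpopulation-cwl-htc.1", logs)` would return the hint for the log shown above and `None` for a log without any of the listed messages.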