Kitware / HPCCloud

A Cloud/Web-Based Simulation Environment
https://kitware.github.io/HPCCloud/
Apache License 2.0

PyFR - add a retry to EC2 job commands. #630

Closed (aronhelser closed this 5 years ago)

aronhelser commented 5 years ago

PyFR submissions consistently fail on the first run (or the first few runs). On Amazon EC2 clusters, this change adds a retry that checks whether the second output file has been produced. Retries are visible in the job output and error logs.
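
For illustration only, here is a minimal sketch of the retry idea: rerun the job command until an expected output file shows up, and report each retry on both stdout and stderr so it appears in the job output and error logs. This is not the code in this PR. The function name `run_with_retry`, its parameters, and the attempt limit are all hypothetical, and the sketch assumes that the presence of a file on disk is a sufficient success check.

```python
import subprocess
import sys
import time
from pathlib import Path


def run_with_retry(command, expected_output, max_attempts=3, delay_seconds=0.0):
    """Run `command` until `expected_output` exists or attempts run out.

    Returns the 1-based number of the attempt that produced the file,
    or 0 if the file never appeared. All names here are hypothetical.
    """
    expected = Path(expected_output)
    for attempt in range(1, max_attempts + 1):
        # check=False: a non-zero exit code is handled by the file check below.
        result = subprocess.run(command, check=False)
        if expected.exists():
            return attempt
        # Report the failed attempt on both streams so it is visible in
        # the job output log and the job error log.
        message = (
            f"attempt {attempt}/{max_attempts}: {expected} not found "
            f"(exit code {result.returncode})"
        )
        print(message, flush=True)
        print(message, file=sys.stderr, flush=True)
        if attempt < max_attempts:
            time.sleep(delay_seconds)
    return 0
```

A caller would pass the job's command line as a list and the path of the output file it expects. A bounded attempt count keeps a job that can never succeed from looping forever.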

@patrickoleary @cjh1

codecov-io commented 5 years ago

Codecov Report

Merging #630 into master will increase coverage by 0.03%. The diff coverage is n/a.

@@            Coverage Diff             @@
##           master     #630      +/-   ##
==========================================
+ Coverage   57.09%   57.12%   +0.03%     
==========================================
  Files         135      135              
  Lines        6202     6202              
==========================================
+ Hits         3541     3543       +2     
+ Misses       2661     2659       -2

| Impacted Files | Coverage Δ |
|---|---|
| src/redux/actions/groups.js | 93.49% <0%> (+0.81%) :arrow_up: |
| src/network/remote/group.js | 13.51% <0%> (+2.7%) :arrow_up: |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update 034f99f...cc6a772.

aronhelser commented 5 years ago

Need to find a root cause.