CarragherLab / cptools2

Running cellprofiler on eddie3 / SGE clusters

Jobs are failing due to missing data #4

Closed Swarchal closed 7 years ago

Swarchal commented 7 years ago

Jobs are failing due to the destaging array job removing the image directory while the analysis job is still trying to access the image data.

This could either be caused by destaging deleting:

Not sure how it could be deleting the correct directory too early unless there's a bug in SGE's hold_jid_ad, which is unlikely.
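
For reference, a minimal sketch of how the three array jobs would normally be chained with hold_jid_ad. The job names, script names and task count are made up for illustration and are not necessarily what cptools2 generates; only the qsub flags (-N, -t, -hold_jid_ad) are standard SGE options.

```python
def qsub_cmd(name, script, n_tasks, hold_on=None):
    """Build the qsub argument list for an SGE array job (hypothetical helper)."""
    cmd = ["qsub", "-N", name, "-t", f"1-{n_tasks}"]
    if hold_on is not None:
        # -hold_jid_ad: task i of this job waits for task i of the named array job
        cmd += ["-hold_jid_ad", hold_on]
    cmd.append(script)
    return cmd


n_tasks = 3  # illustrative task count
staging = qsub_cmd("staging", "staging.sh", n_tasks)
analysis = qsub_cmd("analysis", "analysis.sh", n_tasks, hold_on="staging")
destaging = qsub_cmd("destaging", "destaging.sh", n_tasks, hold_on="analysis")

for cmd in (staging, analysis, destaging):
    print(" ".join(cmd))
```

With this layout, destaging task i should only start once analysis task i has ended, so an early delete of the correct directory would need the dependency itself to misbehave.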

Swarchal commented 7 years ago

It might not be the destaging after all. The analysis jobs may instead be failing because the data is not actually there, i.e. the staging jobs are not transferring the data completely. This might be because the run-time limit is too short.
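
One way to tell the two cases apart would be to have each analysis task check its input before starting. The sketch below is only an assumption about how that could look: `missing_files` is a hypothetical helper, not part of cptools2, and the file names are invented. As far as I understand, SGE releases a hold when the predecessor task ends rather than only when it succeeds, so a staging task that was killed part-way would still let its analysis task run.

```python
import os
import tempfile


def missing_files(expected_paths):
    """Return the expected paths that are absent or empty (hypothetical helper)."""
    missing = []
    for path in expected_paths:
        # A file that was never copied, or was truncated to zero bytes, counts as missing
        if not os.path.isfile(path) or os.path.getsize(path) == 0:
            missing.append(path)
    return missing


# Small demonstration with one staged file and one that never arrived
with tempfile.TemporaryDirectory() as tmp:
    present = os.path.join(tmp, "img_001.tif")
    with open(present, "wb") as f:
        f.write(b"\x00")
    absent = os.path.join(tmp, "img_002.tif")
    result = missing_files([present, absent])
    print(result)
```

If the list is non-empty at the start of an analysis task, the staging task did not finish its transfer, which would rule out the destaging job as the cause.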

Swarchal commented 7 years ago

Caused by staging jobs being killed by the run-time limit before they have finished.
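
A sketch of requesting a longer run-time limit for the staging array job through SGE's h_rt resource. The 12 hour value, the task count and the script name are placeholders chosen for illustration, not the values used in the actual fix.

```python
def format_h_rt(hours):
    """Format a whole number of hours as HH:MM:SS for `-l h_rt=` (hypothetical helper)."""
    if hours < 1:
        raise ValueError("run-time limit must be at least one hour")
    return f"{hours:02d}:00:00"


def staging_qsub_cmd(script, n_tasks, hours):
    """Build a qsub argument list for the staging array job with an explicit run-time limit."""
    return [
        "qsub",
        "-t", f"1-{n_tasks}",
        "-l", f"h_rt={format_h_rt(hours)}",  # hard run-time limit per task
        script,
    ]


cmd = staging_qsub_cmd("staging.sh", 10, 12)  # placeholder values
print(" ".join(cmd))
```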