Closed nzxwang closed 4 years ago
Hi nzxwang,
Do these failed stages still use CPU (i.e. are they in the 'R' state when viewed in top)?
I have a mincresample stage in my job that seems to be taking a while to process (see image below), and I want to know whether the stage has failed or is still running.
Kind Regards, Kyle Drover
That certainly looks like it's running, since it has 99.8% of a CPU. To the best of my knowledge, mincresample's exit codes are well understood by pydpiper.
@nzxwang - Pydpiper has a certain convention for stage return codes (admittedly not strictly enforced due to lack of a type system), but if your OpenCV-based stage is doing something different, it'd be better to wrap it in some sort of adapter rather than generalize Pydpiper's logic, so closing for now. (Not sure why this causes a hang though.) Feel free to post the code in question if you think that'd clarify.
Currently stages hang if the return code is False and the expected file is not produced by the stage.
fastCell uses opencv-python's cv.imwrite() function to write out its segmentations. If there is no disk space for the write, the function prints the underlying error and returns False, but the stage hangs because Pydpiper doesn't recognize False as a failure and keeps waiting for the output file.
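For anyone hitting the same thing, here is a minimal sketch of the kind of adapter suggested above. It is not fastCell or Pydpiper code: `checked_write` is a hypothetical helper name, and it assumes that the stage runs as its own Python process and that a non-zero exit status is what Pydpiper treats as a failed stage. The idea is to turn a False return value (or a missing output file) into an exception, which makes the process exit with a non-zero status instead of finishing quietly.

```python
import os
from typing import Any, Callable


def checked_write(write_fn: Callable[[str, Any], bool], path: str, image: Any) -> None:
    """Hypothetical adapter: call write_fn(path, image) and raise on failure.

    write_fn is expected to behave like OpenCV's imwrite, i.e. return a
    falsy value when the write fails instead of raising an exception.
    """
    ok = write_fn(path, image)
    if not ok:
        # The writer reported failure through its return value (e.g. disk full).
        raise OSError(f"writer returned {ok!r} while writing {path}")
    if not os.path.isfile(path):
        # The writer claimed success but the expected output is not there.
        raise OSError(f"expected output file was not created: {path}")
```

In the stage script this would be called as something like `checked_write(cv2.imwrite, out_path, segmentation)`, where `out_path` and `segmentation` stand in for whatever the stage already uses. If the exception is left uncaught, the Python interpreter exits with a non-zero status, so the failure shows up in the return code rather than only as a printed message.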