Currently, if the prepare stage has completed for a set of `Records` but they haven't been uploaded yet (i.e. `is_uploaded == False`), the pipeline will still try to execute the prepare stage again.
I'd suggest introducing a quick check for whether the records are already prepared and, if so, skipping the prepare stage, similar to how this is done in the `preprocess` stage.
This only throws when the server is halted between the prepare and upload stages (which run sequentially), so it is very unlikely in practice. However, I ran the pipeline locally with the `upload` stage uncommented, which did result in this error. More importantly, it kills the overall process completely.
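
Something along these lines could work as the check. This is only a rough sketch: the `Record` dataclass, the `is_prepared` flag, `run_prepare_stage` and the `prepare` callable are hypothetical names used for illustration, not the actual pipeline API. Only `is_uploaded` comes from the existing code.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Record:
    # Hypothetical stand-in for the project's Record model.
    name: str
    is_prepared: bool = False  # assumed flag; the real model may track this differently
    is_uploaded: bool = False


def records_needing_prepare(records: Iterable[Record]) -> List[Record]:
    """Return only the records that have not been prepared yet."""
    return [record for record in records if not record.is_prepared]


def run_prepare_stage(
    records: Iterable[Record],
    prepare: Callable[[Record], None],
) -> List[Record]:
    """Prepare pending records and skip the ones that are already prepared.

    Returns the records that were actually prepared in this run.
    """
    pending = records_needing_prepare(records)
    if not pending:
        # Everything is already prepared (e.g. the server stopped before upload),
        # so there is nothing to do and nothing that can throw.
        return []
    for record in pending:
        prepare(record)
        record.is_prepared = True
    return pending
```

With this, a record that was prepared but not yet uploaded is passed over instead of being prepared a second time, so the upload stage can pick it up on the next run.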