Closed nsheff closed 2 months ago
Well, seems useful in this case, but is it more generally useful? The thing is we're usually using looper to submit bigger jobs, and not to run something directly. So it's almost a way of retrieving the pipeline output, but generally looper is submitting jobs and not aware of their output... this would be something in between, it's like a checker that could be used to make sure something happened, or something.
As I'm reading this again, my main thought is... shouldn't the pipeline be handling this? And, if using pypiper or pipestat, shouldn't the pipeline report a success or failure for looper to check?
We won't solve this issue specifically. Instead we will steer users to take advantage of pipestat integration. The user should do this check in their pipeline and have pipestat set flags (success, failed, completed) appropriately. Then, the user can use looper check to summarize their results.
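To make the "do the check in the pipeline" idea concrete, here is a rough sketch. It does not use the real pipestat API: `set_status` is a hypothetical callable standing in for whatever call the pipeline uses to record a flag, and `run_with_status` is a made-up helper name.

```python
from typing import Callable


def run_with_status(step: Callable[[], bool],
                    set_status: Callable[[str], None]) -> str:
    """Run one pipeline step and record a status flag for it.

    `step` returns True when its own check passed.
    `set_status` is a hypothetical stand-in for the status-reporting call.
    """
    try:
        ok = step()
    except Exception:
        # A crashing step counts as a failure, not as a missing flag.
        ok = False
    status = "completed" if ok else "failed"
    set_status(status)
    return status


if __name__ == "__main__":
    recorded = []
    run_with_status(lambda: True, recorded.append)
    run_with_status(lambda: False, recorded.append)
    print(recorded)
```

The point is only that the pipeline, not looper, decides which flag to set; looper then reads the flags afterwards.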
We recently added pre-submission hooks (#285). What about post-submission hooks to check on something?
In #289 @afrendeiro brought up the idea that we need some more powerful checks on submission.
Well, in the refgenomes submission, the first step is to use looper to download files... and then check their checksums; if the checksums don't match, we delete the file. This all happens in the wget piface.
https://github.com/refgenie/plantref/blob/master/pipeline_interfaces/wget_piface.yaml
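For readers who don't want to open the piface, the check-then-delete logic described above could look roughly like this in Python. This is not the code from the piface. It is a sketch that assumes MD5 checksums (the hash type isn't stated here), and `verify_or_delete` is a hypothetical name.

```python
import hashlib
import os


def verify_or_delete(path: str, expected_md5: str) -> bool:
    """Return True if the file matches the checksum; otherwise delete it.

    Assumes MD5; swap in another hashlib algorithm if needed.
    """
    md5 = hashlib.md5()
    with open(path, "rb") as fh:
        # Read in 1 MiB chunks so large downloads don't fill memory.
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            md5.update(chunk)
    if md5.hexdigest() == expected_md5.lower():
        return True
    os.remove(path)  # a mismatched download is discarded
    return False
```

Returning a boolean rather than silently deleting gives the caller something to report as a failure.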
Right now the code is this:
Unfortunately, looper isn't really aware of this and can't report these as failures. We've effectively written a little script into the piface. So, even if some of these downloads fail, I still see:
What if we could allow a post-submit script execution? Looper would execute the script, the script would run the checksum check, and looper would be aware of the results. This accomplishes 2 things better than the current system:
{submit_status: "success"}
or something... so looper could interpret this JSON to provide results to the user at the end of the submissions. Maybe we add a new namespace for job status? Or just add it to the looper namespace, then teach looper to use this in the epilogue?
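As a sketch of what "interpret this JSON" might mean on the looper side: the function below tallies the `submit_status` values printed by a hypothetical post-submit hook, one JSON object per job. Nothing here exists in looper; `summarize_hook_outputs` is an invented name, and it assumes the hook prints valid JSON with a quoted key.

```python
import json
from collections import Counter
from typing import Dict, Iterable


def summarize_hook_outputs(outputs: Iterable[str]) -> Dict[str, int]:
    """Count submit_status values across hook outputs.

    Anything that is not a JSON object with a submit_status key
    is counted as "unknown" instead of raising.
    """
    counts = Counter()
    for raw in outputs:
        try:
            status = str(json.loads(raw).get("submit_status", "unknown"))
        except (json.JSONDecodeError, AttributeError):
            status = "unknown"
        counts[status] += 1
    return dict(counts)


if __name__ == "__main__":
    hook_stdout = [
        '{"submit_status": "success"}',
        '{"submit_status": "failed"}',
        "wget: unable to resolve host",
    ]
    print(summarize_hook_outputs(hook_stdout))
```

A tally like this is the kind of thing the epilogue could print at the end of the submissions.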