Open tiborsimko opened 7 years ago
yadage already has infrastructure in place to cache jobs, with pluggable mechanisms for validating and updating the cache, so we could develop a custom cache plugin for reana. The main issue is how to deal with absolute paths (e.g. to input files) that change between runs.
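One possible way around the absolute-path problem, as a rough sketch only: identify each input file by its content digest plus its path relative to the workspace, so the same file in a differently located workspace yields the same description. The names `file_digest`, `describe_input` and the `workspace` argument are hypothetical and are not part of yadage or reana.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=65536):
    """Return the SHA1 hex digest of a file's content, read in chunks."""
    h = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def describe_input(path, workspace):
    """Describe an input file independently of where the workspace lives.

    Hypothetical helper: the absolute prefix is dropped and only the
    workspace-relative path and the content digest are kept.
    Raises ValueError if the file is not inside the workspace.
    """
    resolved = Path(path).resolve()
    relative = resolved.relative_to(Path(workspace).resolve())
    return {"path": relative.as_posix(), "sha1": file_digest(resolved)}
```

With this, two runs whose workspaces sit under different absolute directories would still produce identical input descriptions as long as the file content and relative layout match.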
@tiborsimko should we try to spec out what an engine-independent cache would look like?
@lukasheinrich Yes, let's! Ideally after v0.5.0 is out.
It'll be useful (someday) to have a central server-side job cache that could speed up the rerunning of workflows and allow sharing results among workflows that start with the same initial steps.
Each job execution run could optionally store results under:
/reana/jobs/:someid/input/...
containing, say, SHA1 checksums of the input file and parameters, the container environment used, and the steps used. Any desired output files of the step command would be stored there as well.
If another job comes and uses the same environment image and the same input file and parameters, then the job execution could quickly return the pre-cached job output.
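To make the lookup idea concrete, here is a minimal sketch, assuming the cache id is a SHA1 over the environment image, the step command, the parameters and the input file digests. The function names, the JSON layout and the `output` subdirectory are assumptions for illustration; only the `/reana/jobs/:someid` idea comes from the proposal above.

```python
import hashlib
import json
from pathlib import Path


def job_cache_key(image, command, parameters, input_digests):
    """Derive a deterministic id from everything that defines a job.

    sort_keys=True makes the serialization independent of dict ordering,
    so equal jobs always map to the same key.
    """
    payload = json.dumps(
        {
            "image": image,
            "command": command,
            "parameters": parameters,
            "inputs": input_digests,
        },
        sort_keys=True,
    )
    return hashlib.sha1(payload.encode("utf-8")).hexdigest()


def lookup_cached_output(cache_root, key):
    """Return the cached output directory for a key, or None on a miss.

    Assumes a hypothetical layout <cache_root>/<key>/output.
    """
    candidate = Path(cache_root) / key / "output"
    return candidate if candidate.is_dir() else None
```

A job execution could compute the key before starting the container, call `lookup_cached_output("/reana/jobs", key)`, and only run the step on a miss. Changing the image, the command, any parameter or any input digest changes the key, so stale results would not be reused.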
This could be implemented at the level of the workflow.