Problem: after a batch job completes, we lose the KVS content and hence all the information about what ran in the instance.
We do have the flux batch --dump option:

    --dump=[FILE]
        When the job script is complete, archive the Flux instance's KVS
        content to FILE, which should have a suffix known to
        libarchive(3), and may be a mustache template as described above
        for --output. The content may be unarchived directly or examined
        within a test instance started with the flux-start --recovery
        option. If FILE is unspecified, flux-{{jobid}}-dump.tgz is
        used.

Should we consider enabling this by default?
We could consider garbage collecting the KVS content before writing it out.
Also maybe some tooling beyond flux start --recovery could be developed to allow the dump file to be queried.
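
As a rough illustration of what such tooling might look like, here is a minimal Python sketch that queries a dump file without starting a recovery instance. It relies only on the man page statement that the content may be unarchived directly, and it assumes the dump is a tar archive that Python's tarfile module can read (as the default .tgz suffix suggests). The function names list_dump_keys and read_dump_value are hypothetical, and how KVS keys map to archive entry names is not something this sketch assumes: it simply reports whatever entry names the archive contains.

```python
import tarfile


def list_dump_keys(dump_path, prefix=""):
    """Return the sorted names of regular-file entries in a dump archive.

    Assumption: the dump is a tar archive (optionally compressed) that
    tarfile can open; entry names are reported exactly as stored.
    """
    with tarfile.open(dump_path, "r:*") as archive:
        return sorted(
            member.name
            for member in archive.getmembers()
            if member.isfile() and member.name.startswith(prefix)
        )


def read_dump_value(dump_path, name):
    """Return the raw bytes stored under one archive entry name.

    Raises KeyError if the entry does not exist, and ValueError if the
    entry is not a regular file (for example a directory or link).
    """
    with tarfile.open(dump_path, "r:*") as archive:
        handle = archive.extractfile(name)
        if handle is None:
            raise ValueError(f"{name!r} is not a regular file entry")
        return handle.read()
```

For example, list_dump_keys("flux-dump.tgz", "job") would list the entries whose names start with "job" (the file name and prefix here are placeholders). A real tool would probably want to decode values and understand the KVS layout, which this sketch deliberately leaves out.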