Open milroy opened 4 years ago
The subgraph container will need to be chosen with removal performance in mind.
Upon further inspection, I don't think rem_dfv requires overloading or modification. The "root" vertex vtx_t u can be considered the root of the subgraph to be removed. It can be a subgraph of the job's resources and the functionality doesn't have to change.
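To illustrate the point about the root vertex, here is a minimal sketch on a toy containment tree. Everything in it (toy_graph_t, rem_subtree, the int-based vtx_t alias) is a hypothetical stand-in and not the flux-sched code; it only shows that an unchanged depth-first removal, when started at an interior vertex u, releases just the subtree below u.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <set>
#include <vector>

// Hypothetical stand-ins for illustration; not the flux-sched types.
using vtx_t = int;
using jobid_t = std::int64_t;

struct toy_graph_t {
    std::map<vtx_t, std::vector<vtx_t>> children;    // containment edges
    std::map<vtx_t, std::set<jobid_t>> allocations;  // jobs holding each vertex
};

// Depth-first visit: drop jobid from u and from every vertex below u.
// Returns the number of vertices that were actually released.
std::size_t rem_subtree (toy_graph_t &g, vtx_t u, jobid_t jobid)
{
    std::size_t released = 0;
    auto alloc = g.allocations.find (u);
    if (alloc != g.allocations.end ())
        released += alloc->second.erase (jobid);
    auto kids = g.children.find (u);
    if (kids != g.children.end ()) {
        for (vtx_t child : kids->second)
            released += rem_subtree (g, child, jobid);
    }
    return released;
}
```

With a tree 0 -> {1, 2}, 1 -> {3, 4} fully allocated to one job, calling rem_subtree at vertex 1 releases vertices 1, 3 and 4 and leaves 0 and 2 allocated. The traversal itself is the same as it would be for the full job; only the starting vertex differs.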
However, this means the various root types passed to the dfu_impl_t::rem_dfv call need to be handled. I think the overloading needs to occur here: https://github.com/flux-framework/flux-sched/blob/af4447ffd27400cc33a908c62201380a98075571/resource/traversers/dfu.cpp#L278
which in turn will require modification here: https://github.com/flux-framework/flux-sched/blob/af4447ffd27400cc33a908c62201380a98075571/resource/modules/resource_match.cpp#L712
to handle removal at a specific vertex of the job's resources.
Since run_remove is called by cancel_request_cb and this is not a job cancellation, we probably need a new method in DFU.
Note that since the functionality proposed here is conceptually more aligned with a job cancellation, modifying a job cancellation would be a possible direction. However, in the future deallocation will need to detach resources within the job's resource graph (rather than the parent or top level's).
With that in mind I think deallocation by jobid and vertex subset fits best within graph detach (issue #554).
Doing some triage from the bottom of the stack: I'm pretty sure this is partial release and covered by your recent work, right @milroy?
To permit deallocation of a subgraph of a running job's resources, dfu_impl_t will need to be overloaded: https://github.com/flux-framework/flux-sched/blob/af4447ffd27400cc33a908c62201380a98075571/resource/traversers/dfu_impl_update.cpp#L386
It will need to take a third argument (vector, associative container, etc.) that represents the subgraph to be removed.
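A rough sketch of what such an overload could look like, using a toy graph rather than the real traverser. All names here (toy_graph_t, release_job, the int-based vtx_t alias) are hypothetical, and the choice of std::unordered_set for the extra argument is an assumption made only because it gives constant-time average membership checks during the traversal.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <set>
#include <unordered_set>
#include <vector>

// Hypothetical stand-ins for illustration; not the flux-sched types.
using vtx_t = int;
using jobid_t = std::int64_t;

struct toy_graph_t {
    std::map<vtx_t, std::vector<vtx_t>> children;    // containment edges
    std::map<vtx_t, std::set<jobid_t>> allocations;  // jobs holding each vertex
};

// Existing-style form: release every vertex the job holds at or below root.
std::size_t release_job (toy_graph_t &g, vtx_t root, jobid_t jobid)
{
    std::size_t released = 0;
    auto alloc = g.allocations.find (root);
    if (alloc != g.allocations.end ())
        released += alloc->second.erase (jobid);
    auto kids = g.children.find (root);
    if (kids != g.children.end ()) {
        for (vtx_t child : kids->second)
            released += release_job (g, child, jobid);
    }
    return released;
}

// Overload with an extra argument: release only the vertices named in subset.
// The whole tree below root is still visited; subset only filters the release.
std::size_t release_job (toy_graph_t &g, vtx_t root, jobid_t jobid,
                         const std::unordered_set<vtx_t> &subset)
{
    std::size_t released = 0;
    if (subset.count (root) != 0) {
        auto alloc = g.allocations.find (root);
        if (alloc != g.allocations.end ())
            released += alloc->second.erase (jobid);
    }
    auto kids = g.children.find (root);
    if (kids != g.children.end ()) {
        for (vtx_t child : kids->second)
            released += release_job (g, child, jobid, subset);
    }
    return released;
}
```

A std::vector would also work as the extra argument, but each membership check would then be linear in the subset size unless the vector is kept sorted and searched with std::binary_search, which is the kind of removal-performance tradeoff the container choice has to account for.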