scalems.call on raptor backend

eirrgang commented 1 year ago

This issue tracks the scalems package aspect of an issue in the workshop repository.

[x] Make sure scalems.call can target the raptor backend in support of https://github.com/SCALE-MS/workshop/issues/8. If raptor is enabled, add scheduler and TASK_EXECUTABLE to the TaskDescription.
~~Delay launch of the Worker (until inspecting the work load) so that cores are available for the TASK_EXECUTABLE task.~~ deferred
[x] Understand and handle environment variable replacement/updates. See #327 (need input from @andre-merzky)
[x] Make sure that we understand how to reconcile user-provided task resource requirements with the Master and Workers that we provision. (I'm going to need some advice from @andre-merzky on this part)
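For illustration, the first checklist item might look roughly like the sketch below. It uses a plain dict rather than a real RP TaskDescription, and the helper name `as_raptor_task`, the `"task.executable"` mode string, and the `"raptor.0000"` identifier are assumptions made for the example, not confirmed scalems or RP details.

```python
from typing import Optional

# Assumed stand-in for the RP constant referred to as TASK_EXECUTABLE in this issue.
TASK_EXECUTABLE = "task.executable"


def as_raptor_task(description: dict, raptor_id: Optional[str]) -> dict:
    """Return a copy of `description`, routed through raptor when it is enabled."""
    updated = dict(description)
    if raptor_id is not None:
        # Name the raptor master that should handle the task...
        updated["scheduler"] = raptor_id
        # ...and ask for the task to be run as a regular executable.
        updated["mode"] = TASK_EXECUTABLE
    return updated


base = {"executable": "python3", "arguments": ["-m", "some_module"]}
plain = as_raptor_task(base, raptor_id=None)
routed = as_raptor_task(base, raptor_id="raptor.0000")
```

When raptor is not enabled, the description passes through unchanged, so the same call site can serve both backends.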

eirrgang commented 1 year ago

Worker._dispatch_proc() is broken in RP 1.21.0, so a lateral move is not possible using the TASK_PROC mode. I will discuss options with @andre-merzky and @mturilli today.

Update: the issue is resolved, and we won't be using TASK_PROC.

eirrgang commented 1 year ago

Additional constraints on resource allocation

Pending further discussion (#302), we leave it as an exercise to the user to provision a Pilot that is adequate for the tasks to be submitted. Dispatching through raptor has a slight additional burden and warrants some updates to the scalems raptor lifetime management.

By the time the scalems.call.function_call_to_subprocess() call is made, the Worker(s) may have already started. By the time scalems.radical.runtime.subprocess_to_rp_task() executes, the Worker(s) have definitely started. We need to split the Worker launch from the Master launch and inspect the work load to decide how to provision the Worker(s).

As a first step, though, to facilitate the lateral move of scalems.call, we can provision one Worker with N-1 cores and raise an error if the submitted Task is incompatible.
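A minimal sketch of that first step, assuming a Task's core requirement is simply ranks times cores per rank. The function names and the error wording are hypothetical, not taken from scalems.

```python
def worker_cores(pilot_cores: int) -> int:
    """Cores left for a single Worker once the Master has taken one core."""
    if pilot_cores < 2:
        raise ValueError("The Pilot needs at least 2 cores: one for the Master, one for the Worker.")
    return pilot_cores - 1


def check_task_fits(ranks: int, cores_per_rank: int, pilot_cores: int) -> int:
    """Return the cores requested by the Task, or raise if the Worker is too small."""
    available = worker_cores(pilot_cores)
    requested = ranks * cores_per_rank
    if requested > available:
        raise ValueError(
            f"Task requests {requested} cores, but the Worker only has {available}."
        )
    return requested


# A 4-core Pilot leaves 3 cores for the Worker.
requested = check_task_fits(ranks=3, cores_per_rank=1, pilot_cores=4)
```

Rejecting the Task at submission time keeps the failure close to the user's call instead of leaving an unschedulable Task waiting in the Worker.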

The follow-up should rely on the new raptor protocol that @andre-merzky is working on, if at all possible, to manage Workers, or @eirrgang will be performing completely redundant work that is immediately obsolete.

The biggest short-term impact will be lack of flexibility with cores allocated to (OpenMP) threads versus ranks.

Additional notes from design discussion

Resource constraints:

the Master uses one of the cores available to the Pilot, so it is not available to Tasks.
Tasks cannot span Workers, so we need to make sure that we provision a sufficiently large Worker.
raptor does not support the memory/disk task constraints.
GPUs: gpus-per-rank is not well explored in raptor. Deviations are unknown.
nodes: Workers may span nodes, so this shouldn't be a problem.
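The constraints above could be checked before dispatch along these lines. This is only a sketch: the requirement keys (`mem_per_rank`, `lfs_per_rank`, `gpus_per_rank`) are assumed names for the memory, disk, and GPU settings, and `check_raptor_constraints` is a hypothetical helper.

```python
# Assumed keys for the memory and disk constraints that raptor does not support.
UNSUPPORTED_KEYS = ("mem_per_rank", "lfs_per_rank")


def check_raptor_constraints(requirements: dict) -> list:
    """Raise for unsupported constraints; return advisory notes for the rest."""
    unsupported = [key for key in UNSUPPORTED_KEYS if requirements.get(key)]
    if unsupported:
        raise ValueError(f"raptor does not support: {', '.join(unsupported)}")
    notes = []
    if requirements.get("gpus_per_rank"):
        # Not an error, but behavior may differ from a regular Task.
        notes.append("gpus_per_rank is not well explored in raptor")
    return notes


notes = check_raptor_constraints({"ranks": 2, "cores_per_rank": 1, "gpus_per_rank": 1})
```

Node counts are not checked because Workers may span nodes.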

Other set-up details:

pre_exec needs to happen on the Worker, not the Task. (The default scalems pre_exec is already handled this way. We can add a check that the user has not extended it until we can update the Worker provisioning.)
For 0th step, we can provision one Worker with all resources.
For immediate follow-up: Worker provisioning needs to be delayed, and carried out with respect to the work load.
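The pre_exec check mentioned above might be as simple as the following sketch. The helper name and the example pre_exec lines are hypothetical; the only assumption is that pre_exec is a sequence of shell command strings.

```python
def check_pre_exec(task_pre_exec, worker_pre_exec) -> None:
    """Raise if the Task asks for pre_exec lines that the Worker will not run."""
    extra = [line for line in task_pre_exec if line not in worker_pre_exec]
    if extra:
        raise ValueError(
            "pre_exec is applied on the Worker, so these Task entries would be ignored: "
            + "; ".join(extra)
        )


# Hypothetical default pre_exec, already applied when the Worker is provisioned.
default_pre_exec = ["source /path/to/venv/bin/activate"]

# A Task that only carries the default passes the check.
check_pre_exec(list(default_pre_exec), default_pre_exec)
```

Once Worker provisioning is delayed until the work load is known, a check like this could be replaced by merging the user's additions into the Worker's pre_exec.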

eirrgang commented 1 year ago

scalems.call was a workaround that wrapped a serialized function call into a command line executable task for dispatching through traditional RP executable Task execution. This was pursued to give us a chance to move forward with other development while refining raptor.
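To make the idea concrete, here is a generic sketch of wrapping a function call as a command line. It is not the scalems.call implementation: it assumes the function is importable by name and that its arguments and result are JSON-serializable, and `call_to_command` is a hypothetical helper.

```python
import json
import subprocess
import sys

# Script run by the child interpreter: decode the call, import the function, print the result.
RUNNER = (
    "import importlib, json, sys\n"
    "call = json.loads(sys.argv[1])\n"
    "module, _, name = call['func'].partition(':')\n"
    "func = getattr(importlib.import_module(module), name)\n"
    "print(json.dumps(func(*call['args'], **call['kwargs'])))\n"
)


def call_to_command(func: str, *args, **kwargs) -> list:
    """Encode a call to 'module:name' as an argv list for a subprocess."""
    payload = json.dumps({"func": func, "args": args, "kwargs": kwargs})
    return [sys.executable, "-c", RUNNER, payload]


command = call_to_command("math:hypot", 3, 4)
completed = subprocess.run(command, capture_output=True, text=True, check=True)
result = json.loads(completed.stdout)
```

The resulting `command` list is what would be handed to an executable Task in place of running it locally with `subprocess.run`.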

There does not appear to be a good way to simply port scalems.call to raptor. We don't have to disable scalems.call completely, but we cannot simply dispatch the same workflow script to be executed on raptor. The function_call_to_subprocess() sequence of calls just doesn't make sense in the raptor context.

eirrgang commented 1 year ago

update: We should be able to salvage this with TASK_EXECUTABLE mode. The raptor master should be able to manage such a task without a worker.

SCALE-MS / scale-ms

scalems.call on raptor backend #326
