Open AymenFJA opened 1 year ago
I don't think there is any assumption that Raptor functions are lightweight, is there?
A lot of tools have Python interfaces that we would like to access through Raptor, and many of these tools either inspect the working directory or have filesystem side effects (e.g., LAMMPS).
Making task sandboxes optional seems reasonable, but the option needs to be well-documented. I think the current behavior is consistent across rp.Task objects. Also, as I understand it, there is nothing preventing an rp.Task from using a pre-existing sandbox directory, right? Maybe a self-consistent resolution without code changes would be to suggest that Raptor tasks could be submitted with "sandbox": worker.task_sandbox or something similar to explicitly re-use the worker sandbox.
Also: please note that it is important that any default behavior for generating a task_sandbox should be easily discoverable at the site of task_manager.submit().
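The re-use suggestion above could look roughly like the following. This is only a sketch: plain dictionaries stand in for task descriptions, the keys other than "sandbox" are illustrative, and both the helper `with_shared_sandbox` and the worker sandbox path are hypothetical, not part of RADICAL-Pilot.

```python
import copy


def with_shared_sandbox(descriptions, sandbox):
    """Return copies of the descriptions that default to one existing sandbox.

    A description that already names a sandbox keeps it, so an explicit
    per-task choice always wins over the shared default.
    """
    shared = []
    for descr in descriptions:
        descr = copy.deepcopy(descr)          # do not mutate the caller's input
        descr.setdefault("sandbox", sandbox)  # only fill in when the key is absent
        shared.append(descr)
    return shared


# Hypothetical path; in practice this would be read from the worker.
worker_sandbox = "/scratch/session/pilot.0000/worker.0000"

tasks = with_shared_sandbox(
    [
        {"function": "f"},
        {"function": "g", "sandbox": "/scratch/custom"},
    ],
    worker_sandbox,
)
```

Because the sandbox is filled in before submission, the value is visible at the call site, which is the discoverability property asked for above.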
This was discussed on the devel call; several options are on the table:
1) Configure a raptor worker so that either all of its tasks get a sandbox or none do. Configuration would be done via the worker description.
2) Enable raptor task sandboxes only when a sandbox is explicitly set in the task description.
3) Add a flag to the task description to suppress sandbox creation. That flag would default to True for function tasks.
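To compare the three options side by side, here is a minimal sketch of the decision each one implies. All names (`SandboxPolicy`, `TaskSketch`, `no_sandbox`, `creates_sandbox`) are hypothetical stand-ins, not RADICAL-Pilot API. For option 3, the sketch assumes the flag defaults to False for non-function tasks, which the list above does not state.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class SandboxPolicy(Enum):
    WORKER_WIDE = 1    # option 1: the worker description decides for all tasks
    EXPLICIT_ONLY = 2  # option 2: only when the task names a sandbox
    SUPPRESS_FLAG = 3  # option 3: a per-task flag suppresses creation


@dataclass
class TaskSketch:
    is_function: bool = False
    sandbox: Optional[str] = None
    no_sandbox: Optional[bool] = None  # None means "use the default"


def creates_sandbox(task, policy, worker_sandboxes=True):
    """Return True if a sandbox directory would be created for this task."""
    if policy is SandboxPolicy.WORKER_WIDE:
        return worker_sandboxes
    if policy is SandboxPolicy.EXPLICIT_ONLY:
        return task.sandbox is not None
    # Option 3: the flag defaults to True for function tasks
    # (assumption: False for everything else).
    suppress = task.is_function if task.no_sandbox is None else task.no_sandbox
    return not suppress
```

Note that only option 3 makes the outcome depend on another field of the description (whether the task is a function), while option 2 depends only on the sandbox field itself.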
There was previously a significant initiative to make raptor tasks more like traditional tasks. Maybe that was more of a code clean-up effort than a design goal, though. Is there a recognition that raptor tasks have notable differences from traditional tasks that warrant different default values for TaskDescription fields? Will there be a distinct type to represent the different behavior?
> This was discussed on the devel call, several options are on the table
Maybe it came up on the call, but can you comment on whether it is feasible and reasonable to just re-use a raptor sandbox for the tasks (either manually or automatically)?
You can always specify an existing directory as the task sandbox. You can, for example, specify a raptor master or worker sandbox; in that sense you would force that sandbox to be reused for the tasks. That would then mimic the old behavior from before we introduced sandboxes for raptor tasks.
> There was previously a significant initiative to make raptor tasks more like traditional tasks. Maybe that was more of a code clean-up effort than a design goal, though.
It is a design goal, but we are not religious about it.
> You can always specify an existing directory as task sandbox. You can, for example, specify a raptor master or worker sandbox - in that sense you would enforce that sandbox to be reused for the tasks. That would then mimic the old behavior before we introduced sandboxes for raptor tasks.
Would this be an option to resolve this issue?
It resolves Aymen's use case all right, but the discussion now focuses on the default behavior.
My preference would be to have consistent default behavior for objects of the same formal type.
I think it would be okay to use rp.Task in both cases if and only if the task_sandbox value could be populated before the object is returned by submit. I am highly skeptical of TaskDescription fields that have different interpretations or different default behaviors depending on other fields.
Subclasses or rigorous schema documentation are reasonable mitigating strategies if the unified rp.TaskDescription is retained.
In any case, please warn me when behavior changes hit devel (either in direct messaging or through an issue at https://github.com/SCALE-MS/scale-ms/issues).
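The subclass idea mentioned above might be sketched like this, so that a differing default lives in the type rather than in an interaction between fields. The class names and the `create_sandbox` field are hypothetical and do not reflect the actual rp.TaskDescription schema.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class DescriptionSketch:
    """Stand-in for a traditional (executable) task description."""
    executable: Optional[str] = None
    sandbox: Optional[str] = None
    create_sandbox: bool = True


@dataclass
class FunctionDescriptionSketch(DescriptionSketch):
    """Stand-in for a function task description with its own default."""
    function: Optional[str] = None
    create_sandbox: bool = False  # differs from the base class, visibly


def defaults(cls):
    """Map each field name to its default, e.g. for schema documentation."""
    return {f.name: f.default for f in fields(cls)}
```

With distinct types, `defaults(FunctionDescriptionSketch)` can be rendered straight into documentation, and no field changes its meaning depending on the value of another field.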
Currently, RAPTOR functions generate an empty task sandbox on the agent side. As far as I understand, these are lightweight functions and they should not generate a sandbox at all. One way to fix this confusion is to use the sandbox option in TaskDescription: if the user specifies it, the sandbox is created; if not, nothing is generated. I imagine that 1M functions would generate 1M empty directories, which is an absolute I/O bottleneck and, more importantly, not required.
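A minimal sketch of the "create only if specified" behavior proposed above, using a temporary directory in place of the agent-side session directory. The helper names `prepare_sandbox` and `count_created_dirs` are hypothetical, and this is not how the agent is actually implemented.

```python
import os
import tempfile


def prepare_sandbox(base, sandbox=None):
    """Create a sandbox only when one was requested; return its path or None."""
    if sandbox is None:
        return None                      # nothing requested, nothing created
    path = os.path.join(base, sandbox)   # an absolute sandbox path is used as-is
    os.makedirs(path, exist_ok=True)     # re-using an existing directory is fine
    return path


def count_created_dirs(n_tasks, n_with_sandbox):
    """Count directories created when only some tasks request a sandbox."""
    with tempfile.TemporaryDirectory() as base:
        for i in range(n_tasks):
            requested = f"task.{i:06d}" if i < n_with_sandbox else None
            prepare_sandbox(base, requested)
        return len(os.listdir(base))


print(count_created_dirs(1000, 3))  # prints 3
```

Under this scheme the number of directories scales with the number of tasks that ask for one, not with the total number of function calls.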