fractal-analytics-platform / fractal-tasks-core

Main tasks for the Fractal analytics platform
https://fractal-analytics-platform.github.io/fractal-tasks-core/
BSD 3-Clause "New" or "Revised" License

New place for default meta arguments? #381

Closed jluethi closed 4 months ago

jluethi commented 1 year ago

Following up on yesterday's discussion: we can probably drop the separate default_args from tasks if the task schemas come with default arguments themselves, with those defaults defined in the actual Python function.

Where do meta arguments go? Things like:

"parallelization_level": "image",
"cpus_per_task": 1,
"mem": 4000
jluethi commented 1 year ago

Also, is there a schema for the possible meta arguments as well?

tcompa commented 1 year ago

Also, is there a schema for the possible meta arguments as well?

I wrote it in https://github.com/fractal-analytics-platform/fractal-web/issues/157#issuecomment-1576340807, but it should have been here:

I think we have never mentioned JSON schemas for the meta arguments yet. Of course they would be useful, and it should be doable (it's essentially the same as for the standard args), but we should first make an effort to define what the allowed structure is, and how it varies depending on the FRACTAL_BACKEND (e.g. is it a single schema that is general for all backends, or do we have a schema for each backend?). We should also understand where these schemas are defined (e.g. the SLURM configuration is fully handled in fractal-server, so that's the most likely place where we would need to create the schemas and keep them up to date; but then should we somehow validate the defaults of a FRACTAL_MANIFEST in the task repo?). I think this issue still needs to be explored further before any implementation.
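To make the "single schema for all backends" option concrete, here is a minimal sketch of what such a schema and a check against it could look like, using only the three meta keys mentioned in this issue. The names META_SCHEMA and validate_meta are hypothetical and not part of fractal-server or fractal-tasks-core; the key types are assumptions based on the example values above, and a real implementation would more likely use a proper JSON Schema library.

```python
# Hypothetical, backend-agnostic schema for meta arguments (JSON-Schema-like).
# The allowed keys and their types are assumptions taken from the example
# values in this issue, not an agreed-upon specification.
META_SCHEMA = {
    "type": "object",
    "properties": {
        "parallelization_level": {"type": "string"},
        "cpus_per_task": {"type": "integer"},
        "mem": {"type": "integer"},
    },
    "additionalProperties": False,
}

# Map the schema type names used above to Python types.
_PY_TYPES = {"string": str, "integer": int}


def validate_meta(meta: dict, schema: dict = META_SCHEMA) -> list:
    """Return a list of problems found in `meta` (empty if it is valid)."""
    errors = []
    properties = schema["properties"]
    for key, value in meta.items():
        if key not in properties:
            if not schema.get("additionalProperties", True):
                errors.append(f"unknown meta key: {key!r}")
            continue
        type_name = properties[key]["type"]
        expected = _PY_TYPES[type_name]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            errors.append(f"{key!r} should be of type {type_name}")
    return errors


manifest_meta = {"parallelization_level": "image", "cpus_per_task": 1, "mem": 4000}
problems = validate_meta(manifest_meta)
```

With a per-backend design, there would instead be one such schema per FRACTAL_BACKEND value, and the open question of where the schemas live (server vs. task repo) stays the same.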

tcompa commented 1 year ago

Where do meta arguments go?

In the current state, a Task object in fractal-server has two separate attributes, args and meta. The ones you mentioned (parallelization_level, mem, ...) go into meta. Their defaults are defined in the manifest (for instance), and can be overridden when a Task is included in a Workflow in the form of a WorkflowTask.
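As a rough illustration of the override behavior described here, the following sketch merges manifest defaults with per-WorkflowTask overrides. It uses plain dictionaries; resolve_meta is a hypothetical helper, not the actual fractal-server logic, and the override value is made up.

```python
def resolve_meta(task_meta, workflowtask_meta=None):
    """Combine manifest defaults with WorkflowTask overrides (overrides win)."""
    resolved = dict(task_meta)  # copy, so the manifest defaults stay untouched
    resolved.update(workflowtask_meta or {})
    return resolved


# Defaults as they would appear in the task manifest (values from this issue).
manifest_meta = {"parallelization_level": "image", "cpus_per_task": 1, "mem": 4000}

# Hypothetical override set when the task is added to a workflow.
resolved = resolve_meta(manifest_meta, {"mem": 8000})
```

Here `resolved` keeps parallelization_level and cpus_per_task from the manifest and takes mem from the override.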

Thus the current situation is that we would still have meta in the tasks' manifest, without changes. We can re-discuss it, especially in view of the question on meta schemas.

Let me know if this helps clarify things, or whether I misunderstood the question.

jluethi commented 1 year ago

Thus the current situation is that we would still have meta in the tasks' manifest, without changes. We can re-discuss it, especially in view of the question on meta schemas.

Fine by me as a start. Let's finish the args schemas first and then start thinking about the meta schemas.

One of the things we're getting with arg schemas is that we can always see the defaults of a task in the workflow JSON afterwards, which is great for transparency & reproducibility. We don't currently have that for meta arguments (=> it's not obvious to a user how many resources a task will request if it's overriding the server defaults [which currently all of our tasks do]). One way to get there would be via their own schemas.
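To sketch what that transparency could look like, the example below records, for each meta key, the value that would be used and where it came from, and serializes the result as JSON. The three-layer precedence (server defaults, then task manifest, then WorkflowTask), the helper name effective_meta, and the server default values are all assumptions for illustration, not the current behavior of fractal-server.

```python
import json


def effective_meta(server_defaults, task_meta, workflowtask_meta):
    """Resolve meta values and remember which layer each one came from."""
    # Later layers take precedence over earlier ones (assumed ordering).
    layers = [
        ("server", server_defaults),
        ("task", task_meta),
        ("workflowtask", workflowtask_meta),
    ]
    resolved = {}
    for source, layer in layers:
        for key, value in layer.items():
            resolved[key] = {"value": value, "source": source}
    return resolved


report = effective_meta(
    server_defaults={"cpus_per_task": 1, "mem": 1000},  # made-up server values
    task_meta={"parallelization_level": "image", "cpus_per_task": 1, "mem": 4000},
    workflowtask_meta={"mem": 8000},  # made-up user override
)

# Something like this could be stored next to the args in the workflow JSON.
report_json = json.dumps(report, indent=2)
```

A user reading this output would see directly that mem was set at the WorkflowTask level, while cpus_per_task comes from the task manifest rather than from the server defaults.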