The compute unit description in RP is really a dict, whose keys are defined here. When exchanging descriptions over the network, we dump the dict to JSON and send that JSON encoding around (sometimes compressed). For some communication channels, we collect multiple descriptions into a list, so as to minimize the number of messages exchanged.
Let me know if this is sufficient - if not, I can add a more formal definition of the json structure.
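For illustration, the serialization round trip could be sketched like this in plain Python (a minimal sketch of the JSON handling only, not the actual RP wire protocol or message format; the example descriptions are made up):

import json

# a compute unit description is essentially a dict (keys as defined by RP)
cud_1 = {'executable': '/bin/echo', 'arguments': ['hello'], 'cpu_processes': 1}
cud_2 = {'executable': '/bin/date', 'cpu_processes': 1}

# several descriptions are collected into one list to reduce the message count
msg = json.dumps([cud_1, cud_2])

# the receiving end decodes the message back into a list of dicts
for cud in json.loads(msg):
    print(cud['executable'])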
From that description I assume the JSON would look similar to:
{
    "cpu_processes": 4,
    "executable": "QCEngine",
    "name": "task-id-5b8707587b8787679d2fd9ce",
    ...
}
We can certainly provide the above. Can you give an example of the input_staging/output_staging fields that would match our use case (a JSON blob input distributed by a RADICAL client to a worker node, and the output then pulled back by the RADICAL client)?
We can either drop the JSON blob output to stdout or dump it to a file. Whichever is easier for you.
Right. Below is a dump of an example run:
{
'kernel' : '',
'name' : '',
'tag' : None,
'executable' : '/bin/echo',
'arguments' : ['-c', 'input.dat', '126'],
'pre_exec' : [],
'post_exec' : [],
'environment' : {},
'cpu_processes' : 1,
'cpu_process_type': 'POSIX',
'cpu_threads' : 1,
'cpu_thread_type' : 'POSIX',
'gpu_processes' : 0,
'gpu_process_type': '',
'gpu_threads' : 0,
'gpu_thread_type' : '',
'lfs_per_process' : 0,
'stdout' : '',
'stderr' : '',
'input_staging' : [{'source': 'pilot:///input.dat',
'target': 'unit:///input.dat',
'flags' : 64,
'action': 'Link'
}],
'output_staging' : [{'source': 'unit:///STDOUT',
'target': 'pilot:///STDOUT.000126',
'flags' : 64,
'action': 'Copy'
}],
'restartable' : False,
'cleanup' : False
}
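For reference, the same description can also be built through the RP Python API instead of as a raw dict; a minimal sketch based on the dump above (the action constants and the exact attribute set may differ between RP versions):

import radical.pilot as rp

cud = rp.ComputeUnitDescription()
cud.executable     = '/bin/echo'
cud.arguments      = ['-c', 'input.dat', '126']
cud.cpu_processes  = 1
cud.cpu_threads    = 1
cud.input_staging  = [{'source': 'pilot:///input.dat',     # link the input from the pilot sandbox
                       'target': 'unit:///input.dat',
                       'action': rp.LINK}]
cud.output_staging = [{'source': 'unit:///STDOUT',         # copy stdout back into the pilot sandbox
                       'target': 'pilot:///STDOUT.000126',
                       'action': rp.COPY}]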
The structure of the staging should be clear, I guess: for each file you specify a source, a target, and an action, which can be Copy, Transfer, or Link. The flags (we use symbolic defines in the code) define behavior such as overwrite or recursive.
Note that the URLs can use special schemas which refer to locations that are only determined at runtime. Those are documented here.
Gotcha, makes sense. So we would likely do something like:
'input_staging' : [{'source': 'pilot:///input-5b8707587b8787679d2fd9ce',
'target': 'unit:///input.dat',
'flags' : 64,
'action': 'Link'
}],
'output_staging' : [{'source': 'unit:///output.dat', # If we write to output.dat
'target': 'pilot:///output-5b8707587b8787679d2fd9ce',
'flags' : 64,
'action': 'Copy'
}],
and push each task specification to the pilot as input-<uid> and look for output-<uid> to parse.
The 'pilot' schema points to a sandbox on the remote resource which is specific to a pilot job. In your example above, the task stages 'input-5b*' from that pilot sandbox into its own task sandbox, and then stages output.dat from the task sandbox back into the pilot sandbox.
The input staging directive used above is probably okay since the pilot can be used to stage all data from client to this sandbox. Based on your description, I think your output staging would be different though, since you require output data back on the client machine. Something like:
'output_staging' : [{'source': 'unit:///output.dat', # If we write to output.dat
'target': 'client:///output-5b8707587b8787679d2fd9ce',
'flags' : 64,
'action': 'Transfer'
}],
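Putting the two directives together, the per-task description for this use case could be assembled roughly like so (a sketch only; the task_id naming, the QCEngine executable, and the rp.LINK / rp.TRANSFER constants are assumptions layered on top of the directives shown in this thread):

import radical.pilot as rp

def make_unit_description(task_id):
    cud = rp.ComputeUnitDescription()
    cud.executable     = 'QCEngine'
    cud.input_staging  = [{'source': 'pilot:///input-%s' % task_id,    # blob pre-staged into the pilot sandbox
                           'target': 'unit:///input.dat',
                           'action': rp.LINK}]
    cud.output_staging = [{'source': 'unit:///output.dat',
                           'target': 'client:///output-%s' % task_id,  # pulled back to the client machine
                           'action': rp.TRANSFER}]
    return cud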
RP does not allow metadata right now - EnTK encodes some information in the name field. We could add something like that - but we would be cautious about blowing up the task description too much, memory- and space-wise...
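As an interim workaround along the lines of what EnTK does, small bits of information could be packed into the name field, for example (sketch; the JSON encoding scheme here is ours, not an RP convention):

import json
import radical.pilot as rp

cud = rp.ComputeUnitDescription()
# encode a small record into the 'name' string; it has to be decoded again client-side
cud.name = json.dumps({'task_id': '5b8707587b8787679d2fd9ce'})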
This is now implemented in the RP branch feature/task_metadata:
import radical.pilot as rp

cud = rp.ComputeUnitDescription()
cud.metadata = {'a' : [1, 2, 3]}
The above metadata field would be perfect, thank you.
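For example, the per-task bookkeeping discussed earlier could then be carried in metadata rather than encoded into the name (sketch; the keys are placeholders of ours):

import radical.pilot as rp

cud = rp.ComputeUnitDescription()
cud.metadata = {'task_id': '5b8707587b8787679d2fd9ce',             # any json-serializable payload
                'output' : 'output-5b8707587b8787679d2fd9ce'}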
We are going to release the metadata feature with RP over the next couple of days.
v0.50.17 of RP has been released and contains that feature.
RADICAL to provide the specification of task description used in RP