Open dafrose opened 3 years ago
@dafrose - thanks for opening the issue! Yes, we should add this option.
I'm using a yml format for building a converter (from nipype to pydra tasks; an example for FSL is here), but pydra should be able to read the spec from yml.
Hi @djarecka, I have a few questions regarding your yml-spec:
I don't see an explicit input spec in your yaml files, only conditions that might reflect on optional inputs, but not on mandatory ones. How do you define inputs with this spec?
What is the distinction between filename and cmd?
Otherwise, I like the flexibility that your usage of filename templates offers.
base attribute to allow inheritance from some common structure (like previously in nipype for all FSL tools or all MRTRIX3 tools and so on...). In PyRates, we decided to use slash (/) notation for absolute or relative system paths and dot (.) notation for things that Python can find with its import architecture. Both could refer to either Python code or other yaml-specs. This could then look like this:
MySpec:
  base: MyBase  # referencing something in the same file
MySpec2:
  base: ../../path/to/file/MyBase2  # relative path
MySpec3:
  base: /drive/path/to/file/MyBase3  # absolute path
MySpec4:
  base: path.to.file.MyBase4  # python path
Regarding file endings, we decided to look for files that end with .py, .yaml, or .yml in the order of listing (or rather try to import first and then look for yaml files). Would you like to include the base attribute or rather keep it out to simplify the specification?
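The four notations above can be told apart mechanically. The following is a minimal sketch of that idea, not existing pydra or PyRates code: the helper names (`classify_base`, `split_base`, `candidate_files`) are hypothetical, and the rule "a slash means a file path, otherwise a dot means a Python path" is an assumption read off the examples.

```python
from __future__ import annotations

from pathlib import Path

# Search order for spec files, as listed in the comment above.
SUFFIXES = (".py", ".yaml", ".yml")


def classify_base(ref: str) -> str:
    """Classify a `base` reference by its notation (hypothetical helper)."""
    if "/" in ref:
        return "absolute path" if ref.startswith("/") else "relative path"
    if "." in ref:
        return "python path"
    return "same file"


def split_base(ref: str) -> tuple[str, str]:
    """Split a reference into (location, spec name)."""
    kind = classify_base(ref)
    if kind == "same file":
        return "", ref
    sep = "." if kind == "python path" else "/"
    location, _, name = ref.rpartition(sep)
    return location, name


def candidate_files(ref: str, spec_dir: Path) -> list[Path]:
    """List the files to try for a slash-style reference, in search order."""
    if "/" not in ref:
        raise ValueError(f"{ref!r} is not a slash-style reference")
    location, _ = split_base(ref)
    stem = Path(location)
    if not stem.is_absolute():
        # Relative references are taken relative to the referencing spec file.
        stem = spec_dir / stem
    return [stem.with_name(stem.name + suffix) for suffix in SUFFIXES]
```

With these helpers, `classify_base("path.to.file.MyBase4")` gives `"python path"`, and `candidate_files` only lists files to try; whether a loader should attempt a Python import before looking for yaml files is left open here, as it is in the discussion.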
My notes from the call today (regarding this issue):
New suggestions based on the spec above, e.g.:
&bet_input_spec  # need to check that this actually works, but could also just nest it down there
bases: pydra.engine.ShellSpec
name: Input
fields:
  - name: in_file
    type: pydra.File
    metadata:
      help_string: "input file ..."
      position: 1
      mandatory: True
  - name: out_file
    type: str
    metadata:
      help_string: "name of output ..."
      position: 2
      output_file_template: "{in_file}_br"  # quoted, since a bare { would start a YAML flow mapping
  - name: mask
    type: bool
    metadata:
      help_string: "create binary mask"
      argstr: "-m"
MyBet:
  bases: pydra.engine.ShellCommandTask
  executable: bet
  input_spec: *bet_input_spec
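To make the proposal a bit more concrete, here is a rough sketch of how the parsed `fields` list could be flattened into (name, type, metadata) tuples on the Python side. Everything here is an assumption for illustration: `fields_to_tuples` is a hypothetical helper, the tuple layout is only a guess at what the Python API would consume, and the dictionary is written by hand so that the sketch does not depend on a YAML library.

```python
from __future__ import annotations

from typing import Any

# What a YAML loader might return for the input spec above (hand-written here).
bet_input_spec = {
    "bases": "pydra.engine.ShellSpec",
    "name": "Input",
    "fields": [
        {"name": "in_file", "type": "pydra.File",
         "metadata": {"help_string": "input file ...",
                      "position": 1, "mandatory": True}},
        {"name": "out_file", "type": "str",
         "metadata": {"help_string": "name of output ...",
                      "position": 2,
                      "output_file_template": "{in_file}_br"}},
        {"name": "mask", "type": "bool",
         "metadata": {"help_string": "create binary mask", "argstr": "-m"}},
    ],
}


def fields_to_tuples(
    spec: dict[str, Any],
) -> list[tuple[str, str, dict[str, Any]]]:
    """Flatten `fields` into (name, type, metadata) tuples.

    Hypothetical helper: types stay plain strings here and would still
    have to be resolved to Python objects before use.
    """
    tuples = []
    for field in spec["fields"]:
        missing = {"name", "type"} - field.keys()
        if missing:
            raise ValueError(f"field {field!r} lacks keys: {sorted(missing)}")
        metadata = dict(field.get("metadata", {}))  # copy; metadata is optional
        tuples.append((field["name"], field["type"], metadata))
    return tuples
```

Keeping the types as strings at this stage separates two concerns: checking the shape of the spec, and deciding which Python objects a spec is allowed to reference.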
!!python/... notation. !!python/object/apply opens up to arbitrary code execution and is therefore considered "unsafe". If pydra is expected to be run on secure environments with trusted code, then this could become problematic, especially for unaware users.
Do you have comments on my notes or my questions, @djarecka @effigies @PeerHerholz @satra @oesteban?
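One possible way around the !!python/... tags is to keep type references as plain strings in the spec and resolve them in Python against an allowlist. The sketch below only illustrates that idea: `resolve_name`, `ALLOWED_PREFIXES`, and `ALLOWED_BUILTINS` are hypothetical names, and the chosen allowlist entries are assumptions.

```python
from __future__ import annotations

import builtins
import importlib

# Hypothetical allowlists: only these top-level packages and builtins
# may be referenced from a spec.
ALLOWED_PREFIXES = ("pydra", "pathlib")
ALLOWED_BUILTINS = {"str", "int", "float", "bool", "list", "dict"}


def resolve_name(ref: str) -> object:
    """Resolve a plain string such as "bool" or "pathlib.Path" to an object."""
    if "." not in ref:
        if ref not in ALLOWED_BUILTINS:
            raise ValueError(f"builtin {ref!r} is not allowlisted")
        return getattr(builtins, ref)
    module_name, _, attr = ref.rpartition(".")
    if module_name.split(".")[0] not in ALLOWED_PREFIXES:
        raise ValueError(f"module {module_name!r} is not allowlisted")
    # Importing still runs the module's top-level code, so the allowlist
    # should only name packages that are trusted anyway.
    return getattr(importlib.import_module(module_name), attr)
```

With this approach the spec file stays loadable by a safe YAML loader (for example PyYAML's `yaml.safe_load`, which rejects the python-specific tags), and a reference such as `os.system` is refused before anything is imported.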
Hi folks,
I am suggesting this because I kind of thought that it was already there, but could not find it documented anywhere.
What would you like changed/added and why?
In pydra, shell task specifications are mostly text-based, with dictionaries and lists sprinkled all over the place. This could very easily be serialized into something like yaml or json (I prefer the first). In fact, when looking at something like this, it looks very similar to json syntax. So why not abstract away most of the boilerplate code and allow users to write task specifications in yaml instead?
Example (FSL bet)
Adapting the example for FSL bet from the docs, this could look somewhat like this:
or even nested like this
Of course there are many ways to do this and some discussion would be needed to iron out a proper specification.
What would be the benefit? Does the change make something easier to use?
That's the whole point: ease of use. You could simply load these specifications with pydra and run them, possibly without a single line of (python) code. Of course, there might be more advanced usage where you might want to either directly include python code or reference it from the spec, but that can be done as well. This should only be an extension to the actual python API; it should not replace it and should be kept as close to it as possible.
Other projects with similar approaches
When I first saw pydra I immediately thought you would do this, but then I found no mention of it. It would be simple enough to write my own parser for this, but a proper specification would be better. So what do you think about this idea?
PS: YAML has support for referencing actual types from program code, but I find this concept too complex for simple use cases and especially for new users.