nipype / pydra

Pydra Dataflow Engine
https://nipype.github.io/pydra/
Other
120 stars 59 forks source link

Formatter argument can bypass xor and mandatory in input spec #611

Open ghisvail opened 1 year ago

ghisvail commented 1 year ago

Usage of formatter together with xor and mandatory metadata may yield inconsistent command-line output.

Given the following spec:

import pydra

input_spec = pydra.specs.SpecInfo(
    name="Input",
    fields=[
        (
            "foo",
            bool,
            {
                "help_string": "boolean option",
                "mandatory": True,
                "formatter": lambda foo: "--foo" if foo else "--no-foo"
                "xor": {"bar"},
            },
        ),
        (
            "bar",
            str,
            {
                "help_string": "pathlike option",
                "mandatory": True,
                "argstr": "--bar {bar}",
                "xor": {"foo"},
            },
        ),
    ],
    bases=(pydra.specs.ShellSpec,),
)

Attached to interface MyInterface, one should only be able to construct the following valid instances:

task = MyInterface(foo=True)          # (1)
task = MyInterface(foo=False)         # (2)
task = MyInterface(bar="/some/path")  # (3)

Cases (1) and (2) yield appropriate commandline values.

However, case (3) yields --no-foo --bar /some/path instead of just --bar /some/path hereby bypassing the xor constraint.

ghisvail commented 1 year ago

So far, the only solution I found to this was to introduce a third option for the False case:

input_spec = pydra.specs.SpecInfo(
    name="Input",
    fields=[
        (
            "foo",
            bool,
            {
                "help_string": "boolean option",
                "mandatory": True,
                "argstr": "--foo"
                "xor": {"bar", "no_foo"},
            },
        ),
        (
            "bar",
            str,
            {
                "help_string": "pathlike option",
                "mandatory": True,
                "argstr": "--bar {bar}",
                "xor": {"foo", "no_foo"},
            },
        ),
            "no_foo",
            bool,
            {
                "help_string": "other boolean option",
                "mandatory": True,
                "argstr": "--no-foo"
                "xor": {"bar", "foo"},
            },
        ),
    ],
    bases=(pydra.specs.ShellSpec,),
)

But it feels like a hack to me.

satra commented 1 year ago

the check for xors is happening before formatter is applied (here: https://github.com/nipype/pydra/blob/c2cc452bc60023eebf4ba0b28e99fc7d95e644c4/pydra/engine/task.py#L501) and doesn't take into consideration the effect of the formatter. this will require some refactoring of code.

for your current issue you could use: "formatter": lambda field: "" if field is None else "--foo" if field else "--no-foo", which uses the special named field (see this section: https://github.com/nipype/pydra/blob/c2cc452bc60023eebf4ba0b28e99fc7d95e644c4/pydra/engine/task.py#L433)

also this lambda takes into account when no value is provided for foo. i don't know why this is None and not attrs.NOTHING.

ghisvail commented 1 year ago

Thanks for looking into it @satra

I wonder whether there is any value in calling the formatter when the value for a field is unset? Let me explain.

As of today, if I provide a pattern for argstr to a field spec and do not set a value to the field, then the argstr pattern is never computed. Imo, it's not super intuitive that formatter could be called with a valueless field when argstr is not.

satra commented 1 year ago

when formatter was designed it was meant to be quite general. formatter can take values of other fields and create a value for a field that is not set as input. it can be a generative process based on other inputs. effectively it was intended to be a generalization of format_arg and parse_arg from nipype1 and the name_template.

we can consider restricting this for now if it makes life simpler.

this complexity does bring about the notion that we are currently expecting such formatters to not change once implemented in an interface. hash checking does not currently take the formatter into consideration, and should if the formatter is allowed to generate values. perhaps another reason to restrict the formatter for now.

however, we should indeed check if this is being used anywhere in such a generative mode, and if not restrict it to fields with values.

ghisvail commented 1 year ago

i don't know why this is None and not attrs.NOTHING.

You are right, it works with attrs.NOTHING:

{
    "formatter": (
        lambda field: "" 
        if field == attrs.NOTHING
        else "--foo"
        if field
        else "--no-foo"
    )
}

See below.

ghisvail commented 1 year ago

Hang on, that's interesting:

This does not work:

{
    "formatter": (
        lambda field: "" 
        if field == attrs.NOTHING
        else "--foo"
        if field
        else "--no-foo"
    )
}

But this does:

{
    "formatter": (
        lambda foo: "" 
        if foo == attrs.NOTHING
        else "--foo"
        if foo
        else "--no-foo"
    )
}

In addition to this:

{
    "formatter": (
        lambda field: "" 
        if field is None
        else "--foo"
        if field
        else "--no-foo"
    )
}
satra commented 1 year ago

it's possible field is translated while explicit field name retains the attrs coding.