rwth-i6 / sisyphus

A Workflow Manager in Python
Mozilla Public License 2.0

`IndexError` in `DelayedFormat` when providing kwargs #193

Closed Icemole closed 5 months ago

Icemole commented 5 months ago

Hi all, I was trying to use DelayedFormat with a set of kwargs and I got the following error:

IndexError: Replacement index 0 out of range for positional args tuple

Looking at the source code, the cause seems clear:

class DelayedFormat(DelayedFunctionBase):
    def get(self):
        return try_get(self.string).format(*(try_get(i) for i in self.args), **self.kwargs)

Source.

The reason it's failing is that the values inside self.kwargs can't be unpacked in the way str.format() would like to use them. Indeed, I get the same error if I try something as basic as:

>>> "{}".format(**{"a": "a"})
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: Replacement index 0 out of range for positional args tuple

Was this perhaps a copy-paste issue? There's no way any keyword arguments will be respected in the formatting function. Moreover, if I remember correctly the dictionary composed of the kwargs isn't guaranteed to have the right order before python 3.something (3.7?), so kwargs.values() wouldn't work either.

If you think I'm using the code wrong, please let me know.

albertz commented 5 months ago

I don't exactly understand why you think there is any issue in DelayedFormat. This exception is exactly what you would expect here. All is correct. If you want to use kwargs, Python's str.format works like this:

"{a}".format(a="a")

The example with CodeWrapper I gave before (here) was slightly wrong, it should be like this:

CodeWrapper(DelayedFormat('lambda files: get_sub_epoch_dataset(files, **({}))', whatever_extra_kwargs_you_want))

Icemole commented 5 months ago

I see, I didn't know that the formatted string should have the kwarg keyword between braces, thanks for clarifying.
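
To illustrate the rule discussed above, here is a minimal sketch using plain str.format only (no Sisyphus classes involved). The variable names are chosen for illustration and do not come from the thread.

```python
# "{}" is a positional field: it consumes the first positional argument.
positional = "{}".format("a")

# "{a}" is a named field: it is looked up among the keyword arguments.
named = "{a}".format(a="a")

# Mixing the two reproduces the error from this issue: the template asks for
# positional argument 0, but only keyword arguments are supplied.
try:
    "{}".format(a="a")
    error = None
except IndexError as exc:
    error = str(exc)

print(positional, named, error)
```

The exact wording of the IndexError message may differ between Python versions, so it is better not to rely on it.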

I just tried what I show below:

returnn.config.CodeWrapper(
    DelayedFormat(
        "lambda files: get_sub_epoch_dataset(files, **({})", dataset=dataset_copy, num_workers=num_workers, local_caching=local_caching,
    )
),

But I still get the error IndexError: Replacement index 0 out of range for positional args tuple. Perhaps I'm still using it wrong?

albertz commented 5 months ago

Yes, it's wrong in exactly the same way. Here you are using a positional field ({} is positional, not a keyword field; it refers to the first positional arg), but you only provide keyword args. Of course that's wrong. For example, you could do it like this:

returnn.config.CodeWrapper(
    DelayedFormat(
        "lambda files: get_sub_epoch_dataset(files, **({}))",
        dict(dataset=dataset_copy, num_workers=num_workers, local_caching=local_caching),
    )
),

michelwi commented 5 months ago

Quoting Icemole: "If you think I'm using the code wrong, please let me know."

In addition to albertz's suggestion above (which provides a positional arg to fill a positional formatting slot), I would offer:

returnn.config.CodeWrapper(
    DelayedFormat(
        "lambda files: get_sub_epoch_dataset(files, dataset={dataset}, num_workers={num_workers}, local_caching={local_caching})",
        dataset=dataset_copy,
        num_workers=num_workers,
        local_caching=local_caching,
    )
),

which uses kwargs for both the formatting string and the arguments to str.format.
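
As a rough sketch of what the two suggested variants render to, the snippet below applies plain str.format to the same template strings. It assumes that DelayedFormat.get ends up calling str.format as in the source excerpt quoted earlier, and it uses hypothetical stand-in values (a string, an int, a bool) in place of the real dataset_copy, num_workers and local_caching.

```python
# Hypothetical stand-in values; in the issue these are real dataset settings.
opts = dict(dataset="my_dataset", num_workers=4, local_caching=True)

# Variant 1: a single positional field filled with the whole dict.
by_position = "lambda files: get_sub_epoch_dataset(files, **({}))".format(opts)

# Variant 2: named fields filled from keyword arguments.
by_name = (
    "lambda files: get_sub_epoch_dataset(files, dataset={dataset}, "
    "num_workers={num_workers}, local_caching={local_caching})"
).format(**opts)

print(by_position)
print(by_name)
```

Note that a named field inserts str() of the value, so the string stand-in appears as dataset=my_dataset without quotes in the second variant. If the generated code needs a quoted literal, the standard !r conversion (for example {dataset!r}) can be used in the template.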

Quoting Icemole: "There's no way any keyword arguments will be respected in the formatting function. Moreover, if I remember correctly the dictionary composed of the kwargs isn't guaranteed to have the right order before python 3.something (3.7?), so kwargs.values() wouldn't work either."

In this case it is your own responsibility to make sure that the field names used in the formatting string match the kwargs passed to the function. If they match, the order does not matter. (Dicts preserve insertion order in CPython 3.6+, and the language guarantees this from Python 3.7 on.)
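
The name-matching behaviour described above can be checked with plain str.format. This is a small sketch with a made-up template and made-up field names, not code from Sisyphus.

```python
template = "{first}-{second}"

# Keyword arguments are matched by name, so their order is irrelevant.
same_order = template.format(first=1, second=2)
swapped = template.format(second=2, first=1)

# Keyword arguments that the template never mentions are silently ignored.
with_extra = template.format(first=1, second=2, unused=3)

# A field name with no matching keyword argument raises KeyError.
try:
    template.format(first=1)
    missing = None
except KeyError as exc:
    missing = exc.args[0]

print(same_order, swapped, with_extra, missing)
```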

Icemole commented 5 months ago

Thanks for the feedback everyone. I'll close this issue since the error was mine.

Edit: just to clarify, everything is working as intended, I just had to set the string to what @michelwi said above.