cgat-developers / ruffus

CGAT-ruffus is a lightweight python module for running computational pipelines
MIT License
173 stars 34 forks source link

substitution for extras should ignore unknowns #45

Closed ghost closed 9 years ago

ghost commented 9 years ago

When passing a string, e.g. a shell command to be executed, as extra argument when using filter=formatter(), the string substitution replaces unknown keywords with {} instead of leaving them untouched.

Example: test = 'echo {dynamic_message} > {some_file}' with any task like transform(... filter=formatter()..., extras=[test]) results in echo {} > {} after substitution.

Suggestion: Ruffus should ignore unknown keywords during string substitution.

Peter

bunbun commented 9 years ago

Surprised this does not just blow up! I think there is some over cautious Exception handling here.

Is it not just easier to use double {{}} to prevent substitution? As in regular python:

>>> "{{test}}".format()
'{test}'

I can ignore unknown keywords but that probably would require catching the keyerror exception in a loop and progressively either dynamically adding the double braces or faking the missing keywords

>>> "{{test}}".format(test="{test}")
'{test}'

Either way this seems a lot of overhead to add to a process which is just going to hide missing replacements. I would prefer to make sure that an Exception is thrown.

Let me know what you think.

ghost commented 9 years ago

You are right, I hadn't thought about the {{}} option. This might indeed be easier, I'll check it tomorrow in my code and get back to you. I am still a bit fuzzy about all the string replacements in Ruffus, so I am probably off the right track here, but I was just thinking in the direction of using the __missing__ hook in an object sub-classed from dictionary and then return "{" + key + "}" in these cases. But again, no clue if this is at all applicable to Ruffus.

bunbun commented 9 years ago

You mean the equivalent of:

>>> class dict_ignoring_missing(dict):
...      def __missing__(self, key):
...          return '{' + key + '}'
...
>>> import string
>>> string.Formatter().vformat("{found} {missing}", args = [], kwargs=dict_ignoring_missing(found=True))
'True {missing}'

Does anyone still use vformat()? :-)

I have a feeling that this would be even more surprising to python users!

At the moment I am inclined to make sure that Ruffus throws a hissy fit (Exception) when there are missing keys, though hopefully with a clear Error message. That is always the challenge.

ghost commented 9 years ago

Well, it probably depends on prior expectations: Ruffus makes some assumptions (input files turned into output files) and for these assumptions to hold, there should be an error when the string replacement fails, e.g., due to missing keys, for inputs/outputs. But what are the assumptions concerning the extras=? In my understanding, there aren't any - Ruffus replaces in a best effort principle what can be replaced, otherwise it just passes the arguments to the function. Consequently, it should not change the user's arguments. However, right now, I think it all comes down to a "design decision" that needs to be properly documented and that's about it (so, maybe, this is not a bug, but a suggested enhancement for the documentation?). As you suggested, the built-in escaping mechanism {{}} to avoid changes by Ruffus' string replacement actions is likely the way to go, because... well, it's a built-in feature ;-)

bunbun commented 9 years ago

Yes. These sorts of calls are the trickiest part of Ruffus. I certainly regret some of the (now obviously excessive) syntactic flexibility I allowed earlier in the design but that will always be retained for backwards compatibility. Leo

On 24 March 2015 at 08:17, Peter Ebert notifications@github.com wrote:

Well, it probably depends on prior expectations: Ruffus makes some assumptions (input files turned into output files) and for these assumptions to hold, there should be an error when the string replacement fails, e.g., due to missing keys, for inputs/outputs. But what are the assumptions concerning the extras=? In my understanding, there aren't any - Ruffus replaces in a best effort principle what can be replaced, otherwise it just passes the arguments to the function. Consequently, it should not change the user's arguments. However, right now, I think it all comes down to a "design decision" that needs to be properly documented and that's about it (so, maybe, this is not a bug, but a suggested enhancement for the documentation?). As you suggested, the built-in escaping mechanism {{}} to avoid changes by Ruffus' string replacement actions is likely the way to go, because... well, it's a built-in feature ;-)

— Reply to this email directly or view it on GitHub https://github.com/bunbun/ruffus/issues/45#issuecomment-85392264.

bunbun commented 9 years ago

A failed substitution now fails the entire job. With verbose>=4, will result in an error message like:

Job Warning: Input substitution failed:
          Missing key = {dynamic_message} in 'echo {dynamic_message} > {some_file}'.
                input =  '',
               filter = formatter(['test_transform_formatter/a.tmp'])..