lebrice / SimpleParsing

Simple, Elegant, Typed Argument Parsing with argparse
MIT License
410 stars 52 forks source link

Parse list object from both "--a_list 1 2 3" and from "--a_list '1 2 3' " #309

Closed ludwigwinkler closed 6 months ago

ludwigwinkler commented 6 months ago

First of all, thanks for the wonderful package. I really enjoy it.

Is your feature request related to a problem? Please describe. Directly in my terminal I can easily populate a list field with variable lengths

@dataclass
class Config:
    a_list: List = list_field(1, 2, 2, 2)

Unfortunately, I've tied my soul to WandB due to their amazing sweep capability. The problem is, when running a sweep, they call my script with string arguments, such that a valid list/sequence gets passed on as

--a_list '1 2 3'

which is a string instead of

--a_list 1 2 3

and currently there is no way of fixing this on the wandb side.

Describe the solution you'd like Is there a way to include a sort of Action like from argparse which could do a quick conversion when encountering a string object instead of a proper list? An example of what I'm trying to achieve is encapsulated here: https://github.com/wandb/wandb/issues/2939#issuecomment-976193305 where an action is triggered. I'd intend to quickly check if we obtained a proper list and if it's a string, we parse it manually into a list.

Looking at the codebase I found this line: https://github.com/lebrice/SimpleParsing/blob/3194331ab68ad1f93334481a545f179234b93946/simple_parsing/helpers/fields.py#L58 which hinted at an argparse action being able to be passed to the field function?

grahamannett commented 6 months ago

I haven't used list_field nor wandb sweeps too much so havent tested out but why cant passing in action= on the list_field work? Seems like something similar to

class ParseAction(argparse.Action):
    def __call__(self, parser, namespace, values, option_string=None):
        values = list(map(int, values[0].split()))
        setattr(namespace, self.dest, values)

@dataclass
class Config(Serializeable):
    vals: list = list_field(-1, -2, -3, action=ParseAction)

would work but might have to handle values differently depending on what you run vs what wandb passes in? otherwise might be possible to use a uniontype and then __post_init__ like you said but I could see that not being super ideal

ludwigwinkler commented 6 months ago

I did that and it worked like a charm. I updated the ParseAction.__call__() method to

class ParseAction(argparse.Action):
    def __call__(self, parser, namespace, values, option_string=None):
        if type(values) == list and all([type(x) == int for x in values]):
            # [1, 2, 3] -> [1, 2, 3]
            pass
        elif type(values) == list and type(values[0]) == str and len(values) == 1 \
                and '[' not in values[0] and ']' not in values[0] \
                and ',' not in values[0]:
            # string of space separated numbers: '1 2 3' -> [1, 2, 3]
            values = list(map(int, values[0].split()))
        elif type(values) == list and type(values[0]) == str and len(values) == 1 \
                and '[' in values[0] and ']' in values[0]:
            # string of list of numbers: '[1, 2, 3]' -> [1, 2, 3]
            # int() takes care of excess spaces in string int('   -1 ')-> -1
            values = list(map(int, values[0][1:-1].split(',')))
        elif type(values) == list and type(values[0]) == str and len(values) >= 1 \
                and '[' not in values[0] and ']' not in values[0] \
                and ',' not in values[0]:
            # string of list of numbers: ['1', '2', '3'] -> [1, 2, 3]
            values = list(map(int, values))
        else:
            raise ValueError(f"Could not parse {values} to list of int")
        setattr(namespace, self.dest, values)

I wasn't aware that argparser had this useful functionality to parse arguments into whatever you want if required. It gives me granular control over how to clean the varying argument inputs.

Thanks a lot. :)

grahamannett commented 6 months ago

Glad that works. Not sure if you care but I think SimpleParsing could probably figure out some way to ingest it more generic/abstractly (meaning take an input, if its string try and match types on it, otherwise fallback to what was ingested).

Might be a lot of effort though for something that seems specific to wandb ingestion.

ludwigwinkler commented 6 months ago

There is the ast.literal_eval() function which can evaluate string representations of containers likes lists.

From my initial testing, it's pretty good at evaluating lists (as it's supposed to), but it wouldn't work with a "1 2 3" string as it's not a valid string representation of a list. Then again, you can define an action with a try: ast.litereal_eval(values) ; except process_as_sequence_ofintegers() block.