python / cpython

improve argparse.Namespace __repr__ for invalid identifiers. #68548

Closed eadbd4d3-cbb4-42b8-8420-9f80dcde2bcc closed 9 years ago

eadbd4d3-cbb4-42b8-8420-9f80dcde2bcc commented 9 years ago
BPO 24360
Nosy @bitdancer, @berkerpeksag, @serhiy-storchaka, @Carreau
Files
  • improve-namespace-repr.patch: improve argparse.namespace repr.
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:
    ```python
    assignee = 'https://github.com/berkerpeksag'
    closed_at =
    created_at =
    labels = ['type-feature', 'library']
    title = 'improve argparse.Namespace __repr__ for invalid identifiers.'
    updated_at =
    user = 'https://github.com/Carreau'
    ```
    bugs.python.org fields:
    ```python
    activity =
    actor = 'mbussonn'
    assignee = 'berker.peksag'
    closed = True
    closed_date =
    closer = 'berker.peksag'
    components = ['Library (Lib)']
    creation =
    creator = 'mbussonn'
    dependencies = []
    files = ['39593']
    hgrepos = []
    issue_num = 24360
    keywords = ['patch']
    message_count = 12.0
    messages = ['244655', '244659', '244778', '244779', '244788', '244792', '244795', '244796', '247580', '247624', '247626', '247633']
    nosy_count = 7.0
    nosy_names = ['bethard', 'r.david.murray', 'python-dev', 'berker.peksag', 'paul.j3', 'serhiy.storchaka', 'mbussonn']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue24360'
    versions = ['Python 3.6']
    ```

    eadbd4d3-cbb4-42b8-8420-9f80dcde2bcc commented 9 years ago

    The argparse Namespace repr can be misleading when the arg names are not valid identifiers, e.g. things like a closing bracket:

    In [5]: Namespace(a=1, **{')':3})
    Out[5]: Namespace()=3, a=1)

    Even funnier:

    In [3]: Namespace(a=1, **{s:3})
    Out[3]: Namespace(a=1, b=2), Namespace(c=3)

    for s = 'b=2), Namespace(c'

    With this patch, the args that are not valid identifiers are shown in a **-unpacked dict, which has the side effect of almost always giving repr(eval(repr(obj))) == repr(obj). I'm sure we can find counterexamples with quotes and triple-quoted strings... but anyway.

    With this patch (indentation mine for easy comparison):
    >>> from argparse import Namespace
    >>> Namespace(a=1, **{')': 3})
        Namespace(a=1, **{')': 3})

    Which I think is what most users would expect.

    Tests pass locally (except SSL cert, network thingies, curses and threaded_lru_cache); those failures look unrelated and are most likely due to my machine.
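
For readers without the attached file, here is a minimal sketch of the idea described above. It is not the patch itself: `sketch_repr` is a hypothetical standalone helper that reads the attributes through `vars()`, and it only assumes that valid identifiers are printed as keywords while everything else goes into a `**`-unpacked dict.

```python
from argparse import Namespace


def sketch_repr(ns):
    """Hypothetical helper (not the attached patch): build a repr in which
    names that are not valid identifiers go into a **-unpacked dict."""
    type_name = type(ns).__name__
    arg_strings = []
    star_args = {}
    for name, value in vars(ns).items():
        if name.isidentifier():
            arg_strings.append('%s=%r' % (name, value))
        else:
            star_args[name] = value
    if star_args:
        arg_strings.append('**%r' % (star_args,))
    return '%s(%s)' % (type_name, ', '.join(arg_strings))


ns = Namespace(a=1, **{')': 3})
text = sketch_repr(ns)
print(text)  # Namespace(a=1, **{')': 3})

# The text is a valid constructor call, so it evaluates back to an equal object.
roundtrip = eval(text)
print(roundtrip == ns)  # True
```

One caveat with this sketch: `str.isidentifier()` also returns True for keywords such as `class`, so a name like that would still be printed as a keyword argument and would not round-trip.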

    serhiy-storchaka commented 9 years ago

    LGTM.

    7a064fe6-c535-4d80-a11f-a04ed39056c5 commented 9 years ago

    Off hand I don't see a problem with this patch (but I haven't tested it yet).

    But I have a couple of cautions:

    The docs say, regarding the Namespace class:

    This class is deliberately simple, just an object subclass with a readable string representation.

    This patch improves the 'readable' part, but adds some complexity.

    The docs also note that the user can provide their own namespace object, and by implication, a custom Namespace class with this improved '__repr__'.

    The Namespace '__repr__' is mainly of value during code development, especially when trying ideas in an interactive shell. It's unlikely that you would want to show the whole namespace to your end user. So even if your final API requires funny characters, you don't need to use them during development.

    eadbd4d3-cbb4-42b8-8420-9f80dcde2bcc commented 9 years ago

    Sure, and anyway if you have a huge namespace, things will become unreadable. But during development/teaching, having objects that have a "sane" representation is useful; otherwise your brain (well, at least mine) chokes on the output and it breaks the flow of my thoughts.

    One could also just use __repr__(self) = repr(self.__dict__); that would be even simpler and more readable :-)
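
A minimal sketch of that simpler alternative, using a hypothetical subclass name (`DictReprNamespace`) purely for illustration:

```python
from argparse import Namespace


class DictReprNamespace(Namespace):
    """Hypothetical subclass: the repr is just the repr of the instance dict."""

    def __repr__(self):
        return repr(self.__dict__)


ns = DictReprNamespace(a=1, **{')': 3})
print(repr(ns))  # {'a': 1, ')': 3}
```

The output is unambiguous, but it no longer names the class or looks like a constructor call.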

    7a064fe6-c535-4d80-a11f-a04ed39056c5 commented 9 years ago

    An alternative would be to wrap a non-identifier name in 'repr()':

        def repr1(self):
            def fmt_name(name):
                if name.isidentifier():
                    return name
                else:
                    return repr(name)
            type_name = type(self).__name__
            arg_strings = []
            for arg in self._get_args():
                arg_strings.append(repr(arg))
            for name, value in self._get_kwargs():
                arg_strings.append('%s=%r' % (fmt_name(name), value))
            return '%s(%s)' % (type_name, ', '.join(arg_strings))

    This would produce lines like:

        Namespace(baz='one', 'foo bar'='test', 'x __y'='other')

        Namespace(a=1, b=2, 'double " quote'='"', "single ' quote "="'")

        Namespace(')'=3, a=1)

        Namespace(a=1, 'b=2), Namespace(c'=3)

    With names that are deliberately messy, it is hard to say which is clearer.

    eadbd4d3-cbb4-42b8-8420-9f80dcde2bcc commented 9 years ago

    Namespace(a=1, 'b=2), Namespace(c'=3)

    :-) I read that as prime-b=2 and c-prime=3.

    I just feel like having a repr which is closer to the constructor signature is better, but I guess it's a question of taste. Anyway, both would be fine.
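
To make the "closer to the constructor signature" point concrete, here is a small check of whether each style of output is a syntactically valid Python expression. `is_valid_expression` is a hypothetical helper written for this illustration; it only parses the text and never evaluates it.

```python
import ast


def is_valid_expression(text):
    """Return True if text parses as a single Python expression."""
    try:
        ast.parse(text, mode='eval')
    except SyntaxError:
        return False
    return True


# Output style with quoted names, as in the repr1 alternative.
quoted_names = "Namespace(a=1, 'b=2), Namespace(c'=3)"
# Output style with a **-unpacked dict, as in the patch.
unpacked_dict = "Namespace(a=1, **{'b=2), Namespace(c': 3})"

print(is_valid_expression(quoted_names))   # False
print(is_valid_expression(unpacked_dict))  # True
```

The quoted-name form reads fine to a human but cannot be pasted back into an interpreter, because a keyword argument name has to be an identifier.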

    7a064fe6-c535-4d80-a11f-a04ed39056c5 commented 9 years ago

    I mentioned in the related bug/issue that no one has to use odd characters and spaces in the Namespace. While they are allowed by 'getattr' etc, the programmer has the option of supplying rational names in the 'dest' parameter.
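
As a small illustration of the 'dest' point, consider the sketch below. The option names `--2nd-value` and `--3rd-value` are made up for this example; the first relies on the name argparse derives from the option string, the second supplies an explicit 'dest'.

```python
import argparse

parser = argparse.ArgumentParser()

# No dest given: argparse derives the attribute name from the option string,
# so '--2nd-value' is stored as '2nd_value', which is not a valid identifier.
parser.add_argument('--2nd-value')

# Explicit dest: the value is stored under a name usable as a normal attribute.
parser.add_argument('--3rd-value', dest='third_value')

ns = parser.parse_args(['--2nd-value', 'x', '--3rd-value', 'y'])

print(getattr(ns, '2nd_value'))  # x  (only reachable through getattr)
print(ns.third_value)            # y
```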

    There's also the question of what kinds of strings you can supply via 'sys.argv'. For example, I have to use quotes to enter

        $ python echoargv.py '--b=2), Namespace(c' test

    Without them 'bash' gives me an error.

    Strings like this may be nice for exercising a patch, but they may not be realistic in a full argparse context.

    eadbd4d3-cbb4-42b8-8420-9f80dcde2bcc commented 9 years ago

    As far as I remember, argparse is not only used to parse things from sys.argv; with other inputs the quoting is not necessary. And Namespace is not only used in argparse.

    But if you don't want the improvement, feel free to close.

    bitdancer commented 9 years ago

    If one is going to have a repr at all, I think it should be as accurate as practical. I think this is worthwhile, and favor the existing patch.

    1762cc99-3127-4a62-9baf-30c3d0f51ef7 commented 9 years ago

    New changeset dcc00d9ba8db by Berker Peksag in branch 'default': Issue bpo-24360: Improve __repr__ of argparse.Namespace() for invalid identifiers. https://hg.python.org/cpython/rev/dcc00d9ba8db

    berkerpeksag commented 9 years ago

    Thanks for the patch, Matthias.

    eadbd4d3-cbb4-42b8-8420-9f80dcde2bcc commented 9 years ago

    Thanks for accepting the patch. Looking forward to 3.6 ! :-)