python-attrs / attrs

Python Classes Without Boilerplate
https://www.attrs.org/
MIT License
5.23k stars 364 forks source link

Programmatic creation of attributes with invalid names appears to be possible. #145

Open gvoysey opened 7 years ago

gvoysey commented 7 years ago

in an attempt to make an attrs-decorated class programmatically from the output of the docopt argument parser, I discovered a situation where it is possible to create an attrs class whose attributes appear to have names that contain invalid characters.

In the code below, input_args, the instance of the class InputArgs, contains an attribute ostensibly named print-config, and I have no idea how to access the value of that attribute later, since it gets tokenized to input_args.print - config, which attempts to subtract an unintialized variable from a nonexistent attribute. The problem, I think, is that print-config is a perfectly valid string key in a dict, but it's an invalid attribute name.

I'm happy enough to sanitize things on my end, but I feel like this is a small example of a general issue that I don't understand enough of.

import attr

def from_docopt_dict(docopt_dict):
    # sanitize the keys: can't start with any instances of '-', like how every key in this dict starts.

    for key in docopt_dict:
        clean = key.lstrip('-')
        docopt_dict[clean] = docopt_dict.pop(key)

    # make a class out of it    
    retval = attr.make_class(name='InputArgs', attrs=docopt_dict)()
    return retval

if __name__ == "__main__":
    docopt_dict= {'--config': None, 
        '--help': False,  
        '--input': 'C:\\\\Users\\\\Foo\\\\Documents',  
        '--output': None, 
        '--print-config': False,  
        '--verbose': False,  
        '--version': False}
    input_args = from_docopt_dict(docopt_dict)
    print("dict:")
    print(docopt_dict)
    print(input_args)
hynek commented 7 years ago

The general issue is that attrs doesn’t sanitize attribute names for you at the moment. There’s a similar bug in the tracker too: #107

gvoysey commented 7 years ago

language of consenting adults, i suppose.

i've rolled my own on my end for now, which raises an error if if encounters a name that's against the python3 spec. On Sat, Feb 11, 2017 at 01:18 Hynek Schlawack notifications@github.com wrote:

The general issue is that attrs doesn’t sanitize attribute names for you at the moment. There’s a similar bug in the tracker too: #107 https://github.com/hynek/attrs/issues/107

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/hynek/attrs/issues/145#issuecomment-279124758, or mute the thread https://github.com/notifications/unsubscribe-auth/ADeR7w_IKlUKzvU608y_Gt5C39SQ393Gks5rbVKcgaJpZM4L99mx .

hynek commented 7 years ago

how complicated is that code?

gvoysey commented 7 years ago

It's ugly, but not complicated.

I scrounged it from this stack answer, and dumped it in https://github.com/gvoysey/attrs-utils , which is my collection of attrs-related stuff that i don't think is cool enough to show off yet ;-)

Right now it's implemented as private methods here.

altendky commented 6 years ago

@gvoysey AFAIK valid attribute, uh, things, and attribute names you can type in code are not the same set. You can getattr(the_object, 'some string-with . all sorts? of weird stuff') to get attributes identified by strings or the_object.__dict__[None] for arbitrary (hashable) objects. I certainly discourage doing these things but they can be done.

https://repl.it/@altendky/ReliableBurdensomeDownloads-1

crazy_attribute_name = 'some string-with . all sorts? of weird stuff'

class X:
    def __init__(self, that_thing, another):
        setattr(self, crazy_attribute_name, that_thing)
        self.__dict__[None] = another

x = X(42, 'blue')

print(getattr(x, crazy_attribute_name))
print(x.__dict__[None])

That said, it may still be nice for attrs to... raise an exception? sanitize? be user configurable as to which to do?

Since I bothered to check I'll include the link to valid identifiers here as the same rules apply to attribute names you can write in code.

https://docs.python.org/3/reference/lexical_analysis.html#identifiers

wsanchez commented 6 years ago

I think raising an ValueError or similar is more appropriate than sanitizing the given data. Sanitizing the data is likely to just delay an error to a less-obviously-related-to-the-problem part of the caller's code.

We can provide a sanitizing function if one wants to do that explicitly and we think it's attrs' job to provide that (I'm not so sure).