SasView / sasview

Code for the SasView application.
BSD 3-Clause "New" or "Revised" License
51 stars 41 forks source link

Assure all classes have well defined contracts #2071

Open lucas-wilkins opened 2 years ago

lucas-wilkins commented 2 years ago

With the end of using a disciplined, object oriented approach, dynamically changing an instance's public-facing fields should be avoided.

Without a static definition of class memebers it is difficult to interrogate code off-line, as neither you nor automated tools can look at a classes code and know what attributes it has. It is poor from the perspective of readability and maintainability.

Additionally: 1) this breaks a lot of developer tools that are designed for providing feedback about incorrect codes 2) It breaks the assumptions of the typing system 3) It makes automatic refactoring very difficult - they are bugs waiting to happen

The task:

Find all hasattr and setattr, and replace with code that does not require this (minimally, optional defaults).

pkienzle commented 2 years ago

You can require that a class predefine its public attributes by inheriting from the following (untested):

class Locked:
    def __setattr__(self, key, value):
        if not key.startswith('_') and not hasattr(self.__class__, key):
            raise AttributeError(f"Cannot add attribute {key} to class {self.__class__.__name__}")
        object.__setattr__(self, key, value)

This does a runtime check so it slows down the code but it may be worthwhile in a distributed project with part-time developers.

Lint will flag any assignment to an attribute outside of __init__ but this is not enforced at runtime. Python 3.8 added a final keyword to the type system which flags attempts to override attributes and methods in subclasses.

For more safety use non-mutable objects such as tuple rather than list. There is no frozendict but you can use types.MappingProxyType(mapping) to get a read-only view on a dict.

lucas-wilkins commented 2 years ago

That might be worth doing in a few choice places, where people might be most tempted, once the offending attributes are reified/relocated. I don't think the overhead is a particular issue compared to the general slowness of python. If it was an issue, could always make it an assertion so that it can be turned off with a runtime flag. I would also discourage it for private dynamically created attributes, the same issues exist, they just don't leak outside of the code and infect other things.

My preference in most cases would be to just fix the instances where it exists, and let enough people in the community know why its an issue that new cases get stopped after a PR.

final is a bit dangerous. IMO whatever is declared final really has to be final for it not to just get in the way. As such, it really only finds its proper place in core modules who's behaviour is set in stone and really needs to be protected.

I would go further in your tuple suggestion and say that unless something really is a list, i.e. a arbitrary sized collection of objects of the same type, it should be a namedtuple (a properly typed one too). Might be asking a bit much though.

pkienzle commented 2 years ago

It might be fun to write an import hook which scans the imported sas/sasmodels modules and inserts the __setattr__ method into any class defined in that file that doesn't already have a __setattr__.

You can do other things such as check that each symbol has a type signature. It's like lint, but with teeth.

lucas-wilkins commented 2 years ago

LOL - this is never happening

rozyczko commented 2 years ago

reopened since I think this is a valid issue and should be addressed at some point. Setting attributes dynamically is a nightmare to debug and if we get rid of it, our lifes will be so much easier...

llimeht commented 2 years ago

I'd add to this item, the contract between classes also needs to distinguish between class and instance variables. In looking at classes, we should be removing places where class variables are declared and are then later shadowed by instance variables.

The codebases have lots of instances of:

class Foo:
    myvar: str
    def __init__(self):
        self.myvar = "Bar"

That's also a nightmare to debug and also breaks any tools that might help. Some of these are needed, but many are not.

I can dream of one day running pylint and getting useful results for sasview...