petebachant / petebachant.github.io

My personal website.
http://petebachant.me

Classes article #21

Open petebachant opened 11 months ago

petebachant commented 11 months ago

If you need to use a class, return an instance of it from a function, and definitely don't make your users mutate it.

Title: When you need to solve a programming problem, please don't start with a class

Classes have their place, but I've seen some people get locked into the assumption that a class is the place to start. That assumption encourages overdesign. Most problems are well served by a procedural first solution. If you see long-lived state that is worth encapsulating, refactor to use a class later, but don't start with one.

class ThingDoer:
    def __init__(self, username, password):
        pass

    def do_thing(self):
        pass

    def do_other_thing(self):
        pass

Usage:

from thingdoer_module import ThingDoer

# The constructor requires credentials; these values are placeholders.
thingdoer = ThingDoer("my-username", "my-password")

thingdoer.do_thing()
thingdoer.do_other_thing()

Starting with procedures

def do_thing():
    # Get API token
    pass

def do_other_thing():
    # Get API token
    pass

Usage:

import thingdoer_module

thingdoer_module.do_thing()
thingdoer_module.do_other_thing()

Hey, we do the same thing twice (getting the API token), so we should refactor to avoid authenticating twice, assuming that's expensive. We don't need a class for that, though. Modules can also store state. Is this more complex or less?
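A minimal sketch of what "modules can also store state" could look like here, assuming the token is fetched lazily and cached in a module-level variable. The names `_authenticate`, `_get_token`, and `_auth_calls` are hypothetical, and the counter exists only to make the caching visible:

```python
# thingdoer_module.py (sketch)

_token = None    # cached API token; None until first needed
_auth_calls = 0  # counts authentications, only to make the caching visible


def _authenticate():
    """Hypothetical stand-in for an expensive authentication request."""
    global _auth_calls
    _auth_calls += 1
    return f"token-{_auth_calls}"


def _get_token():
    """Return the cached token, authenticating only on first use."""
    global _token
    if _token is None:
        _token = _authenticate()
    return _token


def do_thing():
    return f"did thing with {_get_token()}"


def do_other_thing():
    return f"did other thing with {_get_token()}"
```

The outer interface is unchanged: callers still write `thingdoer_module.do_thing()` and never see the cache.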

We start with a better outer interface in the second example. Why the hell should a user be forced to instantiate a ThingDoer?

petebachant commented 5 months ago

Another example of a weird OOP interface for a design tool: https://topfarm.pages.windenergy.dtu.dk/TopFarm2/notebooks/sgd_slsqp_comparison.html#Layout-Optimization-with-SGD-driver

petebachant commented 3 weeks ago

Maybe "how to refactor your 'doer' classes to be proper datatypes". Classes are for encapsulating state and should be named for the state they encapsulate. This exercise is good because it forces you to think about what state you're encapsulating and name the class appropriately.

class Calculator:
    def add(self, num1, num2):
        return num1 + num2

You're encapsulating no state here--just use a function!

What if you want to store the last result? You can still use a module, unless you really need multiple calculator instances. I can actually see this as a good use for a class because a calculator is a real thing. Maybe Adder is a better bad example.
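As a rough sketch of the module option, assuming a single shared calculator is enough: a plain `add` function plus one module-level variable for the last result. The names `_last_result` and `last_result` are hypothetical:

```python
# calculator.py (sketch)

_last_result = None  # module-level state: the most recent result, if any


def add(num1, num2):
    """Add two numbers and remember the result."""
    global _last_result
    _last_result = num1 + num2
    return _last_result


def last_result():
    """Return the most recent result, or None if nothing has run yet."""
    return _last_result
```

If you later need several independent calculators, that is the point at which a class starts to earn its keep.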

class ResultHistory:
    def append(self, command, result):
        pass

The main point should be: don't write procedures as classes. If you need intermediate state, put it in a result datatype. If you need to run only part of the procedure, or feed intermediate state into it, use a debugger or write pure functions.

import time

class MyLongProcedure:
    def step1(self):
        self.step1_result = int(time.time())

    def step2(self):
        self.step2_result = self.step1_result + 1

    def step3(self):
        self.step3_result = self.step2_result + 1

    def run(self):
        self.step1()
        # What if I crash here and want to inspect step1_result?
        self.step2()
        self.step3()

def my_long_procedure() -> dict:
    step1_result = 5
    step2_result = step1_result + 1
    step3_result = step2_result + 1
    return {
        "step1_result": step1_result,
        "step2_result": step2_result,
        "step3_result": step3_result,
    }

from pydantic import BaseModel  # assuming BaseModel is pydantic's

class LongProcedureResult(BaseModel):
    step1_result: int
    step2_result: int
    step3_result: int
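The "write pure functions" option could look something like the sketch below. It assumes each step depends only on the previous step's output, and it moves the impure `time.time()` call out to the caller by passing in a `timestamp` parameter, which is an assumption not present in the class version:

```python
def step1(timestamp: float) -> int:
    """Pure: the caller supplies the clock reading."""
    return int(timestamp)


def step2(step1_result: int) -> int:
    return step1_result + 1


def step3(step2_result: int) -> int:
    return step2_result + 1


def my_long_procedure(timestamp: float) -> dict:
    """Compose the steps and return every intermediate value."""
    step1_result = step1(timestamp)
    step2_result = step2(step1_result)
    step3_result = step3(step2_result)
    return {
        "step1_result": step1_result,
        "step2_result": step2_result,
        "step3_result": step3_result,
    }
```

Running part of the procedure is then just calling part of it, for example `step2(step1(5.9))`, and any intermediate value can be fed straight into a later step without constructing an object first.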

"But with a class, I can bundle the result with the code that generated it!" That's bad. That code shouldn't be run again, right? You will need to take care not to mutate the state and invalidate the result, or you may no longer know how it was generated. Imagine I run step1 again without the rest: step2_result and step3_result no longer match step1_result, so the entire object is corrupt. Make it easy on yourself: minimize mutable state in your application!
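One way to rule out that kind of corruption is to make the result immutable. The sketch below swaps in a frozen standard-library dataclass rather than a pydantic model, and the `start` parameter is an assumption added for illustration:

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class LongProcedureResult:
    step1_result: int
    step2_result: int
    step3_result: int


def my_long_procedure(start: int = 5) -> LongProcedureResult:
    step1_result = start
    step2_result = step1_result + 1
    step3_result = step2_result + 1
    return LongProcedureResult(step1_result, step2_result, step3_result)


result = my_long_procedure()

try:
    result.step1_result = 100  # attempted mutation after the fact
except FrozenInstanceError:
    mutation_blocked = True  # frozen dataclasses reject attribute assignment
else:
    mutation_blocked = False
```

The result carries no methods that could rerun a step, so the three values stay consistent with the single call that produced them.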

Use classes when your problem absolutely requires a class. Otherwise, write a big function and refactor later.

Your code will be simpler.

What if I want to use inheritance to reduce duplicate code because I have a bunch of similar procedures? Don't. Inheritance is bad, and duplicate code isn't that bad.

petebachant commented 2 weeks ago

https://www.confessionsofadataguy.com/is-python-oop-the-devil-or-savior/

petebachant commented 6 days ago

Write procedures as modules, because modules tend to be much less stateful.