te25son / Jokes

CLI app for jokes
0 stars 0 forks source link

Refactor joke models #19

Closed te25son closed 2 years ago

te25son commented 2 years ago

Refactored the joke models and changed the way the models should work in general.

The Problem

I was using a lot of pattern matching in different places to validate whether the joke was a SINGLE type or a TWOPART type. This meant that if I were ever to add another joke type, I would have to find and update all of these points of validation logic. Annoying. It also meant the the model was doing a lot more logic that it should be doing. IMO the models should only represent the data, the logic should be included elsewhere. And finally, I wasn't leveraging pydantic to its best, and I included a bunch of tricky logic that could very easily get confusing (if it wasn't already).

The Solution

Create models that represent the types of jokes, i.e. SINGLE and TWOPART and use pydantic as the primary validator.

The previous code used custom validation for each joke type within a pattern matching statement.

case Type.TWOPART:
    assert setup is not None, "Setup field must be included in a twopart joke."
    assert delivery is not None, "Delivery field must be included in a twopart joke."
    assert joke is None, "Joke field cannot be included in a twopart joke."

This meant that if I were to add another type, say KNOCKKNOCK, I would have to add more validation into each statement so the above code might look like this:

case Type.TWOPART:
    assert setup is not None, "Setup field must be included in a twopart joke."
    assert delivery is not None, "Delivery field must be included in a twopart joke."
    assert joke is None, "Joke field cannot be included in a twopart joke."
    assert knock_knock_part1 is None, "knock_knock_part1 field cannot be included in a twopart joke."
    assert knock_knock_part2 is None, "knock_knock_part2 field cannot be included in a twopart joke."
    assert knock_knock_part3 is None, "knock_knock_part3 field cannot be included in a twopart joke."
    assert knock_knock_part4 is None, "knock_knock_part4 field cannot be included in a twopart joke."
    assert knock_knock_part5 is None, "knock_knock_part5 field cannot be included in a twopart joke."

I would also have to add a case for the knock knock joke type. This is ridiculous and can spiral out of control very quickly.

Luckily pydantic handles field validation for us. So instead of having a single joke class, there can be several that each represent a joke type. That way, when a joke is created, the joke class will have the necessary fields and check that those are present in the creation data.

This translates into

class JokeBase(BaseModel):
    ...

class JokeSingle(BaseModel):
    joke: str

class JokeTwopart(BaseModel):
    setup: str
    delivery: str

Not only does this make it clearer what class represents what, it also forces the logic of creating each joke type outside of the base joke class, making it less of a mystery.

And if I wanted to add another joke type it would be as easy as this:

class JokeKnockKnock(BaseModel):
    knock_knock_part1: str
    knock_knock_part2: str
    knock_knock_part3: str
    knock_knock_part4: str
    knock_knock_part5: str

No more adding custom validation and tricky logic, and crossing my fingers hoping for the best.

Now, I said the changes force the logic of creating each joke outside the base joke class. This is only partially true. The JokeBase class contains a method called match_type which will match the validated type and create a joke based on that type. However, the parameters required to call the method ask for functions. So which the match method is inside the base class, the logic of the create methods is still passed outside the class.

Why do this?

Basically there were a few options. One was to create a simple method that took a type, data, and actions to create the classes based on the type. This would have looked something like this:

def match_type(type: Type, data: dict, single_action: Callable[[dict], T1], twopart_action: Callable[[dict], T2]) -> T1 | T2:
    match type:
        case Type.SINGLE: return single_action(data),
        case TYPE.TWOPART: return twopart_action(data),
        case _: raise

So similar to what is here, but with more parameters.'

The reason I didn't like this is because it required me to validate the data with the JokeBase first then get and convert the type again after validation. So before calling this match_type function, I would've had to first call JokeBase(**data) then type = Type[data.get("type").upper()]. Since JokeBase is already validating, storing, and converting the type to uppercase, I felt this was an opportunity to leverage it a bit better. By adding the extra field in the Config class, I ensure that the data being passed to the base class is being kept intact. This means that `JokeBase(data).dict() == data`.** Without this extra field, the data not represented by fields in the class is dropped.

Now I can create my models in a smooth flow of actions. So instead of

JokeBase(**data)
type = Type[data.get("type").upper()]
match_type(
    type=type,
    data=data,
    single_action=lambda d: JokeSingle(**d),
    twopart_action=lambda d: JokeTwopart(**d)
)

I now can use

JokeBase(**data).match_type(
    single_action=lambda d: JokeSingle(**d),
    twopart_action=lambda d: JokeTwopart(**d)
)

🎉 🥳