macxred / pyledger

Python package to streamline the implementation and management of accounting systems.
MIT License
0 stars 0 forks source link

Improve Navigation in Large Class Definitions #3

Closed lasuk closed 1 month ago

lasuk commented 2 months ago

Description

Some of our units contain moderately large class definitions, making them challenging to navigate.

Background

I consulted ChatGPT regarding practices for spreading large class definitions across multiple code units in open-source Python projects. We already use two of the suggested methods in our current practices, and the third is not suitable for our context.

  1. Inheritance: See the class hierarchy section in our package README.
  2. Composition: See CashCtrlLedger._client.
  3. Mixins were also suggested, but they do not fit our use case since our ledger system is a coherent set of information that should not be split into sub-fields implemented as mixins.
Click for full ChatGPT thread.

# Python Class Definition Across Multiple Units ## User Query Can a large class definition in Python be spilt across multiple units of code? How? Is this considered good practice? How often is this done in open source python packages, and where? [After the question is answered:] Provide a markdown file of this thread for download and later insertion into a GitHub issue. ## Response Yes, a large class definition in Python can be split across multiple units of code, typically achieved by using inheritance, composition, or mixins. Here’s how each method works and its implications: ### 1. Inheritance Inheritance allows you to define a base class with shared functionality and then extend this base class in one or more subclasses. **Example:** ```python # base_class.py class BaseClass: def __init__(self, name): self.name = name def common_method(self): print(f"Common method in {self.name}") # sub_class.py from base_class import BaseClass class SubClass(BaseClass): def specific_method(self): print(f"Specific method in {self.name}") # usage.py from sub_class import SubClass obj = SubClass("Example") obj.common_method() obj.specific_method() ``` ### 2. Composition Composition involves creating classes that contain instances of other classes. This helps in breaking down the functionality into smaller, more manageable units. **Example:** ```python # engine.py class Engine: def start(self): print("Engine started") # car.py from engine import Engine class Car: def __init__(self): self.engine = Engine() def drive(self): self.engine.start() print("Car is driving") # usage.py from car import Car car = Car() car.drive() ``` ### 3. Mixins Mixins are a way to add common functionality to multiple classes without using inheritance. They are small, focused classes that provide specific functionality. **Example:** ```python # printable.py class PrintableMixin: def print_info(self): print(f"Info: {self.info}") # entity.py class Entity: def __init__(self, info): self.info = info # person.py from printable import PrintableMixin from entity import Entity class Person(Entity, PrintableMixin): def __init__(self, name): super().__init__(info=name) # usage.py from person import Person person = Person("John Doe") person.print_info() ``` ### Good Practice Considerations 1. **Clarity and Maintainability**: Splitting classes can make code more modular, easier to understand, and maintain. However, overusing these techniques can lead to a fragmented codebase that is difficult to navigate. 2. **Functionality Grouping**: Group related functionality together. If the functionality is too scattered, it might indicate a design flaw. 3. **Documentation**: Ensure proper documentation and comments, as splitting code across files can make it harder to understand the flow and relationships. ### Usage in Open Source Python Packages Splitting class definitions is commonly found in large and complex open-source projects. Examples include: 1. **Django**: Uses mixins extensively in its class-based views to provide various pieces of functionality. 2. **Flask**: Utilizes blueprints and extensions to keep the core functionality modular and maintainable. 3. **Pandas**: The library's core data structures (Series, DataFrame) and methods are split into various modules and files for better organization. To see examples, you can explore the source code of these projects on GitHub: - [Django GitHub Repository](https://github.com/django/django) - [Flask GitHub Repository](https://github.com/pallets/flask) - [Pandas GitHub Repository](https://github.com/pandas-dev/pandas) In these repositories, you will notice how classes and functionalities are modularized to maintain a clean and manageable codebase.

Defining large classes in a single module seems to be common practice. The pandas project, for example, has opted for defining their DataFrame functionality in a single unit, with the module frame.py stretching over 13,640 lines.

I conclude that the scope of our classes and their attribution to Python units is sensible.

Proposal

We can enhance the readability and navigability of our class definitions by adopting comments to divide the long unit into sections, improving navigability.

Django:


# These methods are part of the public API, with default implementations.

def ...

Pandas:


# ----------------------------------------------------------------------
# Constructors

def ...

However, using comments to structure code does not seem to be usual for open source python projects. Here are a few examples of large units without structuring comments:

Suggested Improvement

Adopt comments similar to pandas to group functionality of our ledger classes into sections (e.g., VAT, accounts, ledger, price, etc.) using comments. This will make it easier for developers to navigate and maintain the code.

Action Items

  1. Identify logical sections within each large class.
  2. Add clear, descriptive comments to delineate these sections.
  3. Ensure consistency in the style and format of these comments across the codebase.