macxred / pyledger

Python package to streamline the implementation and management of accounting systems.
MIT License

Implement `sanitize` Method to Ensure Data Validity for Integration Systems #45

Open AlexTheWizardL opened 1 week ago

AlexTheWizardL commented 1 week ago

Description: The current mirroring methods only contain a standardize method to enforce consistent data formatting. However, to integrate with specific systems, we need a sanitize method to ensure that the target data is valid for the integration requirements. This method should be defined in the abstract class LedgerEngine with a basic implementation that simply takes a DataFrame and returns it unchanged.

The sanitize method will provide flexibility for handling system-specific data validations, allowing custom sanitization rules to be applied in subclasses.

Tasks:

  1. Define the sanitize method in the abstract class LedgerEngine:
    • Method signature: sanitize(df: pd.DataFrame) -> pd.DataFrame
    • Provide a docstring to explain the method's purpose.
  2. Provide a simple default implementation in a subclass that returns the input DataFrame unchanged.
  3. Ensure the method is prepared for future extensions, where subclasses can override the sanitize method for system-specific requirements.
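
The tasks above could look roughly like the following. This is a minimal sketch, not the actual pyledger code: `LedgerEngine` is reduced to a stand-in with a single method, and the subclasses `MemoryLedger` and `ShortTextLedger`, the `text` column, and the 10-character limit are all hypothetical names and values chosen for illustration.

```python
from abc import ABC

import pandas as pd


class LedgerEngine(ABC):
    """Simplified stand-in for pyledger's abstract base class."""

    def sanitize(self, df: pd.DataFrame) -> pd.DataFrame:
        """Return data that is valid for the target system.

        The default applies no system-specific rules and returns
        the input DataFrame unchanged.
        """
        return df


class MemoryLedger(LedgerEngine):
    """Hypothetical subclass that relies on the pass-through default."""


class ShortTextLedger(LedgerEngine):
    """Hypothetical subclass for a system that limits text to 10 characters."""

    def sanitize(self, df: pd.DataFrame) -> pd.DataFrame:
        # Work on a copy so the caller's DataFrame is not modified.
        result = df.copy()
        result["text"] = result["text"].str.slice(0, 10)
        return result
```

With this layout, a subclass without special requirements inherits the pass-through behaviour, and only systems with their own restrictions need to override `sanitize`.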

lasuk commented 6 days ago

Method Naming and Documentation

Overview

Great architecture choice to introduce an abstract method to pre-process data before mirroring, allowing derived classes to modify the target data to make it conformant with specific limitations of the system onto which the data shall be mirrored.

However, I am not convinced by the name "sanitize". Let's clarify the purpose and choose a name that better reflects it.

Purpose

This method serves as a preparatory step for data mirroring onto the current system by adjusting the incoming (ledger, account, ...) data to align it with the data already stored. The method is invoked before the actual mirroring process. It receives incoming data compliant with the abstract pyledger definition and applies adjustments (adding, modifying, or deleting records as necessary) to ensure compatibility with the data structure of the current class. This adaptation makes it easier to compare the incoming target data with the existing data of the current class.
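
To illustrate why aligned data is easier to compare, here is a rough sketch of the comparison step that could follow the preparation. It is not taken from pyledger: the helper `diff_records` and the `account` and `text` columns are hypothetical, and the sketch assumes both DataFrames share the same columns, have unique keys, and contain no missing values.

```python
import pandas as pd


def diff_records(target: pd.DataFrame, current: pd.DataFrame, key: str) -> dict:
    """Classify keys as to be added, removed, or modified (hypothetical helper)."""
    merged = target.merge(
        current,
        on=key,
        how="outer",
        suffixes=("_target", "_current"),
        indicator=True,  # adds a "_merge" column: left_only, right_only, both
    )
    to_add = merged.loc[merged["_merge"] == "left_only", key].tolist()
    to_remove = merged.loc[merged["_merge"] == "right_only", key].tolist()

    # For keys present on both sides, flag rows where any value column differs.
    both = merged[merged["_merge"] == "both"]
    changed = pd.Series(False, index=both.index, dtype=bool)
    for col in (c for c in target.columns if c != key):
        changed |= both[f"{col}_target"] != both[f"{col}_current"]
    to_modify = both.loc[changed, key].tolist()

    return {"add": to_add, "remove": to_remove, "modify": to_modify}
```

If the incoming data were not adapted first, values the target system stores differently (for example truncated text) would show up as spurious modifications on every run.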

Naming

Proposed name:

Alternative naming ideas:

Doc String:

Proposal for a clarified docstring:

def prepare_assets_for_mirroring(self, df: pd.DataFrame) -> pd.DataFrame:
    """Aligns incoming asset data with the current system's constraints.

    Invoked as the initial step in the mirroring process, this method prepares 
    asset data for integration into the current system. It adapts the incoming 
    data to specific storage requirements and aligns it with existing data, making 
    it easy to identify entries that need to be added, modified, or removed.

    By default, this method returns the data unchanged. Subclasses may override 
    it to apply class-specific adaptations as required.

    Args:
        df (pd.DataFrame): Incoming asset data.

    Returns:
        pd.DataFrame: Adjusted data ready for synchronization with the current system.
    """
    return self.standardize_assets(df)

Note: