Implement `sanitize` Method to Ensure Data Validity for Integration Systems

Description: The current mirroring methods only contain a standardize method to enforce consistent data formatting. However, to integrate with specific systems, we need a sanitize method to ensure that the target data is valid for the integration requirements. This method should be defined in the abstract class LedgerEngine with a default implementation that simply takes and returns the same DataFrame.

The sanitize method will provide flexibility for handling system-specific data validations, allowing custom sanitization rules to be applied in subclasses.

Tasks:

- Define the sanitize method in the abstract class LedgerEngine:
  - Method signature: sanitize(df: pd.DataFrame) -> pd.DataFrame
  - Provide a docstring to explain the method's purpose.
- Provide a simple default implementation in a subclass that returns the input DataFrame unchanged.
- Ensure the method is prepared for future extensions, where subclasses can override the sanitize method for system-specific requirements.
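
A minimal sketch of what these tasks could look like, assuming a heavily trimmed stand-in for LedgerEngine. PlainEngine, StrictEngine, the column names, and the "drop rows without an account" rule are hypothetical and only illustrate the override mechanism; they are not part of pyledger.

```python
from abc import ABC

import pandas as pd


class LedgerEngine(ABC):
    """Trimmed stand-in for the real abstract class (assumption)."""

    def sanitize(self, df: pd.DataFrame) -> pd.DataFrame:
        """Make target data valid for the integrated system.

        The base implementation returns the DataFrame unchanged.
        Subclasses override it to apply system-specific rules.
        """
        return df


class PlainEngine(LedgerEngine):
    """Hypothetical subclass that keeps the default behaviour."""


class StrictEngine(LedgerEngine):
    """Hypothetical subclass for a system that rejects rows without an account."""

    def sanitize(self, df: pd.DataFrame) -> pd.DataFrame:
        # Drop rows the (hypothetical) target system could not store.
        return df.dropna(subset=["account"]).reset_index(drop=True)


df = pd.DataFrame({"account": [1000, None, 2000], "amount": [10.0, 5.0, -10.0]})
unchanged = PlainEngine().sanitize(df)  # same object, untouched
cleaned = StrictEngine().sanitize(df)   # row with the missing account removed
```

Keeping the default as a pass-through means existing subclasses need no changes until they actually have a system-specific rule to enforce.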

Method Naming and Documentation

Overview

Great architecture choice to introduce an abstract method to pre-process data before mirroring, allowing derived classes to modify the target data to make it conformant with specific limitations of the system onto which the data shall be mirrored.

However, I am not convinced by the name "sanitize". Let's clarify the purpose and choose a name that better reflects it.

Purpose

This method serves as a preparatory step for mirroring data onto the current system: it adjusts the incoming (ledger, account, ...) data to align it with the data already stored. The method is invoked before the actual mirroring process. It receives incoming data compliant with the abstract pyledger definition and applies adjustments (adding, modifying, or deleting records as necessary) to ensure compatibility with the data structure of the current class. This adaptation makes it easier to compare the incoming target data with the existing data of the current class.
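
To illustrate the order of operations described here, a rough sketch: the preparation step runs first, and only then is the prepared target compared with the stored data. AssetStore, plan_asset_mirroring, and the ticker/increment columns are assumptions made up for this example and do not reflect the actual pyledger mirroring code.

```python
import pandas as pd


class AssetStore:
    """Hypothetical engine holding assets in memory (not the pyledger API)."""

    def __init__(self, assets: pd.DataFrame):
        self._assets = assets

    def prepare_assets_for_mirroring(self, df: pd.DataFrame) -> pd.DataFrame:
        # Default: nothing to adapt. Subclasses would override this.
        return df

    def plan_asset_mirroring(self, target: pd.DataFrame) -> dict:
        # Step 1: adapt the incoming data to this class's constraints.
        prepared = self.prepare_assets_for_mirroring(target)
        # Step 2: compare prepared target data with the stored data.
        merged = prepared.merge(
            self._assets, on="ticker", how="outer",
            suffixes=("_target", "_current"), indicator=True,
        )
        both = merged["_merge"] == "both"
        changed = merged["increment_target"] != merged["increment_current"]
        return {
            "add": sorted(merged.loc[merged["_merge"] == "left_only", "ticker"]),
            "remove": sorted(merged.loc[merged["_merge"] == "right_only", "ticker"]),
            "modify": sorted(merged.loc[both & changed, "ticker"]),
        }


current = pd.DataFrame({"ticker": ["CHF", "EUR", "USD"], "increment": [0.01, 0.01, 0.01]})
target = pd.DataFrame({"ticker": ["EUR", "USD", "JPY"], "increment": [0.01, 0.001, 1.0]})
plan = AssetStore(current).plan_asset_mirroring(target)
```

Because the comparison only ever sees prepared data, differences caused purely by system limitations do not show up as spurious additions or modifications.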

Naming

Proposed name:

prepare_XY_for_mirroring, where XY stands for "ledger", "accounts", etc. This name reflects the function’s role in readying data for synchronization.

Alternative naming ideas:

align_XY_for_mirroring
adapt_XY_to_class_constraints

Doc String:

Proposal for a clarified docstring:

def prepare_assets_for_mirroring(self, df: pd.DataFrame) -> pd.DataFrame:
    """Aligns incoming asset data with the current system's constraints.

    Invoked as the initial step in the mirroring process, this method prepares 
    asset data for integration into the current system. It adapts the incoming 
    data to specific storage requirements and aligns it with existing data, making 
    it easy to identify entries that need to be added, modified, or removed.

    By default, this method only standardizes the data via standardize_assets.
    Subclasses may override it to apply class-specific adaptations as required.

    Args:
        df (pd.DataFrame): Incoming asset data.

    Returns:
        pd.DataFrame: Adjusted data ready for synchronization with the current system.
    """
    return self.standardize_assets(df)

Note:

Removed '@classmethod' decorator. Adaptation of data to the current class might require knowledge of the current instance (e.g. the transitory account to use when introducing additional ledger entries).
No need to call standardize_assets in the mirroring function. It's now part of the preparation step.
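
A sketch of why an instance method is useful here, under loudly stated assumptions: the balancing rule, the TransitoryLedger class, the column names, and the stand-in standardize_ledger are all invented for illustration. The only point taken from the note above is that the override reads instance state (the transitory account) and that standardization happens inside the preparation step.

```python
import pandas as pd


class LedgerEngine:
    """Trimmed stand-in for the abstract class (assumption)."""

    def standardize_ledger(self, df: pd.DataFrame) -> pd.DataFrame:
        # Stand-in for the real standardize method: enforce column types only.
        return df.astype({"account": "int64", "amount": "float64"})

    def prepare_ledger_for_mirroring(self, df: pd.DataFrame) -> pd.DataFrame:
        # Standardization is part of the preparation step.
        return self.standardize_ledger(df)


class TransitoryLedger(LedgerEngine):
    """Hypothetical system that only accepts balanced ledger data."""

    def __init__(self, transitory_account: int):
        self.transitory_account = transitory_account

    def prepare_ledger_for_mirroring(self, df: pd.DataFrame) -> pd.DataFrame:
        df = super().prepare_ledger_for_mirroring(df)
        imbalance = df["amount"].sum()
        if imbalance == 0:
            return df
        # Needs instance state, which a classmethod could not access.
        balancing = pd.DataFrame(
            {"account": [self.transitory_account], "amount": [-imbalance]}
        )
        return pd.concat([df, balancing], ignore_index=True)


incoming = pd.DataFrame({"account": [1000, 2000], "amount": [100.0, -60.0]})
prepared = TransitoryLedger(transitory_account=9999).prepare_ledger_for_mirroring(incoming)
```

Calling super() first keeps the standardization in one place, so an override only has to add its own class-specific adjustments.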
