networktocode / diffsync

A utility library for comparing and synchronizing different datasets.
https://diffsync.readthedocs.io/

Feature request: self-referencing models #238

Closed jamesharr closed 1 year ago

jamesharr commented 1 year ago

Environment

Proposed Functionality

Support self-referencing models, where an instance of a model can be a child of another instance of the same model, to allow hierarchies of arbitrary depth.

For the most part, DiffSync seems to support this,

Use Case

Note that we're currently not sure whether we need this functionality. We're exploring options, and our problem might be solved by a flat object hierarchy with object ordering. Nevertheless, I figured I'd start a discussion thread about it in case others have a need for this and/or this is actually a bug.
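
To make the "flat object hierarchy with object ordering" alternative concrete, here is a minimal sketch in plain Python. It does not use DiffSync; it assumes that tenant names encode their position in the tree with `/` separators (as in the example model), and the helper names `parent_of` and `parents_first` are hypothetical.

```python
from typing import Dict, List, Optional


def parent_of(name: str) -> Optional[str]:
    """Return the parent path of a slash-separated name, or None for a root."""
    head, sep, _tail = name.rpartition("/")
    return head if sep else None


def parents_first(names: List[str]) -> List[str]:
    """Order names so that every parent precedes its descendants."""
    # Sorting by depth, then by name, keeps the order deterministic.
    return sorted(names, key=lambda n: (n.count("/"), n))


# Flat store: no tenant holds a list of children.
tenants: Dict[str, str] = {
    "a/b/c": "See you later",
    "a": "All the things",
    "a/b": "Buzz",
}

ordered = parents_first(list(tenants))
for name in ordered:
    print(name, "parent:", parent_of(name), "display:", tenants[name])
```

With this layout each tenant appears exactly once, and the ordering only has to guarantee that a parent is processed before its children.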

Example use-cases:

Example model with some test code.

from __future__ import annotations

from typing import List

import structlog
from diffsync import DiffSync, DiffSyncModel
from structlog.stdlib import BoundLogger

class Tenant(DiffSyncModel):
    _modelname = "tenant"
    _identifiers = ("name",)
    _shortname = ()
    _attributes = ("display",)
    _children = {"tenant": "children"}

    children: List[Tenant] = []

    name: str
    display: str

class TestBackend(DiffSync):
    tenant = Tenant
    top_level = ["tenant"]
    logger: BoundLogger

    def __init__(self, logger=None, dry_run=None):
        super().__init__()

        self.logger = structlog.get_logger("TestBackend")

    def load1(self) -> None:
        """Load an example tree"""

        # Create some sample things
        t1 = Tenant(name="a", display="All the things")
        self.add(t1)

        t2 = Tenant(name="a/b", display="Buzz")
        t1.add_child(t2)
        self.add(t2)

        t3 = Tenant(name="a/b/c", display="See you later")
        t2.add_child(t3)
        self.add(t3)

    def load2(self) -> None:
        """Load an example tree, similar to load1(), except with some attribute changes"""

        # Create some sample things
        t1 = Tenant(name="a", display="Al the tings")
        self.add(t1)

        t2 = Tenant(name="a/b", display="bzzzz")
        t1.add_child(t2)
        self.add(t2)

        t3 = Tenant(name="a/b/c", display="See you now")
        t2.add_child(t3)
        self.add(t3)

def demo1():
    be1 = TestBackend()
    be1.load1()

    be2 = TestBackend()
    be2.load2()

    # Preview diff
    diff = be1.diff_to(be2)
    print(diff.str())
    print(diff.dict())
    # NOTE: The diff shows some duplicates

    # Sync
    be1.sync_to(be2)
    # Note that `a/b` is updated 2 times
    # Note that `a/b/c` is updated 3 times
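
One possible explanation for the repeated updates, sketched in plain Python without DiffSync: `tenant` is listed in `top_level`, so every stored tenant may be treated as a root, and each one is also reached again through its parent's `children`. This is only a model of such a traversal under that assumption, not a description of DiffSync's confirmed internals; the `children`, `top_level`, and `visit` names below are hypothetical stand-ins.

```python
from collections import Counter
from typing import Dict, List

# Hypothetical stand-in for the tree built by load1(): name -> child names.
children: Dict[str, List[str]] = {
    "a": ["a/b"],
    "a/b": ["a/b/c"],
    "a/b/c": [],
}

# Assumption: every tenant is a top-level entry, because its type is in top_level.
top_level: List[str] = ["a", "a/b", "a/b/c"]

visits: Counter = Counter()


def visit(name: str) -> None:
    """Record a visit to `name`, then recurse into its children."""
    visits[name] += 1
    for child in children[name]:
        visit(child)


# Walk the tree once from every top-level entry.
for name in top_level:
    visit(name)

print(dict(visits))
```

In this model the visit counts are 1 for `a`, 2 for `a/b`, and 3 for `a/b/c`, which lines up with the update counts noted in `demo1()`.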

Kircheneer commented 1 year ago

I believe there is at least some overlap with #225 here - can you check if that would work for you as well?

jamesharr commented 1 year ago

Duh, yeah, this is a duplicate