kbilsted / StatePrinter

Automating unit testing and ToString() coding
Apache License 2.0
92 stars 31 forks source link

Path-based references #65

Open OmegaJak opened 2 years ago

OmegaJak commented 2 years ago

The current method used for avoiding cycles and duplicate printing, using index-based references, is not serving us well in the use case of StatePrinter combined with ApprovalTests as a way of ensuring object state doesn't change over time. If one new reference is added or removed somewhere 'early' in the object, the indices of all following references are incremented, making it very difficult to tell what really changed. For example, if we have this printed object:

Root = new RootObject()
{
    _container = new Container(), ref: 1
    {
        _objects = new List<Object>()
        {
            [0] = new A(), ref: 0
            {
            }
        }
        _a = -> 0
    }
    _otherObject = new OtherObject()
    {
        _container = -> 1
    }
    [lots more objects with lots more references]
}

If the reference to the object A() is removed, all following reference indices will be decremented. When looking at the change to the object, there will be a diff for each line in the file that previously had a reference on it. For very large files (1000s of lines), it becomes tedious to spot the real change, especially if there are other real changes later in the file.

I propose a 'path-based' way of referencing instead. The above example would then look something like this:

Root = new RootObject()
{
    _container = new Container()
    {
        _objects = new List<Object>()
        {
            [0] = new A()
            {
            }
        }
        _a = -> Root._container._objects[0]
    }
    _otherObject = new OtherObject()
    {
        _container = -> Root._container
    }
    [lots more objects with lots more references]
}

With this approach, changes wouldn't ripple to unrelated objects like described above.

I'm not sure if it's better to do an absolute path from root (as shown above) or a relative path. Maybe that could be user-configurable. With relative paths, the above references could then look something like this._objects[0] and ^._container.

If I get the time, I might take a first stab at implementing this. I imagine it will require some fairly substantial changes in Introspector.

kbilsted commented 2 years ago

I think it is a good idea. The path-id was a shortcut to getting something out for people to use.

It certainly will be an implementation challenge when you have composite keys. I think the most common are int, tuple and string, but in the odd case you could have objects as keys! That could mean 10+ fields to denote a key!

So keys may end up rather lenghty, but as you mention, at least they are stable. So I think we shall opt for stability over readability.

So either do not support objects as keys (other than a select few), or perhaps we can add configuration to filter fields when they are to be shown as a composite-key path. Likely, only a "primarykey" will suffice in the path.

interesting test cases of keys

I wonder how well you traversal will be with the following the minute you have obscure datastructures such as

student ->
      Dictionary<tuple<string,int>,list>->
            List<Person>
                 -> Dictionary<Teacher, Grades> 
                       -> Grade
                             -> student
OmegaJak commented 2 years ago

Oh that's an excellent point. I hadn't thought of complex objects as keys... that significantly complicates the problem.

Maybe an easy first implementation would only use the new approach if the entire path is direct references or inside collections with keys that are treated as "Simple Values" by StatePrinter. Otherwise, use the old indexed approach.