brettviren / moo

ruminants on module oriented programming
GNU General Public License v3.0
4 stars 4 forks source link

Nested records become generic `dict` in `otypes` #38

Open brettviren opened 1 year ago

brettviren commented 1 year ago

@plasorak reports that a record held in the field of another record becomes represented merely as a dict when the parent record is instantiated through otypes.

import os
import moo

here = os.path.dirname(__file__)
moo.io.default_load_path = [here]

def test_with_ab():
    moo.otypes.load_types("./issue16-schema.jsonnet")
    from test.issue16 import A, B, AB
    ab = AB(a=A(), b=B())
    print(type(ab), type(ab.a), type(ab.b))
    assert not type(ab) == dict
    assert not type(ab.a) == dict

That last line asserts with:

E       AssertionError: assert not <class 'dict'> == dict
E        +  where <class 'dict'> = type({'name': 'name'})
E        +    where {'name': 'name'} = <record AB, fields: {a, b}>.a
brettviren commented 1 year ago

After reviewing moo code it appears that this behavior is intentional, for better or worse.

A record field object is actually stored in the record object as an otype but when that field is accessed as an attribute the .pod() method is called on the field object and its result is returned as plain-old-data.

It's actually a trivial change to make a Record attribute access return an otype by just not calling its .pod(). However, as-is, this breaks many of the unit tests and, presumably, existing code in the wild.

For example, a Record field holds an otypes.Sequence. When this type is returned, the calling code assumes now it is list-like (because it is indeed a list as converted by the .pod() call). However, Sequence itself does not provide a __getitem__() method and when it is returned as an otype any attempt to access elements by their index will fail.

So, allowing otype types to be exposed would require all of them to fully type-pun the type they represent. Eg, Number would need methods to allow it to participate in arithmetic. This is possible to add but needs a rather exhausting coding session. And, the result would suffer performance problems. Eg, each time a Number is used in arithmetic, a .pod() call would be needed.

Really, having Record provide attribute access in the first place was a mistake as it fools the user into expecting more! The intention with otypes was to provide a constructive pattern followed by a .pod() conversion to POD.

At the end, I think the only difference we are talking about here is allowing o.field vs o["field"] syntax?

Perhaps the solution is to adopt or develop some kind of POD-to-object converter? Something simple based on use of namedtuple to replace dict is probably feasible.

Here is an interesting hack I found:

https://stackoverflow.com/a/15882327

import json
from collections import namedtuple
pod = ab.pod()
print(pod)
# {'a': {'name': 'name'}, 'b': {'count': 42}}
jtext = json.dumps(pod)
ab2 = json.loads(jtext, object_hook=lambda d: namedtuple('X', d.keys())(*d.values()))
print (ab2)
# X(a=X(name='name'), b=X(count=42))
print (ab2.b.count)
# 42

Perhaps something slightly fancier which can use a more relevant name than X could be devised.