antocuni / capnpy

Add as_dict() method to struct #32

Open kawing-chiu opened 6 years ago

kawing-chiu commented 6 years ago

namedtuple has the method _asdict(), and pycapnp has to_dict(). I think we should add something equivalent to capnpy that directly converts a struct into an OrderedDict.

colinfang commented 6 years ago

What happens if the struct contains an unnamed union, or if it is a nested struct?

colinfang commented 6 years ago

My colleagues do sometimes find to_dict useful for simple, plain structs. Currently we add the methods via _extended.py.

antocuni commented 6 years ago

I think that it's not so easy to design something which has a reasonable behavior w.r.t. all the possible combinations of capnproto features. Some random thoughts:

I am sure that, depending on the exact use case, you would need slightly different answers to the questions above. So, I am tempted to say that this feature should not be part of the capnpy core, at least for now. It would be nice to have it as an external library or plugin: then, as @colinfang says, you can easily integrate it into your schema using *_extended.py.

kawing-chiu commented 6 years ago

Well...I wrote this without noticing your replies...

I haven't really used these advanced features of capnp yet. Will have a look tomorrow~

kawing-chiu commented 6 years ago

I've investigated the issue a bit more, here are my thoughts:

First of all, this issue is not about converting the whole capnp data structure into native Python types, but about "shallowly" converting a struct to a dict, so nested structs are deliberately not considered and most fields don't need any special rendering. In general, I think a full deep conversion cannot and should not be done. For example:

from collections import namedtuple

Object = namedtuple('Object', ['dimension', 'weight'])
Dimension = namedtuple('Dimension', ['x', 'y', 'z'])
o = Object(Dimension(10, 15, 20), 50)
o._asdict()  # the 'dimension' value is still a Dimension namedtuple

Will the nested Dimension be converted? No. But users can always choose to do it themselves. Another example:

from types import MappingProxyType
d = {'nested': {'a': 1}, 'b': 2}
m = MappingProxyType(d)

Will m['nested'] become a MappingProxyType? No. But the user can choose to do it with one more line. Also note that namedtuple._asdict() is indifferent to the type of the field: it can be a cffi pointer or whatever.
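To make the shallow/deep distinction concrete, here is a minimal sketch of the "do it yourself" step for the namedtuple example. `deep_asdict` is a hypothetical helper written for illustration; it is not part of the standard library or of capnpy.

```python
from collections import namedtuple

Object = namedtuple('Object', ['dimension', 'weight'])
Dimension = namedtuple('Dimension', ['x', 'y', 'z'])


def deep_asdict(value):
    """Recursively convert namedtuples to plain dicts (hypothetical helper)."""
    # namedtuple instances are tuples that carry a _fields attribute
    if isinstance(value, tuple) and hasattr(value, '_fields'):
        return {name: deep_asdict(getattr(value, name))
                for name in value._fields}
    return value


o = Object(Dimension(10, 15, 20), 50)
shallow = dict(o._asdict())  # nested Dimension is left untouched
deep = deep_asdict(o)        # nested Dimension becomes a dict too
```

The shallow result keeps whatever object sits in each field, which is why the field type does not matter; the recursive step is an opt-in layer on top.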

Secondly, I don't see how *_extended.py can solve this issue easily. My schema has ~30 fields. Maybe I missed it, but I couldn't find a way to get or iterate over the field names easily. So to write equivalent methods in *_extended.py, I would have to list all the fields manually. This is unacceptable, given that I have already written a .capnp file containing all the relevant information.
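The kind of helper being asked for could look like the sketch below. It assumes the struct class exposes an ordered tuple of field names; the attribute name `_fields` is borrowed from namedtuple and is an assumption here, not a documented capnpy API. A plain Python class (`FakePoint`, hypothetical) stands in for a generated struct.

```python
from collections import OrderedDict


def shallow_asdict(obj, field_names):
    """Map each field name to its attribute value, without recursing."""
    return OrderedDict((name, getattr(obj, name)) for name in field_names)


class FakePoint(object):
    """Stand-in for a generated struct class (hypothetical)."""
    _fields = ('x', 'y')  # assumed: ordered field names taken from the schema

    def __init__(self, x, y):
        self.x = x
        self.y = y


p = FakePoint(3, 4)
d = shallow_asdict(p, FakePoint._fields)
```

With a field list available, the helper is a one-liner; without it, every field has to be spelled out by hand, which is the pain point described above.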

Handling data consisting of (possibly nested) dicts/lists of primitive types should cover at least 90% of the usage of a serialization library (which is quite a conservative figure, I would say). I think an API as succinct as possible should be provided for such usage. In our app, the serialization layer has a fixed API: dict <-> bytes, while the serialization lib can be changed. We have tried quite a few libs, and most can do the job in one or two lines.
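A fixed dict <-> bytes layer with a swappable library behind it might be sketched as follows. The class names `SerializationLayer` and `JsonCodec` are hypothetical, and the standard-library `json` module is used only as a stand-in backend; this is not the app's actual code.

```python
import json


class JsonCodec(object):
    """One possible backend; any object with the same dumps/loads shape works."""

    def dumps(self, data):
        # dict -> bytes
        return json.dumps(data, sort_keys=True).encode('utf-8')

    def loads(self, raw):
        # bytes -> dict
        return json.loads(raw.decode('utf-8'))


class SerializationLayer(object):
    """Fixed dict <-> bytes API; the codec behind it can be swapped."""

    def __init__(self, codec):
        self._codec = codec

    def to_bytes(self, data):
        return self._codec.dumps(data)

    def from_bytes(self, raw):
        return self._codec.loads(raw)


layer = SerializationLayer(JsonCodec())
payload = layer.to_bytes({'name': 'Bob', 'tags': ['a', 'b']})
restored = layer.from_bytes(payload)
```

A backend that produces struct objects rather than dicts needs a struct-to-dict step to fit behind this interface, which is where a built-in conversion method would help.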

kawing-chiu commented 6 years ago

Given the philosophy above, advanced fields that can normally be retrieved as attributes just work. For example, a group:

>>> mod = capnpy.load_schema('example_group')
>>> Point = mod.Point
>>> p = Point(position=(3, 4), color='red')
>>> p._fields
('position', 'color')
>>> p._asdict()
OrderedDict([('position', <Point.position: (x = 3, y = 4)>), ('color', b'red')])

named union:

>>> mod = capnpy.load_schema('example_named_union')
>>> Person = mod.Person
>>> p = Person(name='Bob', job=Person.Job(employer='Capnpy corporation'))
>>> p._fields
('name', 'job')
>>> p._asdict()
OrderedDict([('name', b'Bob'), ('job', <Person.job: (employer = "Capnpy corporation")>)])

There might be some corner cases left to handle, most notably unnamed unions. Even with a few exceptions, I think this feature is still very useful. The user can always choose to further process the data as needed.

kawing-chiu commented 6 years ago

As for unnamed unions, I propose two possible solutions: