Open kawing-chiu opened 6 years ago
What happens if the struct contains an unnamed union or it is a nested struct?
My colleagues do sometimes find to_dict useful for simple plain structs. Currently we add the methods via _extended.py.
I think that it's not so easy to design something which has reasonable behavior w.r.t. all the possible combinations of capnproto features. Some random thoughts:
- which key?
- Void fields? Do we include them or not?
- AnyPointer: how to deal with it?
- Text, do we render it as None or ""?
- Text values: same as above, but in the case the field has a default value

I am sure that depending on the exact use case, you would need slightly different answers to the questions above. So, I am tempted to say that this feature should not be part of the capnpy core, at least for now.
It would be nice to have it as an external library or plugin: then, as @colinfang says, you can easily integrate it inside your schema using *_extended.py.
Well...I wrote this without noticing your replies...
I haven't really used these advanced features of capnp yet. Will have a look tomorrow~
I've investigated the issue a bit more, here are my thoughts:
First of all, this issue is not about converting the whole capnp data structure into native Python types, but about "shallowly" converting it to a dict, so nested structs are certainly not considered and most fields don't need to be rendered. Generally, I think a full conversion of that kind cannot and should not be done. For example:
from collections import namedtuple
Object = namedtuple('Object', ['dimension', 'weight'])
Dimension = namedtuple('Dimension', ['x', 'y', 'z'])
o = Object(Dimension(10, 15, 20), 50)
o._asdict()
Will the nested Dimension be converted? No. But the user can always choose to do it themselves. Another example:
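To illustrate the "do it yourself" step for the namedtuple case, here is a minimal sketch. The helper name asdict_deep is hypothetical (it is not part of the standard library or capnpy); it only shows that a deep conversion can be layered on top of a shallow _asdict() when the user wants it.

```python
from collections import namedtuple

Object = namedtuple('Object', ['dimension', 'weight'])
Dimension = namedtuple('Dimension', ['x', 'y', 'z'])


def asdict_deep(value):
    """Hypothetical helper: recursively turn namedtuples into plain dicts."""
    # namedtuple instances are tuples that carry a _fields attribute
    if isinstance(value, tuple) and hasattr(value, '_fields'):
        return {name: asdict_deep(getattr(value, name))
                for name in value._fields}
    # anything else is returned untouched, whatever its type is
    return value


o = Object(Dimension(10, 15, 20), 50)
shallow = o._asdict()    # the nested Dimension stays a namedtuple
deep = asdict_deep(o)    # the nested Dimension becomes a dict too
```

The shallow version stays cheap and type-agnostic; the deep version is an opt-in step on top of it.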
from types import MappingProxyType
d = {'nested': {'a': 1}, 'b': 2}
m = MappingProxyType(d)
Will m['nested'] become a MappingProxyType? No. But the user can choose to do it with one more line. Also note that namedtuple._asdict() is indifferent to the type of each field: it can be a cffi pointer or whatever.
Secondly, I don't see how *_extended.py can solve this issue easily. My schema has ~30 fields. Maybe I missed it, but I couldn't find a way to get or iterate over the field names easily. So to write equivalent methods in *_extended.py, I have to list all the fields manually. This is unacceptable, given that I have already written a .capnp file containing all the relevant information.
Handling data that consists of (possibly nested) dicts/lists of primitive types should cover at least 90% of the usage of a serialization library (which is quite a conservative figure, I would say). I think an API as succinct as possible should be provided for such usage. In our app, the serialization layer has a fixed API: dict <-> bytes, while the serialization lib can be changed. We have tried quite a few libs, and most can do the job in one or two lines.
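As a rough sketch of what I mean by a fixed dict <-> bytes layer with a swappable library underneath: the class names Serializer and JsonBackend below are made up for illustration, and json is only a stand-in for whichever serialization lib is plugged in. This is not our actual code.

```python
import json


class JsonBackend:
    """Stand-in backend; json is used here only for illustration."""

    def dumps(self, data: dict) -> bytes:
        return json.dumps(data, sort_keys=True).encode('utf-8')

    def loads(self, raw: bytes) -> dict:
        return json.loads(raw.decode('utf-8'))


class Serializer:
    """Fixed dict <-> bytes API; the backend can be swapped freely."""

    def __init__(self, backend):
        self._backend = backend

    def to_bytes(self, data: dict) -> bytes:
        return self._backend.dumps(data)

    def from_bytes(self, raw: bytes) -> dict:
        return self._backend.loads(raw)


serializer = Serializer(JsonBackend())
payload = {'name': 'Bob', 'tags': ['a', 'b'], 'nested': {'x': 1}}
raw = serializer.to_bytes(payload)
restored = serializer.from_bytes(raw)
```

A backend for a schema-based library only fits this shape if it can go from a struct to a dict in a line or two, which is the point of this issue.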
Given the philosophy above, advanced fields that can normally be retrieved as attributes just work. For example, group:
>>> mod = capnpy.load_schema('example_group')
>>> Point = mod.Point
>>> p = Point(position=(3, 4), color='red')
>>> p._fields
('position', 'color')
>>> p._asdict()
OrderedDict([('position', <Point.position: (x = 3, y = 4)>), ('color', b'red')])
named union:
>>> mod = capnpy.load_schema('example_named_union')
>>> Person = mod.Person
>>> p = Person(name='Bob', job=Person.Job(employer='Capnpy corporation'))
>>> p._fields
('name', 'job')
>>> p._asdict()
OrderedDict([('name', b'Bob'), ('job', <Person.job: (employer = "Capnpy corporation")>)])
There might be some corner cases left to be handled, most notably unnamed unions. Even with a few exceptions, I think this feature is still very useful. The user can always choose to further process the data as needed.
As for unnamed unions, I propose two possible solutions:
1. Omit unnamed union fields in _fields and _asdict(). This is the simplest one.
2. Include the currently set field of the union. This is arguably the more reasonable one, and closer to the definition of 'union'. For example:
@0x8ced518a09aa7ce3;
struct Shape {
  area @0 :Float64;
  union {
    circle @1 :Float64;  # radius
    square @2 :Float64;  # width
  }
}
>>> s = Shape(area=20, circle=5)
>>> s._fields
# ('area', 'circle')
>>> s._asdict()
# OrderedDict([('area', 20.0), ('circle', 5.0)])
Note that no matter which one is chosen, the user can always choose to process it further.
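To make the two options concrete, here is a plain-Python stand-in for Shape. It is not generated by capnpy and does not use any capnpy API. The attributes _plain_fields, _union_fields and _active and the method _asdict_without_union are hypothetical names chosen only for this sketch.

```python
from collections import OrderedDict


class Shape:
    """Plain-Python stand-in for the Shape struct; not capnpy-generated."""

    _plain_fields = ('area',)              # fields outside the union
    _union_fields = ('circle', 'square')   # members of the unnamed union

    def __init__(self, area, **union_member):
        # exactly one union member may be set at a time
        if (len(union_member) != 1
                or next(iter(union_member)) not in self._union_fields):
            raise ValueError('exactly one of %r must be set'
                             % (self._union_fields,))
        self.area = float(area)
        self._active, value = next(iter(union_member.items()))
        setattr(self, self._active, float(value))

    @property
    def _fields(self):
        # option 2: plain fields plus the currently set union member
        return self._plain_fields + (self._active,)

    def _asdict(self):
        return OrderedDict((name, getattr(self, name))
                           for name in self._fields)

    def _asdict_without_union(self):
        # option 1: leave the unnamed union out entirely
        return OrderedDict((name, getattr(self, name))
                           for name in self._plain_fields)


s = Shape(area=20, circle=5)
```

With option 2, _fields depends on the instance rather than on the type, which is the main difference from namedtuple; option 1 keeps _fields static at the cost of dropping the union.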
Namedtuple has the method _asdict(), and pycapnp also has to_dict(). I think we should also add something equivalent to capnpy, which directly converts a struct into an OrderedDict.