Closed ssato closed 7 years ago
Partially improve by the commit b89757d.
This may damage the flexibility of the config parsing. Here is my sample use-case:
I want multiple configs, including YAML, XML, INI configs to be flattened to a unified data structure, with the keys & values (and sometimes lists). The XML parser returns those @text
, @attrs
, @children
keys instead of the desired values. Consider this config:
<config>
<layer mode="somemode">kvm</layer>
<env mode="static">
<server ip="1.2.3.4"/>
<server ip="2.3.4.5"/>
</env>
</config>
The resulting structure cannot be flattened by merge_dicts=True
option, because there are both texts, attrs & children, and the existing implementation explicitly avoid flattening in this case.
So I made my own dict-like object to flatten all keys (in __setitem__
), so that the attrs & the children are merged as equal. The text & attrs case is, however, undecided how to handle, but probably the text part will be mapped to some predefined key. The desired result is:
{
'config': {
'layer': {
'mode': 'somemode',
'name': 'kvm', # <-- text content
},
'env': {
'mode': 'static',
'pserver': [
...
],
},
}
}
How am I supposed to handle that if to_container
is removed? Should I just post-process the results, assuming they have came from different sources (YAML vs XML), just by looking at the fields names.
E.g., if I see @attrs
, then flatten them to the parent dict. In that case, the precedence of multiple config files can be violated (e.g., a low-priority file has attr <env mode=static></env>
set, while the high-priority overrides it with a tag <env><mode>static</mode></env>
— but we notice the @attrs
and apply a low-priority value on top of the high priority value).
The self-made container could solve this problem by applying the changes on parse time of each single file.
PS: Also, to_container
is currently broken and ignored, so it does not work. I've came here to report this, but I see that the option is going to be removed at all.
Thanks a lot for your comments!
Actually, what you pointed at may be related to what I want to resolve in this issue, I guess.
I guess that especially the last one (ac_container/ac_dict) might help resolving your issue.
The XML parser returns those @text, @attrs, @children keys instead of the desired values.
This is intended because I think there is no way to implement mode does flatten them regardless of these types in XML backend generally. For example, I don't think there is a way to flatten the XML snippet in a consistent manner such like this:
<layer mode="somemode">kvm <options nested='true'/></layer>
So, IMHO, you have to modify/convert the nested dict loaded from configuration files anyway. Maybe query with ac_query (JMESPath expression) might help doing such work.
This is why I started to invent my own to_container
class, which flattens the data according to my specific (non-generalised) case. This container delegates all @attr
& @children
assignments to the parent container (taking into account that you create 2 distinct instances in the anyconfig code). The conflicting @text
case is ignored for now (solved with merge_attrs=True
for my case). Something like this:
import collections
class XMLContainer(collections.MutableMapping):
def __init__(self, *args, **kwargs):
super(XMLContainer, self).__init__()
self._data = {}
self._parent = None
def __setitem__(self, key, val):
if key in {'@text', '@attrs', '@children'} and isinstance(val, XMLContainer):
val._parent = self
# NB: do not set it locally. assume it will delegate all sets to us while it exists.
elif self._parent is not None:
self._parent[key] = val
else:
self._data[key] = val
def __delitem__(self, key):
if key in {'@text', '@attrs', '@children'}:
pass
else:
del self._data[key]
def __getitem__(self, key):
if key in {'@text', '@attrs', '@children'}:
return self
else:
return self._data[key]
def __iter__(self):
return iter(self._data)
def __len__(self):
return len(self._data)
@nolar I've just released 0.9.0 which removes dependency to m9dicts along with maybe-correct implementation to allow passing user-defined mapping class to backend parsers directly to make results on load data. I guess that now you can accomplish what you want by passing XMLContainer as ac_dict keyword option to load APIs.
I'll close this issue but if you have any problem related, could you please open another issue for that?
@nolar BTW, did you try keyword options available in XML backend such like ac_parse_value=True, merge_attrs=True, etc? (Maybe I should implement an option to eliminate '@children' as much as possible also...)
@ssato It made it a bit easier to work with, but did not solve my problems. I therefore created a new separate issue for my problems: #73
To reduce strong dependency to m9dicts, remove to_container arguments and make it works only with standard library as much as possible.