Open mathrick opened 3 years ago
Ah, Fill
. The simultaneously more minimalist and more advanced templating mode.
Fill's intended solution to clean that up is to use T
:
target = [{'a': 1, 'b': 1}, {'a': 2, 'b': 2}]
glom(target, [Fill((T['a'], T['b']))]) == [(1, 1), (2, 2)]
>>> True
Fill
is meant to be more precise, so it pairs better with T
. You only need to use Path
if you want to retain the automatic transparent access to both object attributes, dict keys, and list indexes.
So, for context, glom
started out as a pretty opinionated library, before growing the modes layer that allows different opinions to coexist, e.g., Fill
vs Auto
. So I think one design goal of modes is that it should be straightforward to derive a Fill
mode that interprets strings as paths, as you've suggested, above.
There's a short doc on the topic. Quickest way I can think to do it would be to inherit (or compose a new object based on) Fill
and override _fill
to have the string behavior you desire, and call through to super()
for the rest. What do you think?
There's a short doc on the topic. Quickest way I can think to do it would be to inherit (or compose a new object based on) Fill and override _fill to have the string behavior you desire, and call through to super() for the rest. What do you think?
Right, the doc and the changelog is where I read about Fill
at first, and my understanding of "prefers to fill containers instead of iterating / chaining" did not also include changing how strings are interpreted :). I do think it'd be possible to implement a version that works more like what I expected, but I also wanted to discuss whether Fill
's current behaviour is truly the best default.
Whilst T['a']
is a solution (and I did actually mention it in #207), I think treating bare strings as paths, just as in Auto
, would be more useful. The primary need that isn't cleanly addressed in Auto
is "how do I create output values of container types that glom interprets as chained specs?". That's what I expected Fill
to do. But the question of "how do I create a literal string output value, instead of having it interpreted as a path?" is already answered by Val()
, so there's no need to have Fill
do that, especially when it arguably reduces the expressiveness of the container-value creation.
Of course, since Fill
has been released in its current form, its behaviour will need to stay as is for backwards compatibility. But I think it'd be useful to have another stock mode that is more selective in what it changes. It could either go back to being called Template
, or maybe something like Struct
. Would you like me to prepare a PR?
To give some more context on Fill
-- it was built around the use case of filling in configuration data structures
here's the (slightly sanitized) prototypical example
Fill({
'debug': T['debug'],
'xmlsec_binary': saml2.sigver.get_xmlsec_binary([
'/opt/local/bin', '/usr/bin/xmlsec1', '/app/vendor/xmlsec1/bin/']),
'entityid': _Format('{base_url}/metadata/'),
'description': 'IdP',
'service': {
'idp': {
'name': 'IdP',
'endpoints': {
'single_sign_on_service': [
(_Format('{base_url}/sso/post/'), saml2.BINDING_HTTP_POST),
(_Format('{base_url}/sso/redirect/'),
saml2.BINDING_HTTP_REDIRECT),
],
'single_logout_service': [
(_Format('{base_url}/slo/post/'), saml2.BINDING_HTTP_POST),
(_Format('{base_url}/sso/redirect/'),
saml2.BINDING_HTTP_REDIRECT)
]
},
'name_id_format': [
saml2.saml.NAMEID_FORMAT_EMAILADDRESS,
saml2.saml.NAMEID_FORMAT_UNSPECIFIED],
'sign_response': False,
'sign_assertion': True,
"signing_algorithm": SIG_RSA_SHA256,
"digest_algorithm": SIG_RSA_SHA256,
},
},
'metadata': {
'remote': lambda t: [ # not safe to do DB @ import so defer till format
{'url': entity} for entity in db_results()]
},
# Signing
'key_file': _Format('{cert_dir}/private.key'),
'cert_file': _Format('{cert_dir}/public.cert'),
# Encryption
'encryption_keypairs': [{
'key_file': _Format('{cert_dir}/private.key'),
'cert_file': _Format('{cert_dir}/public.cert'),
}],
'valid_for': 365 * 24,
})
in addition to strings, other non-data-structure objects like int
and bool
which would be an error in Auto
mode will pass through to the result
just to see how it looks, here's the example with Val
wrapped around all the plain values
Fill({
'debug': T['debug'],
# xml_sec needs a buildpack on HEROKU to work
'xmlsec_binary': Val(saml2.sigver.get_xmlsec_binary([
'/opt/local/bin', '/usr/bin/xmlsec1', '/app/vendor/xmlsec1/bin/'])),
'entityid': _Format('{base_url}/metadata/'),
'description': Val('EBill/Tableau IdP'),
'service': {
'idp': {
'name': Val('Ebill IdP'),
'endpoints': {
'single_sign_on_service': [
(_Format('{base_url}/sso/post/'), Val(saml2.BINDING_HTTP_POST)),
(_Format('{base_url}/sso/redirect/'),
Val(saml2.BINDING_HTTP_REDIRECT)),
],
'single_logout_service': [
(_Format('{base_url}/slo/post/'), Val(saml2.BINDING_HTTP_POST)),
(_Format('{base_url}/sso/redirect/'),
Val(saml2.BINDING_HTTP_REDIRECT))
]
},
'name_id_format': [
Val(saml2.saml.NAMEID_FORMAT_EMAILADDRESS),
Val(saml2.saml.NAMEID_FORMAT_UNSPECIFIED)],
'sign_response': Val(False),
'sign_assertion': Val(True),
"signing_algorithm": Val(SIG_RSA_SHA256),
"digest_algorithm": Val(SIG_RSA_SHA256),
},
},
'metadata': {
'remote': lambda t: [ # not safe to do DB @ import so defer till format
{'url':entity} for entity in db_results()]
},
# Signing
'key_file': _Format('{cert_dir}/private.key'),
'cert_file': _Format('{cert_dir}/public.cert'),
# Encryption
'encryption_keypairs': [{
'key_file': _Format('{cert_dir}/private.key'),
'cert_file': _Format('{cert_dir}/public.cert'),
}],
'valid_for': Val(365 * 24),
})
this is totally a matter of subjective perception, but to me when I flip the screen back and forth, that extra level of (
... )
around long-ish values like saml2.saml.NAMEID_FORMAT_EMAILADDRESS
really slows down comprehending the structure
I see. Looking at it in a long structure that's heavily nested and mostly literal, I can see why it was done this way.
One thing that occurs is maybe a way to make this behavior transparent and overridable.
Maybe we could have a "paint-by-number" Mode where you simply override the types that you want to. Fill
could be a good candidate for the basis of that template -- "for low level built in python data structures, recurse; for others pass through".
I'm imagining an API something like:
Fill(overrides={str: Path}, ...)
There's this idea hanging out that, EVENTUALLY, we will need to switch to "compile" glom-specs. The compilation (glompilation heh) would convert all the specs to more concrete primitives. This would improve performance by getting rid of a layer of dispatch logic, and eventually we could push the primitive callbacks down into Cython. Once we get there, the idea would be for glom to dogfood / spiral itself up by using gloms to transform specs.
In that general direction, you could imagine a transformer (glom-macro or glomacro if you will):
def str2path(spec)
return glom(spec,
Ref('spec',
Match(Switch({
dict: {T: Ref('spec')},
list: [Ref('spec')],
str: Invoke(Path).specs(T)}))))
some challenges with this approach as-is:
Maybe there should be a "mirror image" of Val()
-- whereas Val
says "treat this as a plain value, do not glom" this other mechanism would say "run this through an extra layer of glomming"
class Macro:
def __init__(self, macro_spec, spec):
self.macro_spec, self.spec = macro_spec, spec
def glomit(self, target, scope):
spec = scope[glom](self.macro_spec, self.spec, scope)
return scope[glom](spec, target, scope)
Hmm.... it gets super ugly when I try to use it thought
Fill(Macro(Ref('macro', Match(Switch({str: Invoice(Path).specs(T), tuple: [Ref('macro')])))
expanding out the tuple and making it work properly is way too complicated -- tuple-to-list, apply macro, list-to-tuple
This is related to #207. Basically, I don't think
Fill
does what I expected it to do after reading the docstring, and I also don't think it's the most useful mode in its current incarnation. Specifically, the mode it introduces is too literal. In the example in #207, I'd expect the following to be the case:In reality though, I have to mark 'a' and 'b' with
Path
explicitly, which is not at all obvious from the description of Fill: "A variant of the default transformation behavior; preferring to “fill” containers instead of iterating, chaining, etc.". There are no containers here, so I'd expect'a'
and'b'
to be interpreted as paths still, whilst "structural" specs get intepreted as templates for containers to create. But it seems that what it really means is "everything exceptPath
andSpec
becomes a literal". This makes the resulting spec so verbose and hard to read for what it does, I might as well not bother withFill
: