swaroopch / edn_format

EDN reader and writer implementation in Python, using PLY (lex, yacc)
https://swaroopch.com/2012/12/24/edn-format-python/

Load edn as a Python dict with string keys #76

Open piotr-yuxuan opened 4 years ago

piotr-yuxuan commented 4 years ago

Would you consider a PR to load an edn form as a standard Python dict with string keys?

bfontaine commented 4 years ago

I’ll let @swaroopch weigh in, but personally I’d be interested in an option to edn_format.loads, such as string_keys=True or string_keyword_keys=True, that would parse all keyword keys as strings.

piotr-yuxuan commented 4 years ago

Indeed yes that sounds reasonable.

swaroopch commented 4 years ago

Agree with @bfontaine ! Thanks @piotr-yuxuan :-)

piotr-yuxuan commented 4 years ago

I see two issues here:

These two points are very opinionated, and perhaps you will disagree with me. I would prefer to know the opinion of the maintainers about them before refactoring everything and making breaking changes ;-) If we were to move forward, I would see four possible different ways to parse an edn keyword, and two possible ways to parse a symbol:

Methinks a naive implementation for Keyword, which stores the parsing strategy as lexer state, would be:

@ply.lex.TOKEN(KEYWORD)
def t_KEYWORD(t):
    # t.value is the raw token text, including the leading colon.
    if t.lexer.keyword_parsing == KeywordParsing.keyword_object:
        t.value = Keyword(t.value[1:])
    elif t.lexer.keyword_parsing == KeywordParsing.name_string:
        # Note: Keyword.name currently includes the namespace, e.g. 'a/b'.
        t.value = Keyword(t.value[1:]).name
    elif t.lexer.keyword_parsing == KeywordParsing.ns_slash_name_string:
        # Strip only the leading colon.
        t.value = t.value[1:]
    elif t.lexer.keyword_parsing == KeywordParsing.leading_colon_string:
        # Keep the raw text as-is.
        pass
    return t

…which leads to the remarks above.
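
To make the keyword strategies concrete, here is a minimal standalone sketch of what each string-producing strategy would yield for a raw token such as :ns/name. The KeywordParsing enum is never defined in this thread, so its definition below is an assumption, and so is the helper name keyword_as_string. It also assumes that name_string is meant to give the bare name without the namespace. The snippet uses neither PLY nor edn_format.

```python
from enum import Enum, auto


class KeywordParsing(Enum):
    # Hypothetical enum: the thread names these members but never defines them.
    keyword_object = auto()
    name_string = auto()
    ns_slash_name_string = auto()
    leading_colon_string = auto()


def keyword_as_string(token_text, strategy):
    """Render a raw keyword token such as ':ns/name' under a string strategy."""
    if not token_text.startswith(":"):
        raise ValueError(f"not a keyword token: {token_text!r}")
    body = token_text[1:]  # drop the leading colon
    if strategy is KeywordParsing.leading_colon_string:
        return token_text
    if strategy is KeywordParsing.ns_slash_name_string:
        return body
    if strategy is KeywordParsing.name_string:
        # Assumption: "name" means the part after the namespace separator.
        _, sep, name = body.partition("/")
        return name if sep else body
    # keyword_object needs the library's Keyword class, which this sketch avoids.
    raise ValueError(f"{strategy.name} does not produce a string")


# Show the three string strategies side by side (skipping keyword_object).
for strategy in list(KeywordParsing)[1:]:
    print(strategy.name, "->", keyword_as_string(":ns/name", strategy))
```

Keeping the strategy as a value passed to one function, rather than as lexer state, makes it easy to compare the outputs before deciding which ones are worth supporting.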

piotr-yuxuan commented 4 years ago

For the impatient, here is how I would iterate through what is returned by edn_format:

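import edn_format

def keyword_name(keyword):
    # Keyword.name may include the namespace ("ns/name"); keep only the part after the "/".
    return keyword.name.split("/", 1)[1] if "/" in keyword.name else keyword.name

def stringify_keyword(x):
    # Turn keywords into strings and leave every other value untouched.
    return keyword_name(x) if isinstance(x, edn_format.Keyword) else x

def process_edn(x):
    # Recursively convert maps and vectors into plain dicts and lists.
    if isinstance(x, edn_format.ImmutableDict):
        return {stringify_keyword(k): process_edn(v) for k, v in x.items()}
    elif isinstance(x, edn_format.ImmutableList):
        return [process_edn(v) for v in x]
    else:
        return stringify_keyword(x)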

Obviously keyword_name can be swapped for any other callable.
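
As a sketch of that last remark, the conversion callable can be passed in as a parameter instead of being hard-coded. To keep the snippet runnable without the library, Keyword below is a small stand-in for edn_format.Keyword (assumed only to expose a .name that may contain ns/name, as noted in this thread), and plain dict and list stand in for ImmutableDict and ImmutableList. The names bare_name, with_colon and the keyword_to_str parameter are illustrative, not part of edn_format.

```python
from collections.abc import Mapping
from dataclasses import dataclass


@dataclass(frozen=True)
class Keyword:
    # Stand-in for edn_format.Keyword; .name may hold "ns/name".
    name: str


def bare_name(keyword):
    """Drop the namespace: 'ns/name' -> 'name'."""
    return keyword.name.split("/", 1)[-1]


def with_colon(keyword):
    """Keep the namespace and restore the leading colon."""
    return ":" + keyword.name


def process_edn(x, keyword_to_str=bare_name):
    """Recursively replace keywords using the given callable."""
    def convert(value):
        return keyword_to_str(value) if isinstance(value, Keyword) else value

    if isinstance(x, Mapping):  # with the library: edn_format.ImmutableDict
        return {convert(k): process_edn(v, keyword_to_str) for k, v in x.items()}
    if isinstance(x, list):  # with the library: edn_format.ImmutableList
        return [process_edn(v, keyword_to_str) for v in x]
    return convert(x)


data = {
    Keyword("user/name"): "Ada",
    Keyword("tags"): [Keyword("a"), {Keyword("x/y"): 1}],
}
print(process_edn(data))
print(process_edn(data, keyword_to_str=with_colon))
```

Note that bare_name can make two keys collide (for example :a/id and :b/id both become "id"), which is one reason the choice of strategy is worth leaving to the caller.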

bfontaine commented 4 years ago

Hello, sorry for the late response.

That’s interesting; I was thinking of post-processing the result of the parsing like you did in your second comment; not of doing that during the lexing/parsing.

Currently Keyword('a/b').name => 'a/b'

Yes, it has bitten me before. I don’t know the reason behind this. Changing this would be a breaking change, though.

The way namespaced maps are done in the parser, and not in the lexer, further complicates any attempt to stringify keys.

I don’t understand how you would do that in the lexer; its job is to tokenize the input, not to make sense of those tokens. The lexer doesn’t know what a map looks like: in #:foo {:a 1}, the tokens #:foo, {, :a, 1, and } are all separate.
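
To illustrate the point about tokens, here is a toy tokenizer written with Python's re module. It is not edn_format's PLY lexer: the token names and regular expressions are simplified assumptions that only cover this one example. The output is a flat list, and nothing in it says that :a is a key inside a namespaced map; that knowledge only exists once a parser has grouped the tokens.

```python
import re

# Simplified, hypothetical token definitions (not the ones used by edn_format).
TOKEN_RE = re.compile(
    r"""
      (?P<NS_MAP_TAG>\#:[A-Za-z][\w.\-]*)
    | (?P<MAP_START>\{)
    | (?P<MAP_END>\})
    | (?P<KEYWORD>:[A-Za-z][\w.\-/]*)
    | (?P<INTEGER>-?\d+)
    | (?P<WS>[\s,]+)
    """,
    re.VERBOSE,
)


def tokenize(text):
    """Split text into (token_type, token_text) pairs, skipping whitespace."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character at {pos}: {text[pos]!r}")
        if match.lastgroup != "WS":  # edn treats commas as whitespace
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


for token in tokenize("#:foo {:a 1}"):
    print(token)
```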

mohkale commented 4 years ago

Just thought it was worth mentioning that you can create a tag with @piotr-yuxuan's solution, which you can add at the top of your edn file if you don't want to call process_edn explicitly after parsing. Something like:

edn.add_tag('python', process_edn)

and then you can write:

#python
{:tags
 [{:name "foo"},
  {:name "bar"} ]}

tzzh commented 3 years ago

Starting from @piotr-yuxuan's suggestion I used

def edn_to_map(x):
    if isinstance(x, edn_format.ImmutableDict):
        return {edn_to_map(k): edn_to_map(v) for k, v in x.items()}
    elif isinstance(x, edn_format.ImmutableList):
        return [edn_to_map(v) for v in x]
    elif isinstance(x, edn_format.Keyword):
        return x.name
    else:
        return x

It's incomplete, but it seems OK so far in my case (parsing simple edn files into Python dicts without keywords), so I'm posting it here in case it's helpful for someone else.
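
One gap worth knowing about in the function above: it converts map keys recursively, so if a map key is itself a vector (an ImmutableList), it comes back as a Python list, which cannot be used as a dict key. The sketch below shows one possible way around that, turning collection keys into tuples or frozensets, and it also walks sets and other sequences. It uses a stand-in Keyword class and plain Python containers so that it runs without the library; to_plain and its as_key flag are made-up names for illustration.

```python
from collections.abc import Mapping, Sequence, Set
from dataclasses import dataclass


@dataclass(frozen=True)
class Keyword:
    # Stand-in for edn_format.Keyword.
    name: str


def to_plain(x, as_key=False):
    """Convert keywords to strings; keep results hashable when as_key is True."""
    if isinstance(x, Keyword):
        return x.name
    if isinstance(x, (str, bytes)):
        return x  # strings are Sequences too, so stop before the generic case
    if isinstance(x, Mapping):
        items = {to_plain(k, True): to_plain(v, as_key) for k, v in x.items()}
        return frozenset(items.items()) if as_key else items
    if isinstance(x, Set):
        # Set members must always be hashable.
        return frozenset(to_plain(v, True) for v in x)
    if isinstance(x, Sequence):
        converted = [to_plain(v, as_key) for v in x]
        return tuple(converted) if as_key else converted
    return x


# A tuple stands in for a vector used as a map key.
data = {
    (Keyword("a"), Keyword("b")): [Keyword("c"), {Keyword("d")}],
    Keyword("k"): "v",
}
result = to_plain(data)
print(result)
```

Whether tuples are the right representation for collection keys depends on how the resulting dict is used downstream, so treat this as a starting point.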