swaroopch / edn_format

EDN reader and writer implementation in Python, using PLY (lex, yacc)
https://swaroopch.com/2012/12/24/edn-format-python/
Other
131 stars 31 forks source link

Correctly handle `#_` #43

Closed bfontaine closed 6 years ago

bfontaine commented 6 years ago

The current #_ handling only works on simple expressions such as #_foo. It doesn’t work if there’s a space between #_ and the discarded element, nor with more complex expressions such as:

#_ [1 2 3]
[1 2 3 #_ #_ 4 5]
{:a #_ #inst "2018" :b}

This PR fixes that. It parses #_ as a token on its own then use specific rules to discard expressions.

Note: I added a lot of tests but I may have forgotten some weird edge-cases.

I also changed the code to reject tags that don’t start with a letter such as #1abc "foo", because they aren’t valid EDN.

This fixes #4.

bfontaine commented 6 years ago

Note this doesn’t work with top-level #_ usage:

;; fails
foo #_ bar

;; works
[foo #_ bar]

We may want to fix this later.

swaroopch commented 6 years ago

@bfontaine Note this doesn’t work with top-level #_ usage - is this valid syntax though?

bfontaine commented 6 years ago

Note this doesn’t work with top-level #_ usage - is this valid syntax though?

The spec is not clear whether we can or not, but clojure.edn accepts it. It also accepts empty files even if we don’t know if it’s valid EDN.

There are three basic cases I thought about:

  1. #_ value → raise an EOF error
  2. #_ value1 value2 → parse as value2
  3. value1 #_ value2 → parse as value1

I haven’t tested it yet but I guess changing the grammar like this should work:

root_expression = expression
                | discarded_expression
                | discarded_expressions expression
                | expression discarded_expressions
bfontaine commented 6 years ago

You were right to be suspicious: after looking at Hy’s implementation, I rewrote the code and it’s now a lot more concise. 🎉

swaroopch commented 6 years ago

Yay! Thanks @bfontaine