noprompt / meander

Tools for transparent data transformation
MIT License

Preserve visual structure when matching unspecified keys in maps and sets #130

timothypratley closed 3 years ago

timothypratley commented 4 years ago

Problem

Maps and sets with unspecified keys are at a syntactic disadvantage.

Example

The current tools for working with unspecified keys are:

(m/rewrite {:a 1, :b 2}
  {& (m/seqable [!k !v] ...)}
  {& ([!k !v] ...)})
(m/rewrite {:a 1, :b 2}
  (m/map-of !k !v)
  (m/map-of !k !v))

Both visually obscure the structure.

Observe that vectors do not suffer this problem: [!k ...]

Maps and sets are disadvantaged because unspecified keys may appear only in the & clause. This restriction makes sense because keys in maps and sets are unordered, so specified keys do not overlap with unspecified keys.

Motivation

In isolation, one may argue that a single map is easy to understand, but in the face of nested maps, the structure quickly becomes obscured. Consider:

{!a {!b {!c !v}}}
(m/map-of !a (m/map-of !b (m/map-of !c !v)))

The latter is linguistic, the former is visual.

Furthermore, if we had correspondence (see https://github.com/noprompt/meander/issues/129), nested map transformations would preserve their structure completely.

Possible Solution

Make memory variables imply unspecified keys when specified as a key in a map or set.

(m/rewrite {:a 1, :b 2}
  {!k !v}
  {!k !v})

{!k !v} implies {& (m/map-of !k !v)}.

Why does this rule make sense?

timothypratley commented 4 years ago

@yuhan0 pointed out (in the Slack channel) that {?a 1} is useful, so really only !a expressions could imply unspecified keys. The tradeoff, then, is the subtlety of behavior that differs by variable type versus the benefit of the visual syntax.

timothypratley commented 4 years ago

Based on Slack discussion, an existing option for picky users like me is:

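;; Aliases used below: m for the pattern operators, s for the strategy combinators.
(require '[meander.epsilon :as m]
         '[meander.strategy.epsilon :as s])

(defn memory-variable? [v]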
  (and (symbol? v)
       (re-matches #"!.+" (name v))))

(def extract-unspecified-kvs
  (s/bottom-up
    (s/rewrite
      {& (m/seqable (m/or [(m/pred memory-variable? !unspecified-ks) !unspecified-vs]
                          !specified-kvs) ...)}
      {& (!specified-kvs ... ['& (`m/map-of !unspecified-ks !unspecified-vs)])}

      ?else
      ?else)))

(defmacro | [& rules]
  `(s/rewrite ~@(map extract-unspecified-kvs rules)))

((| {!k !v}
    {!k (m/app inc !v)})
 {:a 1
  :b 2
  :c 3})

;;=> {:a 2, :b 3, :c 4}

^^ uses a macro to replace {!k !v} with {& (m/map-of !k !v)}

This was offered with big warning signs that I don't fully understand yet :)

timothypratley commented 4 years ago

Another idea floated in Slack: {^... !k !v} and #{^... !k} would be a nice way to be explicit about the behavior without losing the structure (i.e. using the metadata style that sets already use for #{^& ?more}).

timothypratley commented 4 years ago

I like this suggestion from @yuhan0 :

"There is already precedent in ..?n for splitting a symbol up into components to bind ?n"

{&... [!k !v]}
;; desugars intuitively to 
{& (m/seqable [!k !v] ...)}
;; or
{&!k !v} ;; implicitly includes !v in the repeating pattern
;; or 
{&!k &!v} ;; over-specification?

#{ &!x }

My favourite is {&!k !v} because it preserves the structure (I can copy-paste example data and replace values with variables), and &!k says exactly what I want to do: bind the rest of the key-value pairs to memory variables.
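
To make the {&!k !v} proposal concrete, here is a minimal sketch of how such a form might be desugared into the existing {& (m/seqable [!k !v] ...)} syntax. It is plain Clojure working on quoted forms and does not call Meander at all. The helper names (rest-key?, strip-amp, desugar-map, desugar) are hypothetical and not part of Meander, and the sketch assumes at most one &!-prefixed key per map and no existing & entry.

```clojure
(require '[clojure.string :as str]
         '[clojure.walk :as walk])

(defn rest-key?
  "True for symbols such as &!k: an ampersand followed by a memory variable."
  [x]
  (and (symbol? x)
       (str/starts-with? (name x) "&!")))

(defn strip-amp
  "Drops the leading ampersand: &!k becomes !k."
  [sym]
  (symbol (subs (name sym) 1)))

(defn desugar-map
  "Rewrites a map pattern containing exactly one &!k key into the explicit
  rest form. Maps without such a key are returned unchanged.
  Assumption: the map has no existing & entry to merge with."
  [m]
  (let [rest-entries (filter (comp rest-key? key) m)]
    (if (= 1 (count rest-entries))
      (let [[k v] (first rest-entries)]
        (assoc (dissoc m k)
               '& (list 'm/seqable [(strip-amp k) v] '...)))
      m)))

(defn desugar
  "Applies desugar-map to every map in a quoted pattern, innermost first."
  [form]
  (walk/postwalk (fn [x] (if (map? x) (desugar-map x) x)) form))

(desugar '{&!k !v})
;; => {& (m/seqable [!k !v] ...)}

(desugar '{?a 1})
;; => {?a 1}
```

Because only &!-prefixed keys are touched, a pattern such as {?a 1} keeps its current meaning, which addresses the earlier concern about logic variables used as keys.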