m2ym / optima

Optimized Pattern Matching Library for Common Lisp
271 stars 19 forks source link

Introducing binding-pattern #109

Closed guicho271828 closed 9 years ago

guicho271828 commented 9 years ago

introduction

binding-pattern (<> pattern arg value) works just as let. the value to be matched is put into arg. value evaluates and returns some value, then the return value is matched against pattern.

Consider you want to write a query that matches the type specifier function, and destruct it into the types of arguments and the types of return values.

(defpattern function-type (argument-types return-type)
  `(list 'function ,argument-types ,return-type))

(ematch '(function () fixnum)
   ((function-type args values) values)))
;; --> 'fixnum

This is not sufficient, because Common lisp allows "dropping" the type specifier arguments. e.g. function, (function), (function ()), (function () (values &rest *)) are all valid type specifiers.

(ematch 'function
   ((function-type args values) values)))
; --> error

Ideally, above code should work as below, since the default type for function return value is *.

(match 'function
   ((function-type args values) values)))
; -->  '*

background

One currently approved methodology to achieve this effect is to use the optima's internal functions and structures. To implement such a pattern, we have to implement all these structure and methods below. THIS IS AWFUL.

(defstruct (function-type-pattern
            (:include constructor-pattern)
            (:constructor make-function-type-pattern
                          (arguments value &aux (subpatterns (list arguments value)))))
  arguments value)

(defmethod constructor-pattern-destructor-sharable-p
    ((x function-type-pattern) (y function-type-pattern))
  t)

(defmethod constructor-pattern-make-destructor
    ((pattern function-type-pattern) matched)
  (with-slots (arguments value) pattern
     (make-destructor
      :bindings <<too complecated WTF>>
      :predicate-form <<too complecated WTF>>
      :accessor-forms <<too complecated WTF>>)))

(defmethod unparse-pattern ((pattern function-type-pattern))
  (with-slots (arguments value) pattern
     `(function-type ,arguments ,value)))

(defmethod parse-constructor-pattern ((name (eql 'function-type)) &rest args)
  (match args
    ((list arguments value)
     (make-function-type-pattern (mapcar #'parse-pattern arguments) (parse-pattern value)))
    (otherwise
     (error "Bad function-type pattern: ~S" (list* name args)))))

THIS IS AWFUL.

binding-pattern

Now we introduce a binding pattern. It can bind arbitrary value obtained from the matched element, then assign it against the subpattern.

(defpattern function-type (argument-types return-type)
  (let ((x gensym))
  `(or (and 'function (<> ,argument-types ,x '(&rest *)) (<> ,return-type ,x '*))
         ; x is not used, since it binds a constant
       (and (list 'function ,argument-types) (<> ,return-type ,x '*))
       (and (list 'function ,argument-types ,return-type)))

(match 'function
   ((function-type args values) values)))
; --> '*

This is a refinement of the previous posts about accessor-pattern in #90. for example, cons pattern, currently implemented with native constructure-pattern, can be implemented as below:

(defpattern cons (a b)
  (let ((x gensym))
    `(and (type cons)
          (<> ,a ,x (car ,x))
          (<> ,b ,x (cdr ,x)))))

Related works

Providing accessors and use the feature in optima that parse the structure pattern prefix like (foo- bar) and (defun foo-bar ()...) and (defun foo-p () ...) is another way around, but binding-pattern supersedes this idea. By separating foo-p and foo-bar, it only results in redundunt and inefficient code. Consider the following example:

(defun function-type-p (type)
  (or (eq 'function type)
      (and (cons type)
           (eq (first type) 'function)
           ...)))

(defun function-type-arguments (type)
  (cond
    ((eq 'function type) '*)
    ((and (cons type)
          (eq (first type) 'function)
           ...))))

It runs the same destructuing twice and ineffficient. The approache should be in reverse --- Actually, it should be the pattern that defines an accessor. For example, using a macro like defpattern-with-accessors, we can write:

(defpattern-with-accessors function-type (args-types return-type)
  ...)

;; -->

(PROGN
  (DEFPATTERN FUNCTION-TYPE
      (&OPTIONAL (ARGS-TYPES '_) (RETURN-TYPE '_))
    `(OR (AND 'FUNCTION (<> ,ARGS-TYPES _ '(&REST *)) (<> ,,RETURN-TYPE _ '*))
         (OR (AND (LIST 'FUNCTION ,ARGS-TYPES ,RETURN-TYPE)) (AND (LIST 'FUNCTION ,ARGS-TYPES) (<> ,RETURN-TYPE _ '*))
             (AND (LIST 'FUNCTION) (<> ,ARGS-TYPES _ '(&REST *)) (<> ,RETURN-TYPE _ '*)))))
 (DEFUN FUNCTION-TYPE-ARGS-TYPES (#:OBJ1737)
   (MATCH #:OBJ1737
     ((FUNCTION-TYPE ARGS-TYPES _) ARGS-TYPES)))
 (DEFUN FUNCTION-TYPE-RETURN-TYPE (#:OBJ1737)
   (MATCH #:OBJ1737
     ((FUNCTION-TYPE _ RETURN-TYPE) RETURN-TYPE)))
 (DEFUN FUNCTION-TYPE-P (#:OBJ1737)
   (MATCH #:OBJ1737
     ((FUNCTION-TYPE) T)
     (_ NIL))))

https://github.com/Bike/compiler-macro provides sufficiently smart type extraction, and I am refactoring their functions with optima. However, given that lots of custom patterns should be written using optima's internal structure, for each of type specifiers in common lisp, like array, (array fixnum (*)) etc, it is almost impossible to achieve this without binding-type.

m2ym commented 9 years ago

I think this can be achieved by generalizing guard pattern. One possible generalization could be extending guard pattern (guard subpattern test-form) can take another optional subpattern to match the result of TEST-FORM, like (guard subpattern1 form subpattern2). So (guard subpattern test-form) is equivalent to (guard subpattern test-form (not null)).

guicho271828 commented 9 years ago

Sounds interesting, I interpreted your idea as something like this:

(match 'function
   ((guard notused '* x) x))) ; --> '*

but I'm afraid that it would be not as intuitive as the original guard pattern. test-form of guard pattern is supposed to return a boolean.

Also, it does not allow more than 2 subpatterns without consing. Suppose mathing against array type specifier:

(match 'array
   ((guard notused (list '* '*) (list element-type dimensions))
;;;                 ^^^^^^^^^ this is consing!
    (vector element-type dimensions))) ;; -> #(* *)

;;; we can also do this hack below, but it doesnt look good
(match 'array
   ((and (guard notused '* element-type)
             (guard notused '* dimensions))
    (vector element-type dimensions))) ;; -> #(* *)

Instead, how about this alternative: (guard subpattern test-form {generator-form subpattern}*)

(match 'function
   ((guard x (eq x 'function) '* y) (vector x y)))) ; --> #(function *)

;; cons-pattern impl
(match '(1 2)
   ((guard x (consp x)
                  (car x) car
                  (cdr x) cdr) (vector car cdr)))) ; --> #(1 2)

;; well, with <>-pattern, above guard pattern is equivalent to
(match '(1 2)
   ((and (guard x (consp x))
            (<> car x (car x))
            (<> cdr x (cdr x))) (vector car cdr)))) ; --> #(1 2)
guicho271828 commented 9 years ago
(match 'array
   ((guard 'array t '* element-type '* dimensions)
    (vector element-type dimensions))) ;; -> #(* *)
guicho271828 commented 9 years ago

BTW, good to know that the 1st argument of guard pattern is a subpattern, not merely a variable. You can write the code like below even now on?

(guard (cons a b) (= (* a 2) b))