metosin / malli

High-performance data-driven data specification library for Clojure/Script.
Eclipse Public License 2.0
1.44k stars 206 forks source link

Higher-order function schemas #770

Open julienfantin opened 1 year ago

julienfantin commented 1 year ago

Judging by the repl transcript below, higher-order function schemas are not currently supported.

Is it feasible in the current schema-function/instrumentation design?

clj꞉user꞉> 
(defn adder
  {:malli/schema
   [:=>
    [:cat :int]
    [:=> [:cat :int] :int]]}
  [x] 
  (partial + x))
#'user/adder
clj꞉user꞉> 
(dev/start! {:report (pretty/reporter)})
..instrumented #'user/adder
started instrumentation
nil
clj꞉user꞉> 
((adder 10) :a)
; Execution error (ClassCastException) at user/eval22132 (REPL:39).
; class clojure.lang.Keyword cannot be cast to class java.lang.Number (clojure.lang.Keyword is in unnamed module of loader 'app'; java.lang.Number is in module java.base of loader 'bootstrap')
ikitommi commented 1 year ago

Hi. Just tested, the instrumentation is not recursive at the moment. Should be fixable.

here's a hand-rolled recursion:

(((m/-instrument
   {:schema [:=>
             [:cat :int]
             [:=> [:cat :int] :int]]}
   (fn [x]
     (m/-instrument
      {:schema [:=> [:cat :int] :int]}
      (fn [y] (+ x y)))))
  1) 2)
; 3

(((m/-instrument
   {:schema [:=>
             [:cat :int]
             [:=> [:cat :int] :int]]}
   (fn [x]
     (m/-instrument
      {:schema [:=> [:cat :int] :int]}
      (fn [y] (str (+ x y))))))
  1) 2)
;Execution error (ExceptionInfo) at malli.core/-fail! (core.cljc:136).
;:malli.core/invalid-output {:output :int, :value "3", :args [2], :schema [:=> [:cat :int] :int]}
ikitommi commented 1 year ago

so, the instrumentation should look for all child schemas and instrument those too. There is m/walk to do this easily, but what would be the default here?

  1. no recursion (fast instrumentation, not complete, same as with :=> validation, by default just checks it's a fn?. With ::m/function-checker can be configured to use test.check to do proper generative testing on the function.
  2. recurse by default. much slower but would support HOFs.

Maybe start with option 1?

julienfantin commented 1 year ago

How about using options to opt-in?

(defn adder
  {:malli/schema
   [:=> {:recur true}
    [:cat :int]
    [:=> [:cat :int] :int]]}
  [x] 
  (partial + x))
ikitommi commented 1 year ago

That's a good option. One more place to do this would be:

(defn adder
  {:malli/schema
   [:=>
    [:cat :int]
    [:=> [:cat :int] :int]]
   :malli/recur true}
  [x] 
  (partial + x))
ikitommi commented 1 year ago

Would you be interested in doing a PR for this? I think the logic would go into m/-instrument + m/-function-info (which could expose if this should recur or not).

ikitommi commented 1 year ago

I think you proposal of using {:recur true} schema properties could be the starting point here. Need to think it over before committing to that, but the actual implementation is about the same regardless how it's enabled (and if it's on by default).

ikitommi commented 1 year ago

Generator already recur (as expected):

(((mg/generate
   [:=>
    [:cat :int]
    [:=> [:cat :int] :int]]) 1) "2")
;Execution error (ExceptionInfo) at malli.core/-fail! (core.cljc:136).
;:malli.core/invalid-input {:input [:cat :int], :args ["2"], :schema [:=> [:cat :int] :int]}
julienfantin commented 1 year ago

Would you be interested in doing a PR for this?

I can give it a shot

I think the logic would go into m/-instrument + m/-function-info (which could expose if this should recur or not).

Thanks

I think you proposal of using {:recur true} schema properties could be the starting point here. Need to think it over before committing to that, but the actual implementation is about the same regardless how it's enabled (and if it's on by default).

Having second thoughts on that, because if someone bothers writing a higher-order function schema instead of justing using ifn? surely they mean for the schema to be recursing?

It's the effects that have to be controlled, perhaps as an option to -instrument?

julienfantin commented 1 year ago

Supporting the initial example in -instrument is actually pretty simple:

@@ -2481,9 +2481,10 @@
                      (report ::invalid-input {:input input, :args args, :schema schema})))
                  (let [value (apply f args)]
                    (when wrap-output
-                     (when-not (validate-output value)
-                       (report ::invalid-output {:output output, :value value, :args args, :schema schema})))
-                   value))))
+                     (cond
+                       (not (validate-output value)) (report ::invalid-output {:output output, :value value, :args args, :schema schema})
+                       (= :=> (type output)) (-instrument (assoc props :schema output) value options)
+                       :else value))))))
        :function (let [arity->info (->> (children schema)
                                         (map (fn [s] (assoc (-function-info s) :f (-instrument (assoc props :schema s) f options))))
                                         (-group-by-arity!))

At this point, value is the returned fn we need to instrument, and output is its schema. Recursing with -instrument works and we can "nest" them arbitrarily.

((((mi/-instrument
     {:schema [:=>
               [:cat :int]
               [:=>
                [:cat :int]
                [:=>
                 [:cat :int]
                 :int]]]}
     (fn [x]
       (fn [y]
         (fn [z]
           (+ x y z)))))
   1) 2) :_)
1. Unhandled clojure.lang.ExceptionInfo
   :malli.core/invalid-input {:input [:cat :int], :args [:_], :schema [:=> [:cat
   :int] :int]}
   {:data {:args [:_],
           :input #<malli.core$_sequence_schema$reify$reify__28544@3df5480e>,
           :schema #<malli.core$__EQ__GT$reify$reify__28479@6b1c1fa1>},
    :message :malli.core/invalid-input,
    :type :malli.core/invalid-input}

But in this example the return value is the fn we need to instrument so it's basically the simplest recursion scheme.

Things get more complicated if, for example, we consider wrapping the fn inputs. Now we need to: go through the args – along with their input schema – and instrument all the :=> we find before putting the args back together and finally invoking the fn.

If we could make this process more generic and instead recursively traverse the structure (the args and return value) – along with their input schemas (the input or output) – then we would be able to instrument high-order functions anywhere in the inputs or outputs, which would be pretty cool.

This sounds a lot like what transformers do, but I could not manage for the handlers in :enter and :leave to receive both the values and their schemas. I understand this is a bit outside of the normal use-case for transformers, but it seems pretty close.

ikitommi commented 1 year ago

Sadly, transformers don't receive both schema & value atm. Did a spike on adding schema to transforming functions some time ago, but was a big change and did not finish.

As the simple solutions looks really simple, I propose let's merge that in, better than current, just needs documentation that it is not recursive. What do you think?

Also, function can be :function or :=>, there is already a (#{:=> :function} t) code. Maybe extract that info -function-schema-type? and reuse that here too?