plumatic / schema

Clojure(Script) library for declarative data description and validation
Other
2.4k stars 256 forks source link

[Sketch] Polymorphic fn schemas #444

Closed frenchy64 closed 1 year ago

frenchy64 commented 2 years ago

Old sketch I did in October 2021 to bring polymorphic fns to schema. Could be interesting to flesh out. Feedback/encouragement welcome.

w01fe commented 2 years ago

Thanks for sharing. Can you say a little more about the problem this solves please, e.g. an example use case and explanation of when one would prefer to use =>/fn over s/fn?

frenchy64 commented 2 years ago

Yes, this was unceremoniously dumped here, I'll try and fill in what I was thinking.

e.g. an example use case and explanation of when one would prefer to use =>/fn over s/fn

Ah, let's just ditch =>/fn and say that s/fn is enhanced with the following equivalence:

ie., (s/fn :=> (=> A B) [b] ...) <=> (s/fn :- A [b :- B] ...)

Can you say a little more about the problem this solves please

The main rationale is increasing the precision that schema can document, generate, instrument, and check "helper" functions. To do this, the sketch adds a new schema s/all that can stand for many FnSchema's at once.

The simplest example is clojure.core/identity. Currently you'd write (s/=> Any Any). With s/all you'd write (s/all [x] (=> x x)), or more precisely:

(s/defn identity :=> (s/all [x] (=> x x))
  [x] x)

We've improved the documentation precision, but what about the other goals? Here are some possibilities:

To instrument (ie., check usages): replace x with Any and instrument as (=> Any Any).

To check (ie., check impl) or generate functions like identity: replace x with (s/enum ::gensym) and generatively test as (=> (s/enum ::gensym) (s/enum ::gensym)).

More involved examples might be:

(s/defn combine-a-and-b
  :=> (s/all [x] (=> [x] {:a [x] :b [x]}))
  [{:keys [a b]}]
  (concat a b))

or even:

(s/defn map-and-drop
  :=> (s/all [x y] (=> [y] (=> y x) [x])))
  [f coll]
  (->> coll (map f) (drop 10)))

In this sketch, you could call (=>/check map-and-drop) and it would automatically check the function against schemas like (=> [Int] (=> Int Bool) [Bool]), or (=> [(s/enum ::Y)] (=> (s/enum ::Y) (s/enum ::X)) [(s/enum ::X)]) by generating the args and checking the return.

This last point IMO is the killer feature, at least in terms of providing a way of verifying helper functions beyond unit tests.

frenchy64 commented 2 years ago

I just added another sketch schema.fn-schema2 I did around the same time that adds more ops to manipulate FnSchemas. It's related to the (s/fn :=> (=> A B) ...) idea as you can better reuse fn schemas to pass to :=> (sort of like s/merge's role but for FnSchema)---adding ops like pushing or dropping args to existing schemas.

I wasn't totally happy with the result (I forget why, perhaps I wanted to generalize it to intersections of functions, or find a better way to identify and manipulate rest arg schemas), but that namespace can help reuse FnSchemas like this:

(defschema TakesDB (=> Out DB Arg1 Arg2))
(defschema DBFilled (drop-leading-args TakesDB 1))

(defn fill-db :- DBFilled
  [f :- TakesDB, db :- DB]
  (s/fn :=> DBFilled [arg1 arg2] (f db arg1 arg2))

I guess drop-leading-args is just partial in the original schema.fn-schema sketch, but there are other ops in schema.fn-schema2.

w01fe commented 2 years ago

Thanks, that's very helpful!

I agree that it would be useful to have parameterized type signatures for functions. But I'm curious about how that's coupled to the question of syntax (using a separate function schema rather than annotating inline).

For example (the first idea that came to mind) something like

(s/defn ^(s/params 'T) identity :- T 
    [x :- T] 
    x)

would be more uniform with the existing syntax. Or am I missing a reason that the separate-schema syntax is fundamentally necessary to support type parameters?

Even if they are independent, I can see reasons why one might prefer a separate-schema syntax, but curious if these two aspects of the proposal can be pulled apart.

Thanks again, Jason

frenchy64 commented 2 years ago

Yes, you're on the money. The syntax can be developed separately. If this were to land in schema I'd expect something more like what you're proposing. For comparison, this is the syntax used in Typed Clojure for this:

(t/defn :forall [T]
  identity
  [x :- T] :- T 
  x)
w01fe commented 2 years ago

Ah ok, cool. I'm definitely on board with supporting parameterized functions, don't have a strong opinion about syntax, I like the typed clojure version too.

For applying a function schema to a function, I don't love all the boilerplate needed to take the schema back apart and match it with the function, but could see it being worthwhile -- would just want to discuss the potential pros and cons.

frenchy64 commented 2 years ago

Yes, I guess there's at least 3 ideas here:

  1. s/all
  2. Syntax for macros to create s/all
  3. Meta operations on FnSchema

Number 1 is certainly most compelling to me, and probably requires 2 in practice. 3 can be pulled out entirely of this discussion, and may be more relevant once s/defprotocol is merged (eg., stubbing out the first arg of a method without having to write the entire schema again).

-------- Original Message -------- On Aug 11, 2022, 13:24, Jason Wolfe wrote:

Ah ok, cool. I'm definitely on board with supporting parameterized functions, don't have a strong opinion about syntax, I like the typed clojure version too.

For applying a function schema to a function, I don't love all the boilerplate needed to take the schema back apart and match it with the function, but could see it being worthwhile -- would just want to discuss the potential pros and cons.

— Reply to this email directly, view it on GitHub, or unsubscribe. You are receiving this because you authored the thread.Message ID: @.***>

w01fe commented 2 years ago

Sounds right to me. Now looking at the implementation again, I don't understand the :nat thing, can you say more about that please?

frenchy64 commented 2 years ago

Sounds right to me. Now looking at the implementation again, I don't understand the :nat thing, can you say more about that please?

That's the "kind" of the variable. Variables like T in (s/all [T] (s/=> T T)) are of kind "schema" (sorta like (s/protocol s/Schema)). There I was experimenting with kind for natural numbers, that's useful for making "dependent" schemas (like dependent types).

There are other more relevant kinds that I'm working on first (functions, rows of schemas), maybe I'll figure out how to use :nat in practice but for now it's not obvious.

I'm coding something up with docs that will explain kinds.

EDIT: see https://github.com/plumatic/schema/pull/445

frenchy64 commented 1 year ago

Moved https://github.com/frenchy64/schema-incubator