Consider Merging Predicate and Type Schemas

rschmukler commented 3 years ago

Right now type schemas and predicate schemas are entirely different schemas and don't behave the same. This can create some surprising behaviors:

Writing a transformer for one doesn't target the other. eg. Writing a transformer for :string won't apply to string?. This means that transformer authors either need to provide duplicate entries for the encoders and decoders, or that application authors need to be consistent in which schema annotation they use.
Similarly, writing a schema walker requires knowing and handling each variant.
Malli as a library suffers from the same potential for inconsistency, as seen between the property-based validation available on :string and :int, but not held for predicate schemas.

(m/validate [:string {:min 5 :max 10}] "foo")
;; => false
(validate [string? {:min 5 :max 10}] "foo")
;; => true

(validate [:int {:min 5 :max 10}] 3)
;; => false

(validate [int? {:min 5 :max 10}] 3)
;; => true

A potential solution to this is having the predicate-schemas just be aliases to the type schemas in the default registry, eg:

{string? :string
 'string? :string
 vector? :vector
 'vector? :vector
 'pos-int? [:int {:min 1}]
 pos-int?  [:int {:min 1}]
 ...}

The downsides I see with this are:

1) Schemas potentially lose their homoiconicity. Taking the value of a schema written as string? and seralizing it would yield :string. Still, this is a problem of any schema that makes use of the registry, so I'm not sure it's that big of an issue.

2) Some predicate-schema based types don't have analogs in typed schemas. Eg. double?. I'd be eager to hear ideas on how best to resolve this.

ikitommi commented 3 years ago

Thinking aloud, either:

the forms should not change from the original input, but string? could be mapped to a m/-string-schema, which could take the :type as an argument so that they would map to the same impl, just with different :type
after #269, we could optionally issue a warning, when a schema uses simple keyword proprties, that are not part of the system. Would require a way to register modules into schema system, e.g. when using JSON Schema, :json-schema is a valid key. Not sure how messy this would be. Reitit uses this, with good success.
deprecate the predicate-schemas with malli 1.0.0

The predicate-schemas are currently listed using m/predicate-schemas and it's easy to drop them from a custom registry, e.g.:

(defn default-schemas []
  (merge (class-schemas) (comparator-schemas) (type-schemas) (base-schemas)))

It would be easy to implement all the core predicates as new/existing m/type-schmas.

rschmukler commented 3 years ago

I personally lean toward using a shared implementation with a potentially different :type. I think the predicate schemas are worth keeping as I think there's some appeal to them for people coming from spec, and personally

Number 2. above sounds like a useful enhancement but potentially tangential to this issue.

rschmukler commented 3 years ago

Upon further thought, while the solution proposed in number 1 solves the issue for malli, it still makes transformer authors have to register the duplicate keys (eg. 'int?, :int and 'integer?). It may still be the best option, I leave it to you.

nilern commented 3 years ago

Maybe transformers (and JSON schemas etc.) should be mapped to the schema class (or even instance?) instead of the name? Compilers (and Datomic) get rid of names as soon as possible to eliminate aliasing (like here) and scoping (like #326) concerns.

ikitommi commented 3 years ago

Good points.

That said, type definitions should be portable between clj & cljs, right? (related: #264). There was an online discussion about having a custom map Schema in user space, that would work like a map with JSON Schema, Swagger and Value generation => "just add a :type tag into schema and done"?

I think the core predicates could be rewritten using a new proxy-schema (just new options into m/-simple-schema) that just rewrites the m/type into something else, e.g.

{'pos-int? (m/-simple-schema {:form 'pos-int?, :schema [:int {:min 1}]})
 pos-int? (m/-simple-schema {:form 'pos-int?, :schema [:int {:min 1}]})}

as validators, explainers etc. use the factual predicate, there would not be any performance penalty from this extra abstraction. And :int could have type-property :type :int so that all would derive the same :type.

EDIT: ... and the transformers could be written against just the :type (in addition to the m/type. Too many type-names :)

metosin / malli

Consider Merging Predicate and Type Schemas #327