metosin / malli

High-performance data-driven data specification library for Clojure/Script.
Eclipse Public License 2.0

Metaschema / schema validation? #872

Closed ieure closed 1 year ago

ieure commented 1 year ago

Does Malli have a metaschema, that is, a schema for its schema syntax(es)?

I ask because it's very easy to write an invalid schema, but difficult to determine whether one is invalid, and why. I've run into this problem repeatedly, and find the timing (when you're writing tests, if you're lucky) and debugging frustrating. When the issue arises, the error is unhelpful; I get this, and a stack trace:

1. Unhandled clojure.lang.ExceptionInfo
   :malli.core/invalid-schema
   {:type :malli.core/invalid-schema,
    :message :malli.core/invalid-schema,
    :data {:schema :auto}}

Which tells me that the schema is invalid, but not why it's invalid. Determining that typically means bisecting the schema to isolate the problem, then debugging it — and possibly repeating if there are multiple issues. This feels very silly for a tool whose whole point is to see if some arbitrary value conforms to expectations.

Is there a schema for the schema syntax, such that I can validate a Malli schema against it, and have it tell me precisely why it's invalid (if it is)?

ikitommi commented 1 year ago

Malli supports metaschemas, that is, schemas for both properties and children: https://github.com/metosin/malli/blob/9a0854632dadd27b89f217d907757bd4a8478aed/src/malli/core.cljc#L26-L27 but currently no schemas define them: https://github.com/metosin/malli/blob/9a0854632dadd27b89f217d907757bd4a8478aed/src/malli/core.cljc#L969-L970

ieure commented 1 year ago

I'm not super familiar with the Malli internals, so I'd be delighted to be wrong, but I don't think that's what I'm looking for. The into-schema call is what blows up, so I never have a value satisfying IntoSchema to call those methods on. They appear to be a type of reflective access for the schema, rather than a schema validation mechanism.

Hopefully this illustrates what I'm talking about better. If I have this schema:

(def some-schema [:map [:foo #(instance? java.lang.Throwable %)]])

This is not valid, because arbitrary predicates need to be wrapped in [:fn …][1], so attempting to use it…

(malli.core/validate some-schema (ex-info "oops" {}))

…blows up with an unhelpful exception:

1. Unhandled clojure.lang.ExceptionInfo
   :malli.core/invalid-schema
   {:type :malli.core/invalid-schema,
    :message :malli.core/invalid-schema,
    :data {:schema #function[user/fn--50567]}}
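For comparison, here is a minimal sketch of the wrapped form that Malli does accept. The name wrapped-schema and the sample values are illustrative assumptions; since the schema is a :map, the values being validated here are maps with a :foo key.

```clojure
(require '[malli.core :as m])

;; Same shape as some-schema, but with the predicate wrapped in [:fn ...].
;; `wrapped-schema` is a name made up for this example.
(def wrapped-schema
  [:map [:foo [:fn #(instance? java.lang.Throwable %)]]])

;; The value under :foo is a Throwable, so this validates.
(m/validate wrapped-schema {:foo (ex-info "oops" {})})
;; => true

;; A string under :foo fails the wrapped predicate.
(m/validate wrapped-schema {:foo "not a throwable"})
;; => false
```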

What I want is for Malli to have a schema for its own schema syntax, similar to how there's an XML schema definition of a valid XML schema. A schema of the schema syntax; a metaschema, which is used to validate Malli schemas. This would let me do things like:

(when-not (malli.core/validate malli.meta/schema some-schema)
  (malli.core/explain malli.meta/schema some-schema))

And get a detailed explanation of why the schema is invalid:

{:schema
 [:tuple [:enum :map] [:sequential [:tuple [:enum :fn] [:fn #function[clojure.core/fn?]]]]],
 :value [:map [:foo #function[user/fn--50567]]],
 :errors
 ({:path [1 0],
   :in [1 0],
   :schema [:tuple [:enum :fn] [:fn #function[clojure.core/fn?]]],
   :value :foo,
   :type :malli.core/invalid-type}
  {:path [1 0],
   :in [1 1],
   :schema [:tuple [:enum :fn] [:fn #function[clojure.core/fn?]]],
   :value #function[user/fn--50567],
   :type :malli.core/invalid-type})}

Which (more or less) clearly shows the actual problem: the value has an unexpected type, it should have been [:fn fn?], but was a function. This is the output of a very half-assed metaschema I wrote; a proper one would need recursion, therefore a registry.

[1]: Side note, I dislike the inconsistency in predicate handling. Certain built-in predicates like string? work fine as bare values, but others require [:fn pred?] syntax.

ikitommi commented 1 year ago

That is what you are looking for - each IntoSchema can define in malli syntax what is the valid form of the properties and children it expects and validate those before creating the schema instance. Given a syntax:

[:int {:min "123"} "illegal"]

... you can fetch the IntoSchema for :int from the registry and ask what properties and children it allows. For :int, they would be:

;; properties, open map with optional :min and :max
[:map [:min {:optional true} :int] [:max {:optional true} :int]]

;; children (no children)
:nil 

As for the side note, many core functions are mapped as schemas to make Malli more familiar to Spec users. The predicates supported out of the box are listed in both the README and the source code: https://github.com/metosin/malli#mallicorepredicate-schemas

ieure commented 1 year ago

How do I actually validate that in such a way to get the output I want? I'm really struggling due to the lack of documentation on how to use this (for example, -children-schema takes a required options argument, but I see nowhere that even defines the type of the argument, much less what values are expected).

Using my example schema from before:

(def sample-schema [:map [:foo #(instance? java.lang.Throwable %)]])

I can get the IntoSchema for :map, though this is very manual:

user> (-> malli.core/default-registry (malli.registry/-schema (first sample-schema)))
#IntoSchema{:type :map}

I know that :map accepts properties like {:closed true}, because that's in the README, but -properties-schema claims that none are accepted:

user> (-> malli.core/default-registry (malli.registry/-schema :map) (malli.core/-properties-schema nil))
nil
user>

Maybe I have to pass options? Again, I have no idea what should go there. Similarly, it gives me nil for -children-schema:

user> (-> malli.core/default-registry (malli.registry/-schema :map) (malli.core/-children-schema nil))
nil
user> 

It should not be this hard to answer the questions "Is this Malli schema valid?" and "If not, what is wrong with it?"

NoahTheDuke commented 1 year ago

It should not be this hard to answer the questions "Is this Malli schema valid?" and "If not, what is wrong with it?"

I'm not affiliated with Metosin, just a curious watcher, but I think this is a little aggressive.

ikitommi commented 1 year ago

  1. There are protocol methods defined to start defining the schema metaschemas #414
  2. None of the built-in schemas have metaschemas defined #269 #270
  3. There is no public API to check schema metaschemas correctness based on the previous points #18

so, not there yet. Closing as answered.
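
Until such an API exists, one way to at least surface an invalid schema when it is defined, rather than at the first validate call, is to parse it eagerly with malli.core/schema and inspect the exception data. This is only a sketch: schema-error is a hypothetical helper, not part of Malli, and it reports the same ex-data shown earlier in the thread rather than a detailed explanation.

```clojure
(require '[malli.core :as m])

(defn schema-error
  "Hypothetical helper (not a Malli API): returns nil when `form` parses
  as a Malli schema, otherwise the ex-data of the exception thrown while
  parsing it."
  [form]
  (try
    (m/schema form)   ; parses the schema form; throws if it is invalid
    nil
    (catch clojure.lang.ExceptionInfo e
      (ex-data e))))

;; A schema with the predicate wrapped in [:fn ...] parses, so no error data.
(schema-error [:map [:foo [:fn int?]]])

;; A bare anonymous function fails to parse; the returned map carries
;; :type :malli.core/invalid-schema, as in the traces above.
(schema-error [:map [:foo #(instance? Throwable %)]])
```

Calling a helper like this next to each def, or in a test that walks all schemas, moves the failure to definition time, though it still does not say why the schema is invalid.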