plumatic / schema

Clojure(Script) library for declarative data description and validation
Other
2.41k stars 257 forks source link

`schema.coerce/coercer` should preserve uncoerced originals in errors. #77

Open startling opened 10 years ago

startling commented 10 years ago

We're using schema to validate Yaml; specifically, we annotate our Yaml data with source positions and then use coercions to validate the un-annotated data. We'd like to display nice error messages using the source positions, but schema.coerce/coercer doesn't preserve the original uncoerced data in the error it throws.

I've written an alternative coercer that does this, but I'm not convinced that it's the best way to do it. What do you all think?

;; A unique type we'll wrap ordinary schema errors in in order to provide
;; the original annotated-with-yaml-positions value.
(deftype CoercionError [original error])

(defn annotate-error
  "Turn an error in an ErrorContainer into a CoercionError containing it
   and the original uncoerced version of the value in question. Preserves
   the value of all other types."
  [original result]
  (if (u/error? result)
    (let [inner (:error result)]
      (u/error
       (cond
        ;; If we get an ordinary Validation error (e.g. 1 is not s/Str),
        ;; just wrap it in a CoercionError.
        (instance? schema.utils.ValidationError inner)
        (CoercionError. original inner)
        ;; If we get a map, it may contain e.g. [:x 'disallowed-key];
        ;; so wrap all symbols in a CoercionError.
        ;; N.B. -- original here is the original map, *not* the original key.
        (map? inner)
        (into {}
              (for [[k v] inner]
                [k (if (symbol? v)
                     (CoercionError. original v)
                     v)]))
        ;; Otherwise we don't care about wrapping it.
        :default inner)))
    result))

(defn coercer
  "An alternative to `schema.coerce/coercer` that preserves uncoerced originals."
  [schema coercion-matcher]
  (s/start-walker
   (fn [s]
     (let [walker (s/walker s)]
       (if-let [coercer (coercion-matcher s)]
         (fn [x]
           (annotate-error x
            (m/try-catchall
             (let [v (coercer x)]
               (if (u/error? v)
                 v
                 (walker v)))
             (catch t (m/validation-error s x t)))))
         walker)))
   schema))

(This is a little less-than-perfect, especially since for e.g. 'disallowed-key we get the original uncoerced map rather than the original key. In general coercing keys seems like a hard problem; I'd love advice on this).

w01fe commented 10 years ago

Apologies for the delay, we've got a ton going on at Prismatic last and next weeks.

The end goal makes sense to me, but I don't think you want to start replicating logic from Map and other schemas inside the coercer.

One plan of attack I might consider is to use the existing coercer, but just ensure that your coercion fns don't coerce things that won't match the final schema -- which might be preferable, if it's not much more complicated. But it's hard to say without knowing more about your use case -- I don't think we've run into cases where this was an issue yet internally.

If you can share some more specifics about your use case and what's going wrong, I'm happy to look into it further.

startling commented 10 years ago

No problem! Thanks for taking the time to reply.

I don't think making a coercion function that doesn't coerce things that won't match the schema makes sense -- I think it'd require doing a lot of validation in the coercion function that doesn't really make sense.

After reading your response I realized I'm conflating two issues: I'd like to coerce keys and I'd like to preserve uncoerced originals in errors. I think I've solved both problems. Here's what I've got:

(ns schema-example.core
  (:require [schema.core :as s]
            [schema.coerce :as c]
            [schema.utils :as u]
            [schema.macros :as m]
            [clj-yaml.core :as yaml]
            [clojure.pprint :refer [pprint]]))

;; We're getting data marked with source positions from Yaml.
(pprint (yaml/parse-string "{a: b}" :mark true :keywords false))
;; {:start {:line 0, :index 0, :column 0},
;;  :end {:line 0, :index 6, :column 6},
;;  :unmark
;;  {{:start {:line 0, :index 1, :column 1},
;;    :end {:line 0, :index 2, :column 2},
;;    :unmark "a"}
;;   {:start {:line 0, :index 4, :column 4},
;;    :end {:line 0, :index 5, :column 5},
;;    :unmark "b"}}}
;;
;; Note that both the keys and the values are marked with source positions,
;; so I've written a coercion function coerce-keys that handles this.

;; A unique type we'll wrap ordinary schema errors in in order to provide
;; the original annotated-with-yaml-positions value.
(deftype CoercionError [original error])

(defn coercion-error? [e]
  (instance? CoercionError e))

(defn coerce-keys
  "Given a function that returns all the possible coercions for a key,
   coerce the keys in a map to fit in a given schema using the first
   such valid coercion."
  [schema coerce-key]
  (letfn [(actually-map? [v]
             (or (instance? clojure.lang.PersistentHashMap v)
                 (instance? clojure.lang.PersistentArrayMap v)))]
    (when (and (actually-map? schema)
               (some s/specific-key? (keys schema)))
      ;; Find out the keys for this schema
      (let [expected-key? (->> schema keys (map s/explicit-schema-key) set)]
        ;; And return a function that tries to fit the keys of a map into there.
        (fn [data]
          (if-not (actually-map? data)
            ;; If this is a map schema but this value isn't a map, preserve the
            ;; original and let the schema fail.
            data
            ;; If it *is* actually a map, try to coerce all the keys.
            (let [keys-coerced
                  (for [[k v] data]
                    (let [matches (->> k coerce-key (filter expected-key?))]
                      ;; If none of the key coercions apply, put a CoercionError
                      ;; wrapping the uncoerced original.
                      (if-not (empty? matches)
                        [(first matches) v]
                        [(-> k coerce-key first)
                         (CoercionError. k 'disallowed-key)])))
                  ;; Find the piece of `keys-coered` with CoercionErrors.
                  errors (filter (comp coercion-error? second) keys-coerced)]
              ;; If there are any CoercionErrors, return them in a schema error.
              (if-not (empty? errors)
                (u/error (into {} errors))
                (into {} keys-coerced)))))))))

(def key-coercions
  "A set of coercions we'll apply to each key."
  (comp
   (juxt identity c/string->keyword)
   yaml/unmark))

(defn coercions
  "A set of coercions we'll apply to each value."
  [schema]
  (comp
   (or (coerce-keys schema key-coercions)
       identity)
   yaml/unmark))

(def ExampleSchema
  {(s/optional-key "a")
   s/Str})

;; Cool.
(pprint
 ((c/coercer ExampleSchema coercions)
  (yaml/parse-string "{a: b}" :mark true :keywords false)))
;; => {"a" "b"}

;; But, original values (particularly source positions) aren't preserved !
(pprint
 ((c/coercer ExampleSchema coercions)
  (yaml/parse-string "{a: 1}" :mark true :keywords false)))
;; => {:error {"a" (not (instance? java.lang.String 1))}}

;; So, I wrote an alternative to coercer...

(defn annotate-error
  "Turn an error in an ErrorContainer into a CoercionError containing it
   and the original uncoerced version of the value in question. Preserves
   the value of all other types."
  ;; N.B. this preserves the wrong thing if the first branch of an s/Both fails.
  [original result]
  (if (u/error? result)
    (let [inner (:error result)]
      (u/error
       (cond
        ;; If we get an ordinary Validation error (e.g. 1 is not s/Str),
        ;; just wrap it in a CoercionError.
        (instance? schema.utils.ValidationError inner)
        (CoercionError. original inner)
        ;; Otherwise we don't care about wrapping it.
        :default inner)))
    result))

(defn coercer
  "An alternative to `schema.coerce/coercer` that preserves uncoerced originals."
  [schema coercion-matcher]
  (s/start-walker
   (fn [s]
     (let [walker (s/walker s)]
       (if-let [coercer (coercion-matcher s)]
         (fn [x]
           (annotate-error x
            (m/try-catchall
             (let [v (coercer x)]
               (if (u/error? v)
                 v
                 (walker v)))
             (catch t (m/validation-error s x t)))))
         walker)))
   schema))

;; Perfect!
(-> "{a: 1}"
    (yaml/parse-string :mark true :keywords false)
    ((coercer ExampleSchema coercions))
    :error
    (get "a")
    .-original
    pprint)
;; => {:start {:line 0, :index 4, :column 4},
;;     :end {:line 0, :index 5, :column 5},
;;     :unmark 1}

What would you suggest? I understand that my use-case may be a little arcane; would a PR extending coercer like this be welcome?

startling commented 10 years ago

(I forgot to mention: that code breaks on s/either and s/maybe, and I'm not even sure how to debug it:

(def OtherSchema
  {(s/optional-key "a")
   (s/maybe s/Str)})

(-> "{a: 1}"
    (yaml/parse-string :mark true :keywords false)
    ((coercer OtherSchema coercions))
    :error
    (get "a")
    .-original
    pprint)
;; => 1
;; :(

)

w01fe commented 10 years ago

Looks reasonable to me as a solution to your problem. I'm not sure about changing the default coercer though, will have to give this some more.

As for either and maybe, I think you're applying the unmarking at every stage, which isn't necessarily what you want. So by the time the data has passed through the either/maybe, it's already unmarked.

One solution would be to let your annotate-error stick on the original data whenever it is present, rather than when you've found the leaf error -- that way the original info will get applied outside at the maybe level?

On Wed, Apr 9, 2014 at 7:49 PM, startling notifications@github.com wrote:

(I forgot to mention: that code breaks on s/either and s/maybe, and I'm not even sure how to debug it:

(def OtherSchema {(s/optional-key "a") (s/maybe s/Str)})

(-> "{a: 1}" (yaml/parse-string :mark true :keywords false)

((coercer OtherSchema coercions))

:error
(get "a")
.-original
pprint)

;; => 1;; :(

— Reply to this email directly or view it on GitHubhttps://github.com/Prismatic/schema/issues/77#issuecomment-40038111 .