metosin / malli

High-performance data-driven data specification library for Clojure/Script.
Eclipse Public License 2.0
1.48k stars 210 forks source link

Validation happening during decoding when using :or #910

Closed irigarae closed 1 year ago

irigarae commented 1 year ago

This is just a question to wrap my head at what's the logic behind this behaviour. I don't know if it's exclusive to :or but maybe after understanding it it all makes sense. It caught me by surprise because I read here: since m/decode doesn't check the schema, you need to run m/validate (or m/explain) if you want to make sure your data conforms to your schema.

Basically I think I'm seeing some sort of validation inside an :or and the decoding rolls back if it sees the decoding didn't provide valid values. This doesn't happen with basic types like :string :int, even with :and I couldn't find the common pattern.

It looks like one of the :or branches transforms the value and it matches the schema, then that one is kept, otherwise all transformations (even if the functions ran) are ignored.

All of the examples above except the one I marked makes sense to me. The main thing that escapes my head is the situation when none of the options in :or match the target. How is it decided to rollback the changes? Why not keep for example the first decoding on the :or if the source didn't match anything?

Here's a comment from the code before. So I guess at some point in time it was intentional to take the first validated. But I'm not sure if that's in line with the current philosophy about decoding/encoding. It is even encoded in the tests.

Question

Is this expected behaviour? If so, what's the reasoning? If not, a bug I guess? Does it happen with more predicates/types of validations?

Full example

So here's complete exhaustive example:

(require '[malli.core :as m]
         '[malli.transform :as mt])

(for [schema [:int :string
              [:and :int] [:and :string]
              [:and :int :string] [:and :string :int]
              [:or :int] [:or :string]
              [:or :int :string] [:or :string :int]]]
  (m/decode
   schema
   :x
   (mt/transformer
    {:decoders
     {:int (constantly "1")
      :string (constantly 1)}})))
#_=> ("1" 1 "1" 1 1 "1" :x :x :x :x)

(for [schema [:int :string
              [:and :int] [:and :string]
              [:and :int :string] [:and :string :int]
              [:or :int] [:or :string]
              [:or :int :string] [:or :string :int]]]
  (m/decode
   schema
   :x
   (mt/transformer
    {:decoders
     {:int (constantly 2)
      :string (constantly 1)}})))
#_=> (2 1 2 1 1 2 2 :x 2 2)

(for [schema [:int :string
              [:and :int] [:and :string]
              [:and :int :string] [:and :string :int]
              [:or :int] [:or :string]
              [:or :int :string] [:or :string :int]]]
  (m/decode
   schema
   :x
   (mt/transformer
    {:decoders
     {:int (constantly 1)
      :string (constantly "1")}})))
#_=> (1 "1" 1 "1" "1" 1 1 "1" 1 "1")
ikitommi commented 1 year ago

I see. Here's one repro:

(m/decode [:int {:decode/math str}] 1 (mt/transformer {:name :math})) ; => "1"
(m/decode [:and [:int {:decode/math str}]] 1 (mt/transformer {:name :math})) ; => "1"
(m/decode [:or [:int {:decode/math str}]] 1 (mt/transformer {:name :math})) ; => 1

so, options being:

  1. do nothing: it's a feature
  2. use first branch in case none is valid after decode
  3. use first branch where value is changed in decode

Not sure what is the most correct, 1 or 2 here 🤔

irigarae commented 1 year ago

Just to clarify, I was trying to understand what’s the logic behind this design. In case there was some. I also saw that for and/or multiple transformations run after each other, hence why I might be missing the bigger picture.

Otherwise fine if it’s a feature, but at least documentation should be a bit explicit. The current design allows one to be a bit sloppy and do multiple decodings in general, however having a :or automatically makes you need to be precise (something has to be valid after that specific decoding step).

Anyway haven’t thought thoroughly but I think its less of a surprise if something decodes/encodes the value it’s kept (similar when using :int or :and), but no strong opinion. Fine to let it be until strong logic is found to decide how it should behave.