metosin / malli

High-performance data-driven data specification library for Clojure/Script.
Eclipse Public License 2.0
1.46k stars 208 forks source link

fix: -regex-min-max for :schema inside :cat #851

Closed opqdonut closed 1 year ago

opqdonut commented 1 year ago

Fixes part 2 of #839

opqdonut commented 1 year ago

This is the minimal fix to make regex-min-max work properly on [:cat [:schema [:cat ...]]].

A cleaner solution might be to use a special-purpose marker schema for this instead of the generic :schema. That would break backwards compatibility, unfortunately.

opqdonut commented 1 year ago

In master:

user=> (m/-regex-min-max (m/schema [:schema [:cat int? int?]]))
{:min 2, :max 2}
user=> (m/-regex-min-max (m/schema [::m/schema [:cat int? int?]]))
{:min 2, :max 2}
user=> (m/-regex-min-max (m/schema [:cat int? [:schema [:cat int? int?]]]))
{:min 3, :max 3}
user=> (m/-regex-min-max (m/schema [:cat int? [::m/schema [:cat int? int?]]]))
{:min 3, :max 3}

In this PR:

user=> (m/-regex-min-max (m/schema [:schema [:cat int? int?]]))
{:min 1, :max 1}  ;; this changing is an unfortunate side-effect
user=> (m/-regex-min-max (m/schema [::m/schema [:cat int? int?]]))
{:min 2, :max 2}
user=> (m/-regex-min-max (m/schema [:cat int? [:schema [:cat int? int?]]]))
{:min 2, :max 2}  ;; this changing is the actual fix
user=> (m/-regex-min-max (m/schema [:cat int? [::m/schema [:cat int? int?]]]))
{:min 3, :max 3}
opqdonut commented 1 year ago

Just for completeness, this PR doesn't change the validation behaviour:

user=> (m/validate (m/schema [:schema [:cat int? int?]]) [1 2])
true
user=> (m/validate (m/schema [::m/schema [:cat int? int?]]) [1 2])
true
user=> (m/validate (m/schema [:cat int? [:schema [:cat int? int?]]]) [1 2 3])
false
user=> (m/validate (m/schema [:cat int? [:schema [:cat int? int?]]]) [1 [2 3]])
true
user=> (m/validate (m/schema [:cat int? [::m/schema [:cat int? int?]]]) [1 2 3])
true
user=> (m/validate (m/schema [:cat int? [::m/schema [:cat int? int?]]]) [1 [2 3]])
false
opqdonut commented 1 year ago

Yeah my original fix returns {:min 2 :max 2} for [:schema [:cat int? int?]]. The point of -re-cat-min-max was to handle :schema inside :cat differently from :schema outside :cat. There's no way to do a "local" fix that would have :schema behave differently based on context.

The fragments worked with my previous fix as well:

user=> (m/-regex-min-max (m/schema [:cat {:registry {::cat [:cat string? int?]}} int? ::cat]))
{:min 3, :max 3}
user=> (m/validate (m/schema [:cat {:registry {::cat [:cat string? int?]}} int? ::cat]) [1 "2" 3])
true
user=> (m/-regex-min-max (m/schema [:cat {:registry {::cat [:cat string? int?]}} int? [:schema ::cat]]))
{:min 2, :max 2}
user=> (m/validate (m/schema [:cat {:registry {::cat [:cat string? int?]}} int? [:schema ::cat]]) [1 "2" 3])
false
user=> (m/validate (m/schema [:cat {:registry {::cat [:cat string? int?]}} int? [:schema ::cat]]) [1 ["2" 3]])
true
user=> (m/-regex-min-max (m/schema [:cat {:registry {::cat [:cat string? int?]}} int? [::m/schema ::cat]]))
{:min 3, :max 3}
user=> (m/validate (m/schema [:cat {:registry {::cat [:cat string? int?]}} int? [::m/schema ::cat]]) [1 "2" 3])
true
user=> (m/validate (m/schema [:cat {:registry {::cat [:cat string? int?]}} int? [::m/schema ::cat]]) [1 ["2" 3]])
false
opqdonut commented 1 year ago

So is there a reason not to merge my original fix? It seems to have the behaviour we agree on. I can add test cases to document these decisions as well.

opqdonut commented 1 year ago

Oh now I see, with my previous fix :schema doesn't work under other seqex modifiers:

(m/-regex-min-max (m/schema [:+ [:schema [:cat int? int?]]]))
 ==> {:min 2}   ;; should be {:min 1}

I'll need to go back to the hammock for a while.

ikitommi commented 1 year ago

hammock time :+1:. maybe the caller of the RegexSchema methods should tell the callee that "you are under a sequence schema", e.g. extra parameter into the methods. This way, the responsibility to work correctly is with the the Schemas.

I somehow think the process of composing sequence schemas should be done differently and it would remove all of these weird issues.

opqdonut commented 1 year ago

One way would be to only allow sequence schemas inside a special :sequence schema. The :sequence schema would also be used to mark nested sequences:

(m/schema [:cat int?])
  ==> error
(m/validate (m/schema [:sequence [:cat int? [:sequence [:cat int? int?]]]]) [1 [2 3]])
  ==> true   ;; :sequence marks a sub-sequence
(m/validate (m/schema [:sequence [:cat int? [:schema [:cat int? int?]]]]) [1 2 3])
  ==> true   ;; :schema does nothing

To make it a bit more ergonomic, we could have an implicit :cat inside :sequence:

[:sequence [:cat a b c]]  ===  [:sequence a b c]
opqdonut commented 1 year ago

Figured out a way to make it work.

user=> (m/-regex-min-max (m/schema [:schema [:cat int? int?]]))
{:min 2, :max 2}
user=> (m/-regex-min-max (m/schema [::m/schema [:cat int? int?]]))
{:min 2, :max 2}
user=> (m/-regex-min-max (m/schema [:cat int? [:schema [:cat int? int?]]]))
{:min 2, :max 2}
user=> (m/-regex-min-max (m/schema [:cat int? [::m/schema [:cat int? int?]]]))
{:min 3, :max 3}
ikitommi commented 1 year ago

Looks good. Do we need the 2 arities? It's extender api, so we can break it just to have the arity-2 version.

opqdonut commented 1 year ago

Good point. I'll remove the other arity.