metosin / malli

High-performance data-driven data specification library for Clojure/Script.
Eclipse Public License 2.0
1.5k stars 211 forks source link

Fix recursive `:+` generators #792

Closed frenchy64 closed 1 year ago

frenchy64 commented 1 year ago

:+ generators with recursive variables can become [:+ :UNREACHABLE], which corresponds to the impossible generator (gen/not-empty (gen/return ())). In this case, it should reduce to (-never-gen).

Added a useful debugging namespace that generates AST's for generators, which is how I found the root problem. Because of the way it's generated, I had to qualify some keywords using an alias instead of auto-namespacing.


Original report: https://clojurians.slack.com/archives/CLDK6MFMK/p1670138741691259

Carlo:
A question about recursive generators and :+. Consider this recursive spec that describes a boolean formula:
(def formula
  [:schema
   {:registry
    {::formula
     [:or
      :boolean
      [:tuple [:enum :not] :boolean]
      [:tuple [:enum :and] [:* [:ref ::formula]]]
      [:tuple [:enum :or]  [:* [:ref ::formula]]]]}}
   [:ref ::formula]])
We can of course generate examples via:
(repeatedly 20 #(mg/generate formula))
But, if I change :* with :+ (rest in :thread:)

I now have:
(def formula
  [:schema
   {:registry
    {::formula
     [:or
      :boolean
      [:tuple [:enum :not] :boolean]
      [:tuple [:enum :and] [:+ [:ref ::formula]]]
      [:tuple [:enum :or]  [:+ [:ref ::formula]]]]}}
   [:ref ::formula]])

(repeatedly 20 #(mg/generate formula))
and the generator for this seems broken to me (for starters, it usually doesn't return 20 things. I suspect the reason is that it makes more difficult to generate values (as a naive sampling terminates more rarely). Is that correct?

By a brute force search, I saw that this problem is solved if I wrap :+ into a :schema. A more thorough explanation would be useful though
ikitommi commented 1 year ago

Thanks! The generator-debug is really nice. It would be useful as part of test.check itself? Or as a standalone library? It definitely can be part of malli, but I think people would find it useful in a larger context too.

Anyway, big thanks for resolving this.

frenchy64 commented 1 year ago

@ikitommi agreed this seems more broadly useful. Since generators are records, perhaps a library can be made that even makes hybrid "inspectable" generators---that just assoc extra information to generators that looks like an AST.