stathissideris / spec-provider

Infer Clojure specs from sample data. Inspired by F#'s type providers.
519 stars 22 forks source link

NullPointerException in merge-stats #7

Closed alephyud closed 7 years ago

alephyud commented 7 years ago

Great library!

When I started using it with my data, I ran into the following problem: when I try to run infer-specs on structures like this:

[{:a {:b [1 2]}} {:b [1]}]

I am getting an exception of the following kind:

1. Unhandled java.lang.NullPointerException
   (No message)
                 merge.clj:   23  spec-provider.merge/merge-with-fns/fn
             protocols.clj:   49  clojure.core.protocols/iter-reduce
             protocols.clj:   75  clojure.core.protocols/fn
             protocols.clj:   75  clojure.core.protocols/fn
             protocols.clj:   13  clojure.core.protocols/fn/G
                  core.clj: 6704  clojure.core/reduce
                  core.clj: 6686  clojure.core/reduce
                 merge.clj:   19  spec-provider.merge/merge-with-fns
                 merge.clj:   15  spec-provider.merge/merge-with-fns
                 merge.clj:   69  spec-provider.merge/merge-stats
                 merge.clj:   65  spec-provider.merge/merge-stats
              provider.clj:  145  spec-provider.provider/summarize-stats/fn/fn
              (...)

instead of the expected

(s/def ::b integer?)
(s/def ::a (s/keys :req-un [::b]))
(s/def ::sample (s/keys :req-un [::a ::b]))
stathissideris commented 7 years ago

Thanks for the report, I'll look into it!

stathissideris commented 7 years ago

Fixed in version 0.4.1

alephyud commented 7 years ago

Thank you very much for the quick correction!

However, I am still getting an error on samples like [{:a {:b [{} {}]}} {:b [{}]}] (and there are several more levels of nesting in my actual data).

stathissideris commented 7 years ago

OK, reopening!

stathissideris commented 7 years ago

Please try version 0.4.2, this fixes the latest example that you gave me, but it's a tricky bit of code so I'm leaving this open in case you find more problematic cases. Thanks!

alephyud commented 7 years ago

Thank you very much,

Now it breaks on [{:a {:b [[] []]}} {:b [[]]}] (in the actual data, each :b is an array of heterogeneous arrays, like: [[:fork-variant-simple :default #{:modern} nil {:pronominal-affixes :normal, :scope #{:sc}}] [:fork-variant-simple :default #{:short-3p :rare} nil {:pronominal-affixes :kamatz, :scope #{:P--3mp :P--3fp}}]] - it would be helpful the spec provider could generate at least some broad spec that I could then further refine by hand).

stathissideris commented 7 years ago

OK thanks, I'll look into that one too. I think the merging code really needs some generative testing, unit tests won't cut it for such a complex case! I'll get back to you.

stathissideris commented 7 years ago

It should be better now, version 0.4.3 -- includes some new features as well.

stathissideris commented 7 years ago

Just run it on your actual data from above, it seems to freak out about :scope produces a spec that looks like (s/or). I'll look into that too! Seems like sets are broken, raised as issue #8

stathissideris commented 7 years ago

For best results, run again with 0.4.4. Waiting for your confirmation before closing this again.

alephyud commented 7 years ago

Now with 0.4.4 I get an error on sets, like: [{:a {:b #{"string 1" "string 2"}}} {:b #{"string 3"}}].

Sorry for only reporting errors and not contributing code - I promise I'll look into the library when I get a little bit more spare time...

stathissideris commented 7 years ago

Not at all, you're already helping me a lot by reporting all the problems, thanks for your patience.

Another day another version, please give 0.4.5 a try. Thanks!

alephyud commented 7 years ago

Wow, that works much better!

However, on samples like [{:a #{1}} {:a nil}] it generates (s/def ::a (s/nilable (s/coll-of keyword?) :kind set?)), which is not a valid spec, at least in 1.9.0-alpha16 that I am using - I would expect (s/def ::a (s/nilable (s/coll-of keyword? :kind set?))).

stathissideris commented 7 years ago

Glad that it works. The case that you found is yet another bug :-) I really need to property-test this!

stathissideris commented 7 years ago

OK, please try 0.4.7

alephyud commented 7 years ago

Thank you!

In some cases, the sample does not satisfy the generated specs: First, a key is sometimes marked as required even if does not occur in all entries, for example, the result for [{} {} {} {:a true} {:a true}] is (s/keys :req-un [::a]). Second, the result for (take 1000 (cycle [:a :b :c nil])) is #{nil :c :b :a} - I would use contains? to handle null values correctly.

stathissideris commented 7 years ago

OK, thanks for this, but since these are now very different to the original issue here, I've raised the first as a new issue. As for the second thing you raise, wouldn't (s/nilable #{:a :b :c}) be a more readable spec?

alephyud commented 7 years ago

Agree, (s/nilable #{:a :b :c}) looks better (but still need to handle false values separately).