plumatic / schema

Clojure(Script) library for declarative data description and validation
Other
2.41k stars 256 forks source link

Validation in ClojureScript extremely slow #356

Open ku1ik opened 8 years ago

ku1ik commented 8 years ago

I have defined schema'd record (here https://github.com/asciinema/asciinema-player/blob/19c5c423e3ca78df1ae04dcebbf0d32437e303e1/src/asciinema/player/vt.cljc#L56), used s/defn to few functions and enabled schema validation in test via (use-fixtures :once schema.test/validate-schemas) (here https://github.com/asciinema/asciinema-player/blob/19c5c423e3ca78df1ae04dcebbf0d32437e303e1/test/asciinema/player/vt_test.cljc#L15).

I have noticed that schema validation is very slow when under ClojureScript. Without validation the tests under both Clojure and ClojureScript were finishing in few seconds. After enabling the validation I noticed that it's still couple of seconds for Clojure but 15+ minutes for ClojureScript (phantomjs). Right now I'm enabling test validation for Clojure only because of this, using reader conditional (as seen in the second link above).

It seems it's not test-specific problem, because I can observe the same behavior when using (s/set-fn-validation! true).

I tried to find a specific part of this VT schema that makes it so slow (changing different sub-schemas to s/Any etc) and it seems there's no single key/sub-schema that makes it slow, it just adds up from all sub-schemas.

I'd be happy to give more information, test, dig deeper. Any hints?

w01fe commented 8 years ago

Thanks for the report. Is this possibly the same issue as #352 ? Are you observing failures, or is it slow despite everything succeeding?

I don't know much about ClojureScript profiling, but I'll try to take a look and see what I can find. If you can share a slow js file that I can profile in Chrome, that would be a big help. Thanks!

ku1ik commented 8 years ago

It's always succeeding so not sure if the underlying issue is related to the mentioned one by you. I'll prepare slow .js file for Chrome, hold on.

ku1ik commented 8 years ago

I pushed compiled files with simple html file to this repo: https://github.com/sickill/gh-356 It has been compiled without advanced optimization (so it's easier to see function names in profiler), with (s/set-fn-validation! true).

ku1ik commented 8 years ago

I have added CPU profile dump to the repo, you can load it in Chrome profiler. Lot's of schema.spec.element_transformer and schema.spec.collection.CollectionSpec.schema$spec$core$CoreSpec$checker$arity$2 calls. Stack is very deep there, and it's hard for me to see where it actually ends.

ku1ik commented 8 years ago

If you switch to "Heavy (Bottom Up)" view in the profiler it's easier to spot. (btw, sorry for the noise ;-)

w01fe commented 8 years ago

Thanks, that's very helpful!

Now that I'm looking at the code, it seems like you have a large state object that is being validated many times (included in generative tests)? Out of the box, Schema will validate the entire VT object every time you call the feed-one function, including walking all of the buffers and such (which, given the representation as a sequence of pairs, could be pretty expensive). There is no caching for parts that have already been validated, so the full validation is redone every time the function is called (even if only a small part of the state has changed).

I'm not sure if this is the only thing going on or if something else is unexpectedly slow (especially given the comparative results clj vs cljs), but nothing pops out in the profile to me. Did you notice anything?

If there is a way to improve the performance of cljs checking, we'd definitely like to do that. In the mean-time, as a workaround you could declare feed-one to be ^:never-validate, so that you get schema checking on the other functions but don't revalidate the entire state on every step. You could also create a custom checker that caches results -- something like a map from schema to IdentityHashMap with the set of known-good inputs. If you're interested in going down that road, I would be happy to give some pointers.

On Thu, Apr 28, 2016 at 12:53 AM, Marcin Kulik notifications@github.com wrote:

I pushed compiled files with simple html file to this repo: https://github.com/sickill/gh-356 It has been compiled without advanced optimization (so it's easier to see function names in profiler), with (s/set-fn-validation! true).

— You are receiving this because you commented. Reply to this email directly or view it on GitHub https://github.com/plumatic/schema/issues/356#issuecomment-215170167

ku1ik commented 8 years ago

Yes, it's validated many times, including generative tests. Generative tests are the major cause of 15+ minutes test run because of the high number of times they stress feed-one + schema validation. However when I disabled generative test, left only the regular unit ones, it was also an order of magnitude slower (under CLJS) than without schema validation. So instead of 2 seconds they were running for 20 seconds.

I didn't observe anything I could stick to in the profile (mostly because I don't know schema's internals). Re the size of the object I'm validating, yeah, it's not the smallest, so I suspected the size plays role here, however I didn't expect such a difference (between CLJS without validation and CLJS with validation, also between CLJS and CLJ with validation). Apparently this code runs orders of magnitude faster on CLJ, I just didn't notice it when it was all in a small scale.

Re adding ^:never-validate to feed-one: this is a place where I actually want to have the schema checked in tests, especially with generative tests which put lots of rubbish (random bytes) onto this virtual terminal, and I want to find about any violations of schema resulting from mishandling edge cases.

I appreciate you dug deeper, looked at my code ❤️. I think I'll stay with the reader conditional enabling schema checking in tests only under CLJ, which is fine as I'm running tests under both CLJ and CLJS on CI so potential bugs will be caught anyway.

w01fe commented 8 years ago

Ah, thanks for the context -- that all makes sense, and sounds like a reasonable compromise. I haven't done much with performance in ClojureScript, but I don't know of any specific reason it would be slower -- maybe the JIT just isn't as good at optimizing the work as its set up as does as the JVM (the specific way checkers are created is aimed at high performance after aggressive inlining, FWIW).

So I guess I'll put this on the back burner for now, but I'll definitely keep it in the back of my mind next time I'm digging into things.

ku1ik commented 8 years ago

Ok, thanks! (feel free to close this or keep it open, whichever you prefer).