dry-rb / dry-schema

Coercion and validation for data structures
https://dry-rb.org/gems/dry-schema
MIT License
415 stars 108 forks source link

Make KeyValidator faster by sorting keys and using binary search algorithm (Array#bsearch) - attempt #2 #461

Open radarek opened 1 year ago

radarek commented 1 year ago

This is revised attempt of making KeyValidator class faster when checking for existence of given key. Previous one was broken.

The problem with previous version was that it tried to search for all 3 possibilities (key, key + "[]" and key + ".") at once. Consider this schema:

required(:a).filled(:string)
required(:fooA).filled(:string)
required(:foo).array(:hash) do
  required(:bar).filled(:string)
end

Now we want to validate it with a hash:

{ a: "string", fooA: "string", foo: "string" }

Previous version would start with comparing foo key with fooA key (because it is in the middle of sorted paths ["a", "fooA", "foo[].bar"]). It does not match, so what we should do? Should we go to the left or right? It depends on schema. When foo is a scalar then foo key would be before fooA in the sorted paths, so we should go left. For foo being an array we should go right (foo[] is after fooA).

Simply saying: it wasn't possible to do that in a single pass.

New algorithm searches for foo (exact match), foo. (prefix match) and foo[] (prefix match) separately. Maximum 3 searches, each O(N*logN) complexity which is still O(N*logN) in total. I added tests which does not pass with previous approach and does pass with new one.

Hope this time I didn't miss anything. And sorry for the delay :).