Stranger6667 / jsonschema-rs

JSON Schema validation library
https://docs.rs/jsonschema
MIT License
504 stars 90 forks source link

Use a better structure instead of `Map` in validators #91

Closed Stranger6667 closed 3 years ago

Stranger6667 commented 4 years ago

E.g. in AdditionalPropertiesNotEmptyValidator. Because during the validation we use self.properties.contains_key(*property) which will be O(logN) (because of BTreeMap). We can use HashMap to have O(1).

As perf showed, significant time is spent on it

Stranger6667 commented 4 years ago

The first tests showed a significant improvement in compilation but a slowdown in validation. But with FNV hasher it seems faster. Maybe we can avoid contains call at all, e.g. if we will validate over properties

Stranger6667 commented 4 years ago

Hm, for a small number of properties a simple Vec will be much faster. Actually I think that it will work for most of the cases, I doubt that there are usually many properties

macisamuele commented 4 years ago

@Stranger6667 I would be curious to get to know more how you're actually deriving these conclusions (out of genuine interest). Could you please post here (or in a gist) the raw data such that I can eventually look at them? As you've noticed I'm trying to support as much as I can on this project and being able to run similar checks on my side I can avoid to publish PRs for which I know that there is an hit.

macisamuele commented 4 years ago

Side note: checking the code specifically I see that we do only access to self.properties keys and never the values. I think that an "easy" win (at least from memory prospective) might already be to save only the keys and eventually having having a more optimised set representation for small number of entries, maybe like

enum StringSet {
    None,
    Single(String),
    Small(Vec<String>),  // up to 8 values (assuming that micro benching confirms this threshold)
    Large(HashSet<String>),
}
Stranger6667 commented 4 years ago

Sure!

Some time ago I tried flame graphs to get a breakdown of what is being executed in the binary. My usual workflow:

  1. I create main.rs with something I'd like to test:
use jsonschema::JSONSchema;
use serde_json::json;

fn main() {
    let schema = json!(...);
    let instance = json!(...);
    let compiled = JSONSchema::compile(&schema, None).unwrap();
    for _ in 0..10000000 {
        compiled.is_valid(&instance);
    }
}
  1. Add debug to the release profile in Cargo.toml
[profile.release]
debug = true
  1. Run perf record to record samples (path to binary is relative to the crate's root)
perf record --call-graph dwarf,16384 -F 999 -g  target/release/jsonschema

Then there will be perf.data file in the current folder

  1. Convert it to a different format for better visualization

perf script | stackcollapse-perf.pl | ./rust-unmangle > output

And rust-unmangle is:

#!/usr/bin/sed -rf
# Unmangle Rust symbols
# See https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git/commit/?id=cae15db74999edb96dd9f5bbd4d55849391dd92b
# Example, with [FlameGraph](https://github.com/brendangregg/FlameGraph):
#     perf record -g target/debug/bin
#     perf script | stackcollapse-perf | rust-unmangle | flamegraph > perf.svg

# Remove hash and address offset
s/::h[0-9a-f]{16}//g
s/\+0x[0-9a-f]+//g

# Convert special characters
s/\$C\$/,/g
s/\$SP\$/@/g
s/\$BP\$/*/g
s/\$RF\$/\&/g
s/\$LT\$/</g
s/\$GT\$/>/g
s/\$LP\$/(/g
s/\$RP\$/)/g
s/\$u20\$/ /g
s/\$u27\$/'/g
s/\$u5b\$/[/g
s/\$u5d\$/]/g
s/\$u7b\$/{/g
s/\$u7d\$/}/g
s/\$u7e\$/~/g

# Fix . and _
s/\.\./::/g
s/[^\.]\.[^\.]/./g
s/([;:])_/\1/g
  1. I open this file in https://www.speedscope.app/

And there is a nice representation which is easy to navigate. Basically from that, I found that significant time is spent in contains_key, then it goes to the BTreeMap complexity for this action. Then there are some better options than O(logN) for data access, etc

Could you please post here (or in a gist) the raw data such that I can eventually look at them?

I will do it

As you've noticed I'm trying to support as much as I can on this project and being able to run similar checks on my side I can avoid to publish PRs for which I know that there is an hit.

I want to say that I appreciate your work very much!

a more optimised set representation for small number of entries, maybe like

Yep, it might work! But wouldn't it be more efficient to use a generic here to avoid one indirection level? Not sure how to do it, but if our validator can accept anything that does contains ...

macisamuele commented 4 years ago

@Stranger6667 FYI There is a cargo extension that helps you having a more direct access to the flamegraph. You'll just need to run

cargo flamegraph [--output flamegraph.svg]

I'm considering to create a PR where I add a new directory (perf-helpers) which gives you a simple access to a pre-defined rust-executable which loads at compile time schema and instance from files. Doing so would easily allow to always run the same command letting you obtaining the results that you're looking for

Stranger6667 commented 4 years ago

Thank you for bringing this! Unfortunately, those SVGs were not working properly on my machine (they were not interactive), so I found an alternative. I'll take a deeper look, most probably it is some local misconfiguration / missing package issue

macisamuele commented 4 years ago

Unfortunately, those SVGs were not working properly

That's sad ... I hope that you identify a simple tool ;) The reason for using cargo flamegraph is mostly to have some sort of platform independence (I'm usually working on a MacOS and dtrace is the tool to use there). But anything that might make streamlined processes would be worth to consider (as we might have PR templates that recommend a set of steps before publishing)