Adzz / data_schema

Declarative schemas for data transformations.
Apache License 2.0

Saxy #22

Closed Adzz closed 2 years ago

Adzz commented 2 years ago

Adds an experimental Saxy handler that takes a schema and builds a simple_form DOM containing only the fields that exist in the schema.
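To illustrate the idea, here is a minimal sketch of such a pruning handler. It is not the code in this PR. The nested-map schema shape, the `SlimHandler` module name and its `init/1` helper are assumptions made for the example. The `handle_event/3` clauses follow the shape of the `Saxy.Handler` callback, but the events are fed in by hand so the sketch runs without Saxy installed.

```elixir
defmodule SlimHandler do
  # Hypothetical handler: keeps only elements named in a nested-map schema.
  # With Saxy installed you would add `@behaviour Saxy.Handler` and pass
  # this module plus init(schema) to Saxy.parse_string/3.

  # schemas: stack of sub-schemas, stack: open simple_form elements,
  # skip: depth inside a subtree the schema does not mention.
  def init(schema), do: %{schemas: [schema], stack: [], skip: 0, result: nil}

  # Already inside an unwanted subtree: only track nesting depth.
  def handle_event(:start_element, _data, %{skip: skip} = state) when skip > 0,
    do: {:ok, %{state | skip: skip + 1}}

  def handle_event(:start_element, {name, attrs}, %{schemas: [current | _]} = state) do
    case Map.fetch(current, name) do
      {:ok, child_schema} ->
        {:ok,
         %{state | schemas: [child_schema | state.schemas], stack: [{name, attrs, []} | state.stack]}}

      :error ->
        # Element is not in the schema: skip it and everything below it.
        {:ok, %{state | skip: 1}}
    end
  end

  def handle_event(:end_element, _name, %{skip: skip} = state) when skip > 0,
    do: {:ok, %{state | skip: skip - 1}}

  def handle_event(:end_element, _name, %{stack: [{name, attrs, children} | rest]} = state) do
    element = {name, attrs, Enum.reverse(children)}
    [_closed | schemas] = state.schemas

    case rest do
      [] ->
        {:ok, %{state | stack: [], schemas: schemas, result: element}}

      [{p_name, p_attrs, p_children} | outer] ->
        parent = {p_name, p_attrs, [element | p_children]}
        {:ok, %{state | stack: [parent | outer], schemas: schemas}}
    end
  end

  # Keep text only when it belongs to an element we are building.
  def handle_event(:characters, chars, %{skip: 0, stack: [{name, attrs, children} | rest]} = state),
    do: {:ok, %{state | stack: [{name, attrs, [chars | children]} | rest]}}

  # Everything else (document start/end, skipped text) leaves the state alone.
  def handle_event(_type, _data, state), do: {:ok, state}
end

# Hand-written events standing in for what a SAX parser would emit.
events = [
  {:start_element, {"Order", []}},
  {:start_element, {"Id", []}},
  {:characters, "42"},
  {:end_element, "Id"},
  {:start_element, {"Audit", []}},
  {:start_element, {"Id", []}},
  {:characters, "ignored"},
  {:end_element, "Id"},
  {:end_element, "Audit"},
  {:end_element, "Order"}
]

schema = %{"Order" => %{"Id" => %{}}}

final =
  Enum.reduce(events, SlimHandler.init(schema), fn {type, data}, state ->
    {:ok, next} = SlimHandler.handle_event(type, data, state)
    next
  end)

IO.inspect(final.result)
```

In this sketch the `Audit` subtree never gets allocated as a DOM node, which is one plausible reason a slimmed form would use less memory than building the full simple_form and querying it afterwards.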

The handler expects a schema in a different shape from the usual data_schema schemas, so we still need to add a translation function.

This translation could happen at compile time, at runtime on every call, or once at runtime and then be cached, depending on the performance characteristics you want.
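The two cheaper options might look like the sketch below. Everything in it is hypothetical: the `SchemaCache` and `OrderSchema` modules, the path-list input and the nested-map output are stand-ins for whatever shapes the real translation uses. Only `:persistent_term` (from Erlang/OTP) and module attributes are existing features.

```elixir
defmodule SchemaCache do
  # Hypothetical translation: turns "/"-separated paths into a nested map.
  def translate(paths) do
    Enum.reduce(paths, %{}, fn path, acc ->
      put_nested(acc, String.split(path, "/", trim: true))
    end)
  end

  defp put_nested(map, []), do: map

  defp put_nested(map, [key | rest]) do
    Map.update(map, key, put_nested(%{}, rest), &put_nested(&1, rest))
  end

  # Option: translate once at runtime, then serve the cached result.
  def fetch(key, paths) do
    case :persistent_term.get({__MODULE__, key}, nil) do
      nil ->
        translated = translate(paths)
        :persistent_term.put({__MODULE__, key}, translated)
        translated

      cached ->
        cached
    end
  end
end

defmodule OrderSchema do
  # Option: translate at compile time. The module attribute is evaluated
  # while this module compiles, so compiled/0 returns a constant.
  @paths ["/Order/Id", "/Order/Total"]
  @translated SchemaCache.translate(@paths)

  def paths, do: @paths
  def compiled, do: @translated
end

IO.inspect(OrderSchema.compiled())
IO.inspect(SchemaCache.fetch(:order, OrderSchema.paths()))
```

Compile-time translation costs nothing per call but requires the schema to be known when the module compiles. The `:persistent_term` variant pays the translation cost on first use and suits schemas that are only available at runtime.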

Benchmark

On a large airline XML response:

Operating System: macOS
CPU Information: Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz
Number of Available Cores: 16
Available memory: 32 GB
Elixir 1.13.4
Erlang 24.3.4

Benchmark suite executing with the following configuration:
warmup: 2 s
time: 5 s
memory time: 5 s
reduction time: 5 s
parallel: 5
inputs: none specified
Estimated total run time: 51 s

Benchmarking simple_form with querying ...
Benchmarking slimmed simple form ...
Benchmarking xmerl with querying ...

Name                                ips        average  deviation         median         99th %
slimmed simple form                1.37      731.84 ms    ±12.10%      707.07 ms      984.40 ms
simple_form with querying          1.05      952.91 ms     ±3.30%      951.91 ms     1020.86 ms
xmerl with querying              0.0903    11076.27 ms     ±0.89%    11060.48 ms    11230.07 ms

Comparison:
slimmed simple form                1.37
simple_form with querying          1.05 - 1.30x slower +221.07 ms
xmerl with querying              0.0903 - 15.13x slower +10344.43 ms

Memory usage statistics:

Name                         Memory usage
slimmed simple form             193.24 MB
simple_form with querying       228.90 MB - 1.18x memory usage +35.65 MB
xmerl with querying            2340.31 MB - 12.11x memory usage +2147.07 MB

**All measurements for memory usage were the same**

Reduction count statistics:

Name                              average  deviation         median         99th %
slimmed simple form               20.84 M     ±0.45%        20.86 M        20.99 M
simple_form with querying         21.61 M     ±0.77%        21.66 M        21.86 M
xmerl with querying              606.21 M     ±0.07%       605.96 M       606.82 M

Comparison:
slimmed simple form               20.86 M
simple_form with querying         21.61 M - 1.04x reduction count +0.76 M
xmerl with querying              606.21 M - 29.08x reduction count +585.37 M