Closed: MartinPotier closed this 3 years ago
@MartinPotier isn't #19 exactly about that?
@MartinPotier @qrilka could you try out this branch, where I've merged @unhammer's patch, and report before/after figures if you can? https://github.com/ocramz/xeno/tree/whitespace-around-equals-%2319
> @MartinPotier isn't #19 exactly about that?
It is! Sorry I didn't catch this
> @MartinPotier @qrilka could you try out this branch where I've merged @unhammer's patch, and report before/after figures if you can?
I'll try to find a way to use it. I'm not too familiar with the procedure (and still quite new to Haskell tooling).
@MartinPotier you can just run `stack bench` while on master and then again after switching to the feature branch :)
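For anyone else new to the tooling, the before/after run could be scripted roughly as follows. This is only a sketch: the `bench_branch` helper and the `bench-*.txt` file names are made up for illustration, and the feature branch name is assumed from the URL above (with `%23` decoded to `#`). It also assumes the branch has already been fetched into the local clone.

```sh
#!/bin/sh
# Hypothetical helper (not part of xeno or stack): check out a branch,
# run the benchmark suites, and keep a copy of the output for comparison.
bench_branch() {
  branch=$1
  # '/' and '#' are awkward in file names, so replace them with '_'.
  out="bench-$(printf '%s' "$branch" | tr '/#' '__').txt"
  git checkout "$branch" && stack bench 2>&1 | tee "$out"
}

# Usage, from a clone of the repository:
#   bench_branch master
#   bench_branch 'whitespace-around-equals-#19'   # branch name assumed from the URL
#   diff bench-master.txt bench-whitespace-around-equals-_19.txt
```

Saving both outputs to files makes it easier to paste the two runs side by side afterwards.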
Master:
Running 2 benchmarks...
Benchmark xeno-memory-bench: RUNNING...
Case Allocated GCs
4kb/hexml/dom 3,808 0
4kb/xeno/sax -,496 0
4kb/xeno/dom 7,968 0
31kb/hexml/dom 30,608 0
31kb/xeno/sax -,928 0
31kb/xeno/dom 7,536 0
211kb/hexml/dom 211,496 0
211kb/xeno/sax 26,576 0
211kb/xeno/dom 1,043,328 0
Benchmark xeno-memory-bench: FINISH
Benchmark xeno-speed-bench: RUNNING...
benchmarking 4KB/hexml-dom
time 12.68 μs (11.95 μs .. 13.86 μs)
0.926 R² (0.883 R² .. 0.965 R²)
mean 14.69 μs (13.34 μs .. 17.05 μs)
std dev 6.183 μs (4.201 μs .. 9.620 μs)
variance introduced by outliers: 99% (severely inflated)
benchmarking 4KB/xeno-sax
time 4.830 μs (4.649 μs .. 5.099 μs)
0.969 R² (0.950 R² .. 0.987 R²)
mean 5.120 μs (4.869 μs .. 5.420 μs)
std dev 939.5 ns (732.0 ns .. 1.170 μs)
variance introduced by outliers: 96% (severely inflated)
benchmarking 4KB/xeno-dom
time 12.92 μs (12.06 μs .. 14.36 μs)
0.860 R² (0.720 R² .. 0.969 R²)
mean 16.37 μs (13.88 μs .. 23.57 μs)
std dev 13.17 μs (5.986 μs .. 23.14 μs)
variance introduced by outliers: 99% (severely inflated)
benchmarking 4KB/hexpat-sax
time 110.8 μs (98.79 μs .. 128.9 μs)
0.880 R² (0.838 R² .. 0.927 R²)
mean 146.0 μs (109.8 μs .. 228.5 μs)
std dev 159.1 μs (31.92 μs .. 272.5 μs)
variance introduced by outliers: 99% (severely inflated)
benchmarking 4KB/hexpat-dom
time 382.4 μs (298.4 μs .. 477.4 μs)
0.763 R² (0.724 R² .. 0.932 R²)
mean 353.0 μs (318.9 μs .. 408.0 μs)
std dev 130.9 μs (95.56 μs .. 183.2 μs)
variance introduced by outliers: 99% (severely inflated)
benchmarking 4KB/xml-dom
time 2.559 ms (1.783 ms .. 3.597 ms)
0.544 R² (0.364 R² .. 0.728 R²)
mean 3.529 ms (3.060 ms .. 4.191 ms)
std dev 1.702 ms (1.365 ms .. 2.075 ms)
variance introduced by outliers: 98% (severely inflated)
benchmarking 31KB/hexml-dom
time 14.24 μs (11.43 μs .. 17.59 μs)
0.668 R² (0.527 R² .. 0.822 R²)
mean 16.88 μs (13.83 μs .. 22.88 μs)
std dev 12.56 μs (7.129 μs .. 21.33 μs)
variance introduced by outliers: 99% (severely inflated)
benchmarking 31KB/xeno-sax
time 2.018 μs (1.908 μs .. 2.269 μs)
0.960 R² (0.913 R² .. 0.999 R²)
mean 1.940 μs (1.895 μs .. 2.095 μs)
std dev 255.8 ns (55.95 ns .. 531.4 ns)
variance introduced by outliers: 93% (severely inflated)
benchmarking 31KB/xeno-dom
time 5.479 μs (5.120 μs .. 5.902 μs)
0.971 R² (0.956 R² .. 0.992 R²)
mean 5.160 μs (4.993 μs .. 5.429 μs)
std dev 661.4 ns (399.2 ns .. 985.0 ns)
variance introduced by outliers: 92% (severely inflated)
benchmarking 31KB/hexpat-sax
time 248.8 μs (243.4 μs .. 254.9 μs)
0.994 R² (0.989 R² .. 0.997 R²)
mean 252.6 μs (243.2 μs .. 279.8 μs)
std dev 56.43 μs (12.19 μs .. 107.6 μs)
variance introduced by outliers: 96% (severely inflated)
benchmarking 31KB/hexpat-dom
time 293.7 μs (268.3 μs .. 336.0 μs)
0.912 R² (0.840 R² .. 0.997 R²)
mean 273.4 μs (264.2 μs .. 295.9 μs)
std dev 48.93 μs (13.13 μs .. 92.63 μs)
variance introduced by outliers: 92% (severely inflated)
benchmarking 31KB/xml-dom
time 12.43 ms (11.61 ms .. 13.55 ms)
0.969 R² (0.947 R² .. 0.992 R²)
mean 11.99 ms (11.68 ms .. 12.52 ms)
std dev 1.050 ms (796.8 μs .. 1.478 ms)
variance introduced by outliers: 46% (moderately inflated)
benchmarking 211KB/hexml-dom
time 255.4 μs (240.2 μs .. 280.3 μs)
0.980 R² (0.964 R² .. 0.998 R²)
mean 248.1 μs (242.9 μs .. 259.4 μs)
std dev 23.53 μs (12.04 μs .. 38.22 μs)
variance introduced by outliers: 77% (severely inflated)
benchmarking 211KB/xeno-sax
time 255.1 μs (233.4 μs .. 276.1 μs)
0.971 R² (0.959 R² .. 0.989 R²)
mean 230.3 μs (222.5 μs .. 240.0 μs)
std dev 29.79 μs (22.70 μs .. 40.46 μs)
variance introduced by outliers: 86% (severely inflated)
benchmarking 211KB/xeno-dom
time 783.7 μs (656.5 μs .. 949.1 μs)
0.787 R² (0.728 R² .. 0.895 R²)
mean 888.4 μs (809.2 μs .. 1.019 ms)
std dev 315.6 μs (249.4 μs .. 397.7 μs)
variance introduced by outliers: 98% (severely inflated)
benchmarking 211KB/hexpat-sax
time 18.44 ms (14.61 ms .. 22.18 ms)
0.875 R² (0.820 R² .. 0.969 R²)
mean 23.69 ms (21.02 ms .. 27.38 ms)
std dev 7.008 ms (4.904 ms .. 9.546 ms)
variance introduced by outliers: 89% (severely inflated)
benchmarking 211KB/hexpat-dom
time 20.10 ms (17.66 ms .. 22.25 ms)
0.923 R² (0.789 R² .. 0.983 R²)
mean 24.17 ms (22.73 ms .. 26.57 ms)
std dev 3.973 ms (2.581 ms .. 5.958 ms)
variance introduced by outliers: 69% (severely inflated)
benchmarking 211KB/xml-dom
time 84.14 ms (53.48 ms .. 95.22 ms)
0.903 R² (0.682 R² .. 0.999 R²)
mean 105.2 ms (94.69 ms .. 136.5 ms)
std dev 27.07 ms (1.725 ms .. 40.24 ms)
variance introduced by outliers: 76% (severely inflated)
Benchmark xeno-speed-bench: FINISH
Completed 73 action(s).
Whitespace branch:
Running 2 benchmarks...
Benchmark xeno-memory-bench: RUNNING...
Case Allocated GCs
4kb/hexml/dom 3,808 0
4kb/xeno/sax -,496 0
4kb/xeno/dom 7,968 0
31kb/hexml/dom 30,608 0
31kb/xeno/sax -,928 0
31kb/xeno/dom 6,056 0
211kb/hexml/dom 211,496 0
211kb/xeno/sax 26,576 0
211kb/xeno/dom 1,043,328 0
Benchmark xeno-memory-bench: FINISH
Benchmark xeno-speed-bench: RUNNING...
benchmarking 4KB/hexml-dom
time 9.143 μs (8.837 μs .. 9.459 μs)
0.992 R² (0.986 R² .. 0.997 R²)
mean 8.893 μs (8.705 μs .. 9.255 μs)
std dev 884.4 ns (588.1 ns .. 1.378 μs)
variance introduced by outliers: 86% (severely inflated)
benchmarking 4KB/xeno-sax
time 4.120 μs (4.075 μs .. 4.159 μs)
0.998 R² (0.996 R² .. 0.999 R²)
mean 4.058 μs (4.006 μs .. 4.161 μs)
std dev 262.7 ns (137.0 ns .. 486.1 ns)
variance introduced by outliers: 74% (severely inflated)
benchmarking 4KB/xeno-dom
time 9.651 μs (9.376 μs .. 9.983 μs)
0.992 R² (0.986 R² .. 0.998 R²)
mean 9.909 μs (9.686 μs .. 10.19 μs)
std dev 872.1 ns (708.7 ns .. 1.251 μs)
variance introduced by outliers: 83% (severely inflated)
benchmarking 4KB/hexpat-sax
time 71.90 μs (68.60 μs .. 76.49 μs)
0.975 R² (0.954 R² .. 0.991 R²)
mean 73.57 μs (70.20 μs .. 80.64 μs)
std dev 14.43 μs (6.847 μs .. 25.25 μs)
variance introduced by outliers: 95% (severely inflated)
benchmarking 4KB/hexpat-dom
time 207.4 μs (201.9 μs .. 213.8 μs)
0.994 R² (0.991 R² .. 0.997 R²)
mean 205.7 μs (202.7 μs .. 209.4 μs)
std dev 11.86 μs (9.518 μs .. 15.88 μs)
variance introduced by outliers: 56% (severely inflated)
benchmarking 4KB/xml-dom
time 2.425 ms (2.360 ms .. 2.526 ms)
0.970 R² (0.948 R² .. 0.986 R²)
mean 2.236 ms (2.116 ms .. 2.379 ms)
std dev 439.1 μs (343.7 μs .. 560.7 μs)
variance introduced by outliers: 90% (severely inflated)
benchmarking 31KB/hexml-dom
time 13.20 μs (12.65 μs .. 13.89 μs)
0.974 R² (0.960 R² .. 0.988 R²)
mean 14.45 μs (13.48 μs .. 16.64 μs)
std dev 4.307 μs (2.414 μs .. 8.058 μs)
variance introduced by outliers: 99% (severely inflated)
benchmarking 31KB/xeno-sax
time 2.178 μs (2.154 μs .. 2.209 μs)
0.998 R² (0.997 R² .. 0.999 R²)
mean 2.183 μs (2.156 μs .. 2.245 μs)
std dev 128.6 ns (66.64 ns .. 236.9 ns)
variance introduced by outliers: 71% (severely inflated)
benchmarking 31KB/xeno-dom
time 6.806 μs (6.556 μs .. 7.070 μs)
0.987 R² (0.976 R² .. 0.995 R²)
mean 6.711 μs (6.502 μs .. 7.050 μs)
std dev 811.9 ns (542.0 ns .. 1.216 μs)
variance introduced by outliers: 91% (severely inflated)
benchmarking 31KB/hexpat-sax
time 322.0 μs (296.6 μs .. 356.0 μs)
0.942 R² (0.907 R² .. 0.974 R²)
mean 333.2 μs (317.0 μs .. 355.2 μs)
std dev 58.86 μs (43.57 μs .. 86.36 μs)
variance introduced by outliers: 92% (severely inflated)
benchmarking 31KB/hexpat-dom
time 374.8 μs (355.3 μs .. 395.7 μs)
0.979 R² (0.968 R² .. 0.991 R²)
mean 394.8 μs (375.7 μs .. 436.9 μs)
std dev 96.14 μs (45.08 μs .. 177.7 μs)
variance introduced by outliers: 95% (severely inflated)
benchmarking 31KB/xml-dom
time 15.52 ms (14.78 ms .. 16.26 ms)
0.987 R² (0.975 R² .. 0.994 R²)
mean 16.48 ms (15.92 ms .. 17.36 ms)
std dev 1.690 ms (1.128 ms .. 2.594 ms)
variance introduced by outliers: 48% (moderately inflated)
benchmarking 211KB/hexml-dom
time 379.2 μs (359.4 μs .. 401.2 μs)
0.974 R² (0.960 R² .. 0.986 R²)
mean 378.3 μs (358.4 μs .. 417.5 μs)
std dev 90.25 μs (57.77 μs .. 180.8 μs)
variance introduced by outliers: 95% (severely inflated)
benchmarking 211KB/xeno-sax
time 226.8 μs (222.5 μs .. 232.5 μs)
0.995 R² (0.993 R² .. 0.998 R²)
mean 234.6 μs (230.7 μs .. 240.0 μs)
std dev 16.45 μs (13.84 μs .. 19.01 μs)
variance introduced by outliers: 64% (severely inflated)
benchmarking 211KB/xeno-dom
time 703.2 μs (679.9 μs .. 735.4 μs)
0.958 R² (0.929 R² .. 0.980 R²)
mean 798.2 μs (748.4 μs .. 897.1 μs)
std dev 227.5 μs (144.9 μs .. 436.6 μs)
variance introduced by outliers: 97% (severely inflated)
benchmarking 211KB/hexpat-sax
time 28.17 ms (25.24 ms .. 32.59 ms)
0.937 R² (0.873 R² .. 0.989 R²)
mean 27.98 ms (26.05 ms .. 30.60 ms)
std dev 4.626 ms (3.002 ms .. 6.750 ms)
variance introduced by outliers: 65% (severely inflated)
benchmarking 211KB/hexpat-dom
time 41.27 ms (31.05 ms .. 51.55 ms)
0.873 R² (0.816 R² .. 0.980 R²)
mean 32.56 ms (30.40 ms .. 36.64 ms)
std dev 6.351 ms (3.747 ms .. 9.864 ms)
variance introduced by outliers: 74% (severely inflated)
benchmarking 211KB/xml-dom
time 91.71 ms (4.966 ms .. 161.3 ms)
0.749 R² (0.171 R² .. 0.986 R²)
mean 134.8 ms (113.6 ms .. 158.8 ms)
std dev 33.58 ms (23.60 ms .. 46.92 ms)
variance introduced by outliers: 73% (severely inflated)
Benchmark xeno-speed-bench: FINISH
Completed 2 action(s).
Please note that I'm running this on a laptop with an Intel(R) Core(TM) i7-6600U CPU @ 2.60GHz. Not really fast, and it's almost always doing other things on the side.
@MartinPotier yes, there are lots of outlying measurements, it would be good to run these without many "noisy neighbours" (i.e. not many other processes competing for CPU)
This is my work laptop and it's difficult to make it quiet :smile_cat: I'll run this at home on a beefier machine, maybe that'll be better.
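To make the before/after comparison less of an eyeball exercise, the relative change of each mean can be computed from the figures posted above. A rough sketch follows: `pct_change` is a hypothetical helper, and the numbers are the xeno mean times (in μs) copied from the two runs. The sign of the change flips from case to case, which, together with the outlier warnings, suggests the differences may be mostly noise.

```sh
#!/bin/sh
# Hypothetical helper: relative change from OLD to NEW, in percent.
pct_change() {
  awk -v old="$1" -v new="$2" 'BEGIN { printf "%+.1f\n", (new - old) / old * 100 }'
}

# Mean times in microseconds, copied from the two runs above.
# Columns: case, master, whitespace branch.
while read -r name master branch; do
  printf '%-16s %s%%\n' "$name" "$(pct_change "$master" "$branch")"
done <<'EOF'
4KB/xeno-sax    5.120  4.058
4KB/xeno-dom    16.37  9.909
31KB/xeno-sax   1.940  2.183
31KB/xeno-dom   5.160  6.711
211KB/xeno-sax  230.3  234.6
211KB/xeno-dom  888.4  798.2
EOF
```

A negative value means the branch was faster than master for that case, a positive one that it was slower.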
Hmmm, unfortunately, at home I can't run the bench:
Running 2 benchmarks...
Benchmark xeno-memory-bench: RUNNING...
Case Allocated GCs
4kb/hexml/dom 4,120 0
4kb/xeno/sax -,496 0
4kb/xeno/dom 7,968 0
31kb/hexml/dom 26,272 0
31kb/xeno/sax -1,240 0
31kb/xeno/dom 7,464 0
211kb/hexml/dom 211,496 0
211kb/xeno/sax 26,504 0
211kb/xeno/dom 1,043,016 0
Benchmark xeno-memory-bench: FINISH
Benchmark xeno-speed-bench: RUNNING...
benchmarking 4KB/hexml-dom
xeno-speed-bench: <stdout>: commitBuffer: invalid argument (invalid character)
time 7.325 Benchmark xeno-speed-bench: ERROR
Looks like a locale problem, but my locale looks fine:
$ locale
LANG=en_US.UTF-8
LC_CTYPE=en_US.UTF-8
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=
Pretty much the same system as at work, except that stack is more recent here:
Version 1.7.1 x86_64
Compiled with:
- Cabal-2.2.0.1
- Glob-0.9.2
- HUnit-1.6.0.0
- QuickCheck-2.10.1
- StateVar-1.1.1.0
- aeson-1.2.4.0
- aeson-compat-0.3.8
- annotated-wl-pprint-0.7.0
- ansi-terminal-0.8.0.4
- ansi-wl-pprint-0.6.8.2
- array-0.5.2.0
- asn1-encoding-0.9.5
- asn1-parse-0.9.4
- asn1-types-0.3.2
- async-2.1.1.1
- attoparsec-0.13.2.2
- attoparsec-iso8601-1.0.0.0
- auto-update-0.1.4
- base-4.10.1.0
- base-compat-0.9.3
- base-orphans-0.7
- base-prelude-1.2.1
- base16-bytestring-0.1.1.6
- base64-bytestring-1.0.0.1
- basement-0.0.7
- bifunctors-5.5.2
- binary-0.8.5.1
- bindings-uname-0.1
- bitarray-0.0.1.1
- blaze-builder-0.4.1.0
- blaze-html-0.9.1.1
- blaze-markup-0.8.2.1
- byteable-0.1.1
- bytestring-0.10.8.2
- call-stack-0.1.0
- case-insensitive-1.2.0.11
- cereal-0.5.5.0
- clock-0.7.2
- colour-2.3.4
- comonad-5.0.3
- conduit-1.3.0.3
- conduit-extra-1.3.0
- connection-0.2.8
- containers-0.5.10.2
- contravariant-1.4.1
- cookie-0.4.4
- cpphs-1.20.8
- cryptohash-0.11.9
- cryptohash-sha256-0.11.101.0
- cryptonite-0.25
- cryptonite-conduit-0.2.2
- data-default-class-0.1.2.0
- deepseq-1.4.3.0
- digest-0.0.1.2
- directory-1.3.0.2
- distributive-0.5.3
- dlist-0.8.0.4
- easy-file-0.2.2
- echo-0.1.3
- ed25519-0.0.5.0
- either-5
- exceptions-0.8.3
- extra-1.6.8
- fast-logger-2.4.11
- file-embed-0.0.10.1
- filelock-0.1.1.2
- filepath-1.4.1.2
- foundation-0.0.20
- free-5.0.2
- fsnotify-0.2.1.1
- generic-deriving-1.12.1
- ghc-boot-th-8.2.2
- ghc-prim-0.5.1.1
- gitrev-1.3.1
- hackage-security-0.5.3.0
- hashable-1.2.7.0
- haskell-src-exts-1.20.2
- haskell-src-meta-0.8.0.3
- hinotify-0.3.9
- hourglass-0.2.11
- hpack-0.28.2
- hpc-0.6.0.3
- hspec-2.4.8
- hspec-core-2.4.8
- hspec-discover-2.4.8
- hspec-expectations-0.8.2
- hspec-smallcheck-0.5.0
- http-api-data-0.3.7.2
- http-client-0.5.13
- http-client-tls-0.3.5.3
- http-conduit-2.3.1
- http-types-0.12.1
- integer-gmp-1.0.1.0
- integer-logarithms-1.0.2.1
- lifted-base-0.2.3.12
- logict-0.6.0.2
- memory-0.14.16
- microlens-0.4.8.3
- microlens-th-0.4.1.3
- mime-types-0.1.0.7
- mintty-0.1.2
- monad-control-1.0.2.3
- monad-logger-0.3.28.5
- monad-loops-0.4.3
- mono-traversable-1.0.8.1
- mtl-2.2.2
- mustache-2.3.0
- neat-interpolation-0.3.2.1
- network-2.6.3.5
- network-uri-2.6.1.0
- old-locale-1.0.0.7
- old-time-1.1.0.3
- open-browser-0.2.1.0
- optparse-applicative-0.14.2.0
- optparse-simple-0.1.0
- parsec-3.1.13.0
- path-0.6.1
- path-io-1.3.3
- path-pieces-0.2.1
- pem-0.2.4
- persistent-2.8.2
- persistent-sqlite-2.8.1.2
- persistent-template-2.5.4
- polyparse-1.12
- pretty-1.1.3.3
- primitive-0.6.4.0
- process-1.6.1.0
- profunctors-5.2.2
- project-template-0.2.0.1
- quickcheck-io-0.2.0
- random-1.1
- regex-applicative-0.3.3
- regex-applicative-text-0.1.0.1
- resource-pool-0.2.3.2
- resourcet-1.2.1
- retry-0.7.6.2
- rio-0.1.3.0
- rts-1.0
- safe-0.3.17
- scientific-0.3.6.2
- semigroupoids-5.2.2
- semigroups-0.18.4
- setenv-0.1.1.3
- silently-1.2.5
- smallcheck-1.1.4
- socks-0.5.6
- split-0.2.3.3
- stm-2.4.5.0
- stm-chans-3.0.0.4
- store-0.4.3.2
- store-core-0.4.4
- streaming-commons-0.1.19
- syb-0.7
- tagged-0.8.5
- tar-0.5.1.0
- template-haskell-2.12.0.0
- temporary-1.2.1.1
- text-1.2.3.0
- text-metrics-0.3.0
- tf-random-0.5
- th-abstraction-0.2.7.0
- th-expand-syns-0.4.4.0
- th-lift-0.7.10
- th-lift-instances-0.1.11
- th-orphans-0.13.5
- th-reify-many-0.1.8
- th-utilities-0.2.0.1
- time-1.8.0.2
- time-locale-compat-0.1.1.4
- tls-1.4.1
- transformers-0.5.2.0
- transformers-base-0.4.4
- transformers-compat-0.5.1.4
- typed-process-0.2.2.0
- unicode-transforms-0.3.4
- unix-2.7.2.2
- unix-compat-0.5.0.1
- unix-time-0.3.8
- unliftio-0.2.7.0
- unliftio-core-0.1.1.0
- unordered-containers-0.2.9.0
- uri-bytestring-0.3.2.0
- uuid-types-1.0.3
- vector-0.12.0.1
- vector-algorithms-0.7.0.1
- void-0.7.2
- x509-1.7.3
- x509-store-1.6.6
- x509-system-1.6.6
- x509-validation-1.6.10
- yaml-0.8.30
- zip-archive-0.3.2.5
- zlib-0.6.2
Warning: this is an unsupported build that may use different versions of
dependencies and GHC than the officially released binaries, and therefore may
not behave identically. If you encounter problems, please try the latest
official build by running 'stack upgrade --force-download'.
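The `commitBuffer: invalid argument (invalid character)` error is what GHC-compiled programs typically report when the output handle's encoding cannot represent a character, here presumably the `μ` in the timings. Since the login shell already reports UTF-8, one thing worth checking is whether the benchmark process sees the same locale. A sketch under that assumption: `is_utf8_locale` is a hypothetical helper, and `C.UTF-8` may not exist on every system.

```sh
#!/bin/sh
# Hypothetical helper: succeeds only if the current locale's character
# encoding is UTF-8 (relies on the POSIX `locale charmap` command).
is_utf8_locale() {
  [ "$(locale charmap 2>/dev/null)" = "UTF-8" ]
}

is_utf8_locale || echo "current shell is not using a UTF-8 locale" >&2

# Things to try by hand:
#   stack exec -- locale charmap    # encoding seen by processes that stack launches
#   LC_ALL=C.UTF-8 stack bench      # force a UTF-8 locale for this one run
```

If the encoding reported under `stack exec` differs from the one in the login shell, the environment stack builds for its child processes is the likely place to look.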
@ocramz on my machine I see an increase of almost 5% in the DOM parsing figures: https://gist.github.com/qrilka/d36464cb52499bf1041b1bd7c0dd341d/revisions (the new results are the ones from the branch)
This should be solved by #49 and #51, please reopen if that's not the case.
Seems like validation fails if I have something of the sort:
AFAIK, they are allowed by the specification: https://www.w3.org/TR/2006/REC-xml11-20060816/#sec-white-space
Can I do anything to make these accepted?
This is with xeno-0.3.2, the version included in LTS-10.4.
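For reference, the XML grammar defines `Eq ::= S? '=' S?`, so whitespace on either side of the `=` in an attribute is well-formed. To see which attributes in a document are affected, something like the following may help. It is a rough sketch: `find_spaced_equals` is a hypothetical helper, not part of xeno, and it is a line-based heuristic rather than a real XML parser.

```sh
#!/bin/sh
# Hypothetical helper: list lines that have whitespace on either side of an
# attribute's '='. Being line-based, it can also match text content or
# comments that merely look like attributes, and it misses attributes that
# are split across lines.
find_spaced_equals() {
  grep -nE "[[:alnum:]_:.-][[:space:]]+=[[:space:]]*[\"']|[[:alnum:]_:.-]=[[:space:]]+[\"']" "$@"
}

# Usage:
#   find_spaced_equals document.xml
```

Each reported line is prefixed with its line number, which makes it straightforward to check the hits by hand.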